dariober / SICERpy

Python wrapper around the popular ChIP-Seq peak caller SICER

Couldn't add hg38 genome #4

Closed abalter closed 7 years ago

abalter commented 7 years ago

I followed the instructions from the SICER Readme section 4.4 to add a genome, but still got an error about not finding the genome.

Attempt 1

$ python $chipseq/SICER.py -c /<path>bam/mock_input_A.bam -t /<path>/bam/mock_BRD4_A.bam --species hg38 -rt 0 > mock_a_peaks.bed 2> sicermocka.log
balter@exalab3:/home/exacloud/lustre1/CompBio/projs/alumkal_160630$ cat sicermocka.log
usage: SICER.py [-h] [--treatment TREATMENT] --control CONTROL
                [--effGenomeSize EFFGENOMESIZE] [--requiredFlag REQUIREDFLAG]
                [--filterFlag FILTERFLAG] [--mapq MAPQ]
                [--redThresh REDTHRESH] [--windowSize WINDOWSIZE]
                [--gapSize GAPSIZE] [--fragSize FRAGSIZE] [--keeptmp]
                [--version]
SICER.py: error: unrecognized arguments: --species hg38

Attempt 2

$ python $chipseq/SICER.py -c /<path>/bam/mock_input_A.bam -t /<path>/bam/mock_BRD4_A.bam -s hg38 -rt 0 > mock_a_peaks.bed 2> sicermocka.log
$ cat sicermocka.log
usage: SICER.py [-h] [--treatment TREATMENT] --control CONTROL
                [--effGenomeSize EFFGENOMESIZE] [--requiredFlag REQUIREDFLAG]
                [--filterFlag FILTERFLAG] [--mapq MAPQ]
                [--redThresh REDTHRESH] [--windowSize WINDOWSIZE]
                [--gapSize GAPSIZE] [--fragSize FRAGSIZE] [--keeptmp]
                [--version]
SICER.py: error: unrecognized arguments: -s hg38

In GenomeData.py:

hg38_chroms = ['chr1','chr2','chr3','chr4','chr5','chr6','chr7','chr8','chr9',
         'chr10','chr11','chr12','chr13','chr14','chr15','chr16','chr17',
         'chr18','chr19','chr20','chr21','chr22','chrX','chrY','chrM']

hg38_chrom_lengths = {'chr1':248956422, 'chr2':242193529, 'chr3':198295559,
            'chr4':190214555, 'chr5':181538259, 'chr6':170805979,
            'chr7':159345973, 'chr8':145138636, 'chr9':138394717,
            'chr10':133797422, 'chr11':135086622, 'chr12':133275309,
            'chr13':114364328, 'chr14':107043718, 'chr15':101991189,
            'chr16':90338345, 'chr17':83257441, 'chr18':80373285,
            'chr19':58617616, 'chr20':64444167, 'chr21':46709983,
            'chr22':50818468, 'chrX':156040895, 'chrY':57227415,
            'chrM':16569}

species_chroms = {'mm8':mm8_chroms, 
            'mm9':mm9_chroms, 
            'hg18':hg18_chroms,
            'hg19':hg19_chroms,
            'hg38':hg38_chroms,
...
...
...

species_chrom_lengths={'mm8':mm8_chrom_lengths,
               'mm9':mm9_chrom_lengths,
               'hg18':hg18_chrom_lengths,
               'hg19':hg19_chrom_lengths,
               'hg38':hg38_chrom_lengths,
...
...
...

dariober commented 7 years ago

Hi, thanks for exploring SICERpy. The most important point regarding your issues is that SICERpy is a bit of a stub. I had started working on it when I discovered a nice project, epic, which has the same purpose (making SICER more user-friendly) but is much more developed. As noted at the top of the README file, I would suggest switching to epic. See also this post on Biostars.

To address this issue specifically: I removed the -s/--species option because it is redundant. The genome file is inferred from the header of the input BAM files. I noticed that the README file still has an example using the option, and I'm going to correct it. Apologies.
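
For illustration, the idea of inferring the genome from a BAM header can be sketched roughly as follows. This is not SICERpy's actual code: the function name `genome_from_sam_header` and the example header are hypothetical, and the header text is assumed to be what `samtools view -H` prints for a BAM file. Chromosome names and lengths are read from the `@SQ` lines, which is why a separate species option is not needed.

```python
def genome_from_sam_header(header_text):
    """Return {chromosome: length} parsed from the @SQ lines of a SAM/BAM header.

    Hypothetical helper for illustration only; not part of SICERpy.
    """
    chrom_lengths = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue  # skip @HD, @PG, @RG and other header records
        # After the record type, fields are tab-separated TAG:VALUE pairs.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
        chrom_lengths[fields["SN"]] = int(fields["LN"])
    return chrom_lengths


# Made-up header text in the shape produced by `samtools view -H`.
example_header = "\n".join([
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:248956422",
    "@SQ\tSN:chrM\tLN:16569",
    "@PG\tID:bwa\tPN:bwa",
])

genome = genome_from_sam_header(example_header)
print(genome)
```

With a header like this, the chromosome sizes come straight from the alignment file, so a BAM aligned to hg38 carries its own genome definition.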

abalter commented 7 years ago

Thanks! That is where I got the idea to use that switch.