biocore-ntnu / epic

(DEPRECATED) epic: diffuse domain ChIP-Seq caller based on SICER
http://bioepic.readthedocs.io
MIT License

epic 0.2.9 compute_score_threshold.py:24 RuntimeWarning: divide by zero encountered in log #74

Open grendon opened 6 years ago

grendon commented 6 years ago

I converted my BAM files to BED format with bamtobed as suggested in the tutorial.

Then I ran the epic command like this:

epic --treatment Amy_C2_763_C1_IP.bed \
  --control Amy_C2_763_C1_input.bed \
  --genome rn6 \
  --window-size 200 \
  --gaps-allowed 3 \
  --number-cores 4 \
  --bed Amy_C2_763_C1.W200.G3.epicCalls.bed \
  --bigwig Amy_C2_763_C1.W200.G3 \
  --outfile Amy_C2_763_C1.W200.G3.epicCalls.csv

The command aborted. These are the last few lines of output, including the error message:

0 total chip count (File: compute_background_probabilites, Log level: DEBUG, Time: Tue, 20 Mar 2018 14:06:56 )
0.0 average_window_readcount (File: compute_background_probabilites, Log level:DEBUG, Time: Tue, 20 Mar 2018 14:06:56 )
1 island_enriched_threshold (File: compute_background_probabilites, Log level: DEBUG, Time: Tue, 20 Mar 2018 14:06:56 )
4.0 gap_contribution (File: compute_background_probabilites, Log level: DEBUG, Time: Tue, 20 Mar 2018 14:06:56 )
1.0 boundary_contribution (File: compute_background_probabilites, Log level: DEBUG, Time: Tue, 20 Mar 2018 14:06:56 )
/home/apps/software/epic/0.2.9-IGB-gcc-4.9.4-Python-2.7.13/lib/python2.7/site-packages/bioepic-0.2.9-py2.7.egg/epic/statistics/compute_score_threshold.py:24: RuntimeWarning: divide by zero encountered in log
Traceback (most recent call last):
  File "/home/apps/software/epic/0.2.9-IGB-gcc-4.9.4-Python-2.7.13/bin/epic", line 4, in <module>
    __import__('pkg_resources').run_script('bioepic==0.2.9', 'epic')
  File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 738, in run_script
  File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 1506, in run_script
  File "/home/apps/software/epic/0.2.9-IGB-gcc-4.9.4-Python-2.7.13/lib/python2.7/site-packages/bioepic-0.2.9-py2.7.egg/EGG-INFO/scripts/epic", line 285, in <module>
  File "build/bdist.linux-x86_64/egg/epic/run/run_epic.py", line 54, in run_epic
  File "build/bdist.linux-x86_64/egg/epic/statistics/compute_background_probabilites.py", line 51, in compute_background_probabilities
  File "build/bdist.linux-x86_64/egg/epic/statistics/compute_score_threshold.py", line 26, in compute_score_threshold
OverflowError: cannot convert float infinity to integer

A snippet of one of the BED files looks like this:

NC_001665.2 16221 16313 K00363:111:HMW3HBBXX:7:2212:31913:7134 11 +
NC_001665.2 16222 16313 K00363:111:HMW3HBBXX:2:2118:13098:31189 11 +
NC_001665.2 16222 16313 K00363:111:HMW3HBBXX:4:1109:15747:9016 11 -
NC_001665.2 16222 16313 K00363:111:HMW3HBBXX:5:2120:7476:19795 11 -
NC_001665.2 16223 16313 K00363:111:HMW3HBBXX:2:1205:32096:24894 11 -
NC_001665.2 16223 16313 K00363:111:HMW3HBBXX:5:1107:14631:29624 11 -
NC_001665.2 16223 16313 K00363:111:HMW3HBBXX:8:2109:21379:42653 11 +
NC_001665.2 16224 16313 K00363:111:HMW3HBBXX:4:2114:8816:15399 11 -
NC_001665.2 16225 16313 K00363:111:HMW3HBBXX:7:1118:30442:5886 11 +
NC_001665.2 16226 16313 K00363:111:HMW3HBBXX:3:1215:17168:13060 11 +

Any suggestions to correct this error??

endrebak commented 6 years ago

Thanks for reporting. I should add a more informative error message. The problem is that no reads are found on any chromosome because your BED file uses non-standard chromosome names.

If this isn't a regular genome, look into the chromsizes option :)
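A more informative error could come from an early check, run before any threshold is computed. The sketch below is only an illustration of that idea, not epic's code: the function name `check_reads_found` and its parameters are hypothetical.

```python
def check_reads_found(total_chip_count, expected_chromosomes, seen_chromosomes):
    """Fail early with a readable message when no reads were counted.

    Hypothetical helper for illustration; it is not part of epic.
    """
    if total_chip_count > 0:
        return  # at least one read landed on an expected chromosome

    # Names present in the input that the genome definition does not know.
    unrecognized = sorted(set(seen_chromosomes) - set(expected_chromosomes))
    raise ValueError(
        "No reads were found on any expected chromosome. "
        "Unrecognized chromosome names in the input: "
        + (", ".join(unrecognized) or "none")
        + ". Consider renaming them or supplying --chromsizes."
    )
```

With a total ChIP count of 0, as in the log above, this would stop with a message naming `NC_001665.2` instead of reaching the `OverflowError`.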

endrebak commented 6 years ago

Forgot to add:

  --chromsizes CHROMSIZES, -cs CHROMSIZES
                        Set the chromosome lengths yourself in a file with two
                        columns: chromosome names and sizes. Useful to analyze
                        custom genomes, assemblies or simulated data. Only
                        chromosomes included in the file will be analyzed.

Please tell me if this does not solve your problem :)
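One possible way to produce such a two-column file is from a FASTA index (`.fai`, as written by `samtools faidx`), whose first two columns are the sequence name and its length. This is a minimal sketch under assumptions: the helper name `fai_to_chromsizes` is made up, the example index lines use placeholder names and values, and tab-separated output is assumed because the help text only says "two columns".

```python
def fai_to_chromsizes(fai_lines):
    """Yield 'name<TAB>size' strings from the lines of a FASTA index (.fai).

    Hypothetical helper; the tab separator is an assumption.
    """
    for line in fai_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        name, length = fields[0], int(fields[1])
        yield "{}\t{}".format(name, length)


# Placeholder index lines (names and numbers are invented for illustration).
example_fai = [
    "chrA\t1000\t6\t60\t61\n",
    "chrB\t250\t1030\t60\t61\n",
]

for row in fai_to_chromsizes(example_fai):
    print(row)
```

The names in the resulting file have to match the first column of the BED files exactly, since only chromosomes listed in the file are analyzed.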

grendon commented 6 years ago

That was a quick reply!

My genome is rat and it is part of your list. But if the tool doesn't like my naming scheme, it can't be helped! I will generate the file of chromosome sizes and try again.

Thanks a lot!

endrebak commented 6 years ago

But rn5, rn6 etc. have chromosome names chr1-chr20 + chrX + chrY.

Is NC_001665.2 actually chr2 or what?
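A quick way to see which names a BED file actually uses is to tally its first column and compare it with the names listed above for rn5/rn6. The following is a rough sketch; the function names are hypothetical, and the expected set is built only from the names mentioned in this thread.

```python
from collections import Counter

# Names given in this thread for rn5/rn6: chr1-chr20 plus chrX and chrY.
EXPECTED = {"chr{}".format(i) for i in range(1, 21)} | {"chrX", "chrY"}


def count_chromosome_names(bed_lines):
    """Count reads per chromosome name (first BED column)."""
    counts = Counter()
    for line in bed_lines:
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        counts[line.split()[0]] += 1
    return counts


def unexpected_names(counts, expected=EXPECTED):
    """Return the names (with read counts) that are not in the expected set."""
    return {name: n for name, n in counts.items() if name not in expected}
```

Passing an open BED file to `count_chromosome_names` and then calling `unexpected_names` on the result lists every name, such as `NC_001665.2`, that would not be matched.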

grendon commented 6 years ago

It is the NCBI rat genome Rnor_6.0

NC_001665.2 Rattus norvegicus strain BN/SsNHsdMCW mitochondrion, complete genome

endrebak commented 6 years ago

Then your BED files do not contain chromosome info, as far as I understand. Also, there seem to be many duplicated reads (PCR artifacts?).
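To get a rough feel for how many reads are duplicated, one can count reads that share the same chromosome, start, end and strand. This is only a sketch: that definition of "duplicate" is an assumption made here, dedicated duplicate-marking tools may use other criteria, and the function name is hypothetical.

```python
from collections import Counter


def count_duplicate_reads(bed_lines):
    """Count reads beyond the first at each (chrom, start, end, strand)."""
    positions = Counter()
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # need at least six BED columns to read the strand
        chrom, start, end, strand = fields[0], fields[1], fields[2], fields[5]
        positions[(chrom, start, end, strand)] += 1
    return sum(n - 1 for n in positions.values() if n > 1)


# The first four lines of the BED snippet posted earlier in this thread.
snippet = [
    "NC_001665.2 16221 16313 K00363:111:HMW3HBBXX:7:2212:31913:7134 11 +",
    "NC_001665.2 16222 16313 K00363:111:HMW3HBBXX:2:2118:13098:31189 11 +",
    "NC_001665.2 16222 16313 K00363:111:HMW3HBBXX:4:1109:15747:9016 11 -",
    "NC_001665.2 16222 16313 K00363:111:HMW3HBBXX:5:2120:7476:19795 11 -",
]
print(count_duplicate_reads(snippet))  # prints 1
```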