biocore-ntnu / epic2

Ultraperformant reimplementation of SICER
https://doi.org/10.1093/bioinformatics/btz232
MIT License
56 stars 9 forks source link

OverflowError reads_to_bins #36

Closed friederhadlich closed 4 years ago

friederhadlich commented 4 years ago

Hi Endre,

I sequentially used this call epic2 -bin 200 -g 3 -fdr 0.05 -t $TREATMENT -c $CONTROL -cs ../4_epic2/Bos_taurus.ARS-UCD1.2.dna.toplevel.chromsizesand epic2 stops with an error message for some of my treatments.

Do you have an idea of how to continue?

This is the typical error (console output):

Found a median readlength of 50.0

Using chromosome sizes found in ../4_epic2/Bos_taurus.ARS-UCD1.2.dna.toplevel.chromsizes.

Using an effective genome length of ~2172 * 1e6

Parsing ChIP file(s): H3K4me1_MACT_P60_E.sorted.bam Traceback (most recent call last): File "/home/hadlich/miniconda/bin/epic2", line 248, in _main(args) File "/home/hadlich/miniconda/lib/python3.7/site-packages/epic2/main.py", line 40, in _main args["treatment"], args, "ChIP") File "epic2/src/reads_to_bins.pyx", line 183, in epic2.src.reads_to_bins.files_to_bin_counts File "epic2/src/reads_to_bins.pyx", line 309, in epic2.src.reads_to_bins.files_to_bin_counts File "epic2/src/reads_to_bins.pyx", line 61, in epic2.src.reads_to_bins.remove_out_of_bounds_bins OverflowError: Python int too large to convert to C long

I checked the chromosome sizes and they are all fine! Maybe the BWA mappings ahead cause the error ... but why this error occurs only for a few treatments?

Hope you can help ... or give an idea to continue.

Greetings from Germany,

Frieder

endrebak commented 4 years ago

Thanks for creating an issue. Do you have the possibility to share the files with me?

endrebak commented 4 years ago

The error is here:

https://github.com/biocore-ntnu/epic2/blob/master/epic2/src/reads_to_bins.pyx#L61

The error message likely means that either one of your chromsizes or one of the positions in the alignment files are above the uint32 max... Or that they were a negative number and got misinterpreted as a large int.

If you cannot share the files I suggest you go through the starts and ends in your alignment files and see what the max and min value of all the read positions are. Sometimes nonaligned reads are given a start and end of -1 for example.

friederhadlich commented 4 years ago

I used Bos_taurus.ARS-UCD1.2.dna.toplevel.fa as genomic reference for the BWA mapping. Real data cannot be uploaded ... I will check the mapping results for erroneous positions and let you know. Thanks a lot!

endrebak commented 4 years ago

Try converting to bed files with bedtools and see if it works then? Good luck and feel free to ping me for anything. Without the files it is impossible for me to debug as I am sure you understand :)

On Mon, Jul 13, 2020 at 2:51 PM friederhadlich notifications@github.com wrote:

I used Bos_taurus.ARS-UCD1.2.dna.toplevel.fa as genomic reference for the BWA mapping. Real data cannot be uploaded ... I will check the mapping results for erroneous positions and let you know. Thanks a lot!

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/biocore-ntnu/epic2/issues/36#issuecomment-657541892, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEHURUXFBZS5Z56PIDPDQXLR3L7OZANCNFSM4OYOFEEQ .

friederhadlich commented 4 years ago

Unfortunately, bed files aren't the solution for me! I tried control and treatment inputs with bam and bed format without any effect.

The problem must be in parameter "fs" ... please see these calls: epic2 -t ../3_bwa/H3K27Ac_MACT_p59_E.sorted.bam -bin 200 -g 3 -fdr 0.05 -fs 300 -egf 0.8 -c ../3_bwa/INPUT_pool_MACT_E.sorted.bam.bed -cs Bos_taurus.ARS-UCD1.2.dna.toplevel.chromsizes > H3K27Ac_MACT_p59_E.txt

Found a median readlength of 50.0

Using chromosome sizes found in Bos_taurus.ARS-UCD1.2.dna.toplevel.chromsizes.

Using an effective genome length of ~2172 * 1e6

Parsing ChIP file(s): ../3_bwa/H3K27Ac_MACT_p59_E.sorted.bam Traceback (most recent call last): File "/home/hadlich/miniconda/bin/epic2", line 248, in _main(args) File "/home/hadlich/miniconda/lib/python3.7/site-packages/epic2/main.py", line 40, in _main args["treatment"], args, "ChIP") File "epic2/src/reads_to_bins.pyx", line 183, in epic2.src.reads_to_bins.files_to_bin_counts File "epic2/src/reads_to_bins.pyx", line 309, in epic2.src.reads_to_bins.files_to_bin_counts File "epic2/src/reads_to_bins.pyx", line 61, in epic2.src.reads_to_bins.remove_out_of_bounds_bins OverflowError: Python int too large to convert to C long

epic2 -t ../3_bwa/H3K27Ac_MACT_p59_E.sorted.bam -bin 200 -g 3 -fdr 0.05 -egf 0.8 -c ../3_bwa/INPUT_pool_MACT_E.sorted.bam.bed -cs Bos_taurus.ARS-UCD1.2.dna.toplevel.chromsizes > H3K27Ac_MACT_p59_E.txt

Found a median readlength of 50.0

Using chromosome sizes found in Bos_taurus.ARS-UCD1.2.dna.toplevel.chromsizes.

Using an effective genome length of ~2172 * 1e6

Parsing ChIP file(s): ../3_bwa/H3K27Ac_MACT_p59_E.sorted.bam

Valid ChIP reads: 11245276 (11245276 before out of bounds removal)

Score threshold: 19.652

Number of tags in a window: 3

Number of islands found: 39544

Parsing Input file(s): ../3_bwa/INPUT_pool_MACT_E.sorted.bam.bed

Valid Background reads: 23502103 (23502128 before out of bounds removal)

friederhadlich commented 4 years ago

Chromosome positions of the mapping file are fine for treatment and control, as you can see here for control:

cat ../3_bwa/INPUT_pool_MACT_E.sorted.bam.bed | cut -f 3 | awk 'BEGIN{a=10000;b=0};{if ($0<a) a=$0; if ($0>b) b=$0}; END{print a,b;}'

37 158533984

cat ../3_bwa/INPUT_pool_MACT_E.sorted.bam.bed | cut -f 2 | awk 'BEGIN{a=10000;b=0};{if ($0<a) a=$0; if ($0>b) b=$0}; END{print a,b;}'

0 158533934

friederhadlich commented 4 years ago

One more running example: epic2 -t ../3_bwa/H3K27Ac_MACT_p59_E.sorted.bam -bin 500 -g 3 -fdr 0.05 -fs 300 -egf 0.8 -c ../3_bwa/INPUT_pool_MACT_E.sorted.bam.bed -cs Bos_taurus.ARS-UCD1.2.dna.toplevel.chromsizes > H3K27Ac_MACT_p59_E.txt

endrebak commented 4 years ago

I will look at it in a few weeks. Taking a vacation now :)

Thanks for providing more details :)

On Wednesday, July 15, 2020, friederhadlich notifications@github.com wrote:

One more running example: epic2 -t ../3_bwa/H3K27Ac_MACT_p59_E.sorted.bam -bin 500 -g 3 -fdr 0.05 -fs 300 -egf 0.8 -c ../3_bwa/INPUT_pool_MACT_E.sorted.bam.bed -cs Bos_taurus.ARS-UCD1.2.dna.toplevel.chromsizes > H3K27Ac_MACT_p59_E.txt

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/biocore-ntnu/epic2/issues/36#issuecomment-658719246, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEHURURJZXZ7WV7FYTLFOQTR3WIYPANCNFSM4OYOFEEQ .

millerh1 commented 4 years ago

I am also seeing this error -- but I figured out a temporary workaround. I am using hg38 chrom.sizes from UCSC link and a paired-end fastq file which I aligned with bwa and converted to a bed file.

Here is the epic2 command:

epic2 -t RSeq_out/tmp/pe_beds/SRX2481503_S96_DRIP_Seq_1.hg38.experiment.bed -e 100 -fdr .01 --effective-genome-fraction 0.6753009149366569 --mapq 30 --chromsizes RSeq_out/tmp/hg38.chrom.sizes --output RSeq_out/peaks_epic_unstranded/SRX2481503_S96_DRIP_Seq_1_hg38.bed

And the output:

Found a median fragment size of 183.5

Using chromosome sizes found in RSeq_out/tmp/hg38.chrom.sizes.

Using an effective genome length of ~2167 * 1e6

Parsing ChIP file(s):
  RSeq_out/tmp/pe_beds/SRX2481503_S96_DRIP_Seq_1.hg38.experiment.bed
Traceback (most recent call last):
  File "/home/UTHSCSA/millerh1/miniconda3/envs/RSeqEnv/bin/epic2", line 248, in <module>
    _main(args)
  File "/home/UTHSCSA/millerh1/miniconda3/envs/RSeqEnv/lib/python3.7/site-packages/epic2/main.py", line 40, in _main
    args["treatment"], args, "ChIP")
  File "epic2/src/reads_to_bins.pyx", line 183, in epic2.src.reads_to_bins.files_to_bin_counts
  File "epic2/src/reads_to_bins.pyx", line 309, in epic2.src.reads_to_bins.files_to_bin_counts
  File "epic2/src/reads_to_bins.pyx", line 61, in epic2.src.reads_to_bins.remove_out_of_bounds_bins
OverflowError: Python int too large to convert to C long

The error goes away when I remove all the non-standard chromosomes from the chrom.sizes file, though I don't understand why. Regardless most people will have these non-standard chromosomes so it would be great if using them was supported.

Here's chrom.sizes modified by me:

chr1    248956422
chr2    242193529
chr3    198295559
chr4    190214555
chr5    181538259
chr6    170805979
chr7    159345973
chrX    156040895
chr8    145138636
chr9    138394717
chr11   135086622
chr10   133797422
chr12   133275309
chr13   114364328
chr14   107043718
chr15   101991189
chr16   90338345
chr17   83257441
chr18   80373285
chr20   64444167
chr19   58617616
chrY    57227415
chr22   50818468
chr21   46709983

The output of the same epic2 command as before, but with the updated chrom.sizes:

effective-genome-fraction 0.6753009149366569 --mapq 30 --chromsizes RSeq_out/tmp/hg38.chrom.sizes --output RSeq_out/peaks_epic_unstranded/SRX2481503_S96_DRIP_Seq_1_hg38.bed
Found a median fragment size of 183.5

Using chromosome sizes found in RSeq_out/tmp/hg38.chrom.sizes.

Using an effective genome length of ~2085 * 1e6

Parsing ChIP file(s):
  RSeq_out/tmp/pe_beds/SRX2481503_S96_DRIP_Seq_1.hg38.experiment.bed
Chromosome chr10_GL383545v1_alt not in the chromosome sizes: chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY
Chromosome chr10_GL383545v1_alt not in the chromosome sizes: chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY
...
Chromosome chrUn_KI270757v1 not in the chromosome sizes: chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY
Chromosome chrX_KI270880v1_alt not in the chromosome sizes: chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY
Chromosome chrX_KI270913v1_alt not in the chromosome sizes: chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY

Valid ChIP reads: 24012551 (24017420 before out of bounds removal)

Score threshold: 27.632

Number of tags in a window: 5

Number of islands found: 6736

Empirical estimate of FDR is: 0.014845605700712588 (100/6736)

Hope this is helpful!

Best, Henry

endrebak commented 4 years ago

I will definitely try to fix this soon. Will test with non-standard chromosomes and see what happens.

On Tue, Jul 28, 2020 at 11:16 PM Henry Miller notifications@github.com wrote:

I am also seeing this error -- but I figured out a temporary workaround. I am using hg38 chrom.sizes from UCSC link https://hgdownload-test.gi.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes and a paired-end fastq file which I aligned with bwa and converted to a bed file.

Here is the epic2 command:

epic2 -t RSeq_out/tmp/pe_beds/SRX2481503_S96_DRIP_Seq_1.hg38.experiment.bed -e 100 -fdr .01 --effective-genome-fraction 0.6753009149366569 --mapq 30 --chromsizes RSeq_out/tmp/hg38.chrom.sizes --output RSeq_out/peaks_epic_unstranded/SRX2481503_S96_DRIP_Seq_1_hg38.bed

And the output:

Found a median fragment size of 183.5

Using chromosome sizes found in RSeq_out/tmp/hg38.chrom.sizes.

Using an effective genome length of ~2167 * 1e6

Parsing ChIP file(s): RSeq_out/tmp/pe_beds/SRX2481503_S96_DRIP_Seq_1.hg38.experiment.bed Traceback (most recent call last): File "/home/UTHSCSA/millerh1/miniconda3/envs/RSeqEnv/bin/epic2", line 248, in _main(args) File "/home/UTHSCSA/millerh1/miniconda3/envs/RSeqEnv/lib/python3.7/site-packages/epic2/main.py", line 40, in _main args["treatment"], args, "ChIP") File "epic2/src/reads_to_bins.pyx", line 183, in epic2.src.reads_to_bins.files_to_bin_counts File "epic2/src/reads_to_bins.pyx", line 309, in epic2.src.reads_to_bins.files_to_bin_counts File "epic2/src/reads_to_bins.pyx", line 61, in epic2.src.reads_to_bins.remove_out_of_bounds_bins OverflowError: Python int too large to convert to C long

The error goes away when I remove all the non-standard chromosomes from the chrom.sizes file, though I don't understand why. Regardless most people will have these non-standard chromosomes so it would be great if using them was supported.

Here's chrom.sizes modified by me:

chr1 248956422 chr2 242193529 chr3 198295559 chr4 190214555 chr5 181538259 chr6 170805979 chr7 159345973 chrX 156040895 chr8 145138636 chr9 138394717 chr11 135086622 chr10 133797422 chr12 133275309 chr13 114364328 chr14 107043718 chr15 101991189 chr16 90338345 chr17 83257441 chr18 80373285 chr20 64444167 chr19 58617616 chrY 57227415 chr22 50818468 chr21 46709983

The output of the same epic2 command as before, but with the updated chrom.sizes:

effective-genome-fraction 0.6753009149366569 --mapq 30 --chromsizes RSeq_out/tmp/hg38.chrom.sizes --output RSeq_out/peaks_epic_unstranded/SRX2481503_S96_DRIP_Seq_1_hg38.bed Found a median fragment size of 183.5

Using chromosome sizes found in RSeq_out/tmp/hg38.chrom.sizes.

Using an effective genome length of ~2085 * 1e6

Parsing ChIP file(s): RSeq_out/tmp/pe_beds/SRX2481503_S96_DRIP_Seq_1.hg38.experiment.bed Chromosome chr10_GL383545v1_alt not in the chromosome sizes: chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY Chromosome chr10_GL383545v1_alt not in the chromosome sizes: chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY ... Chromosome chrUn_KI270757v1 not in the chromosome sizes: chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY Chromosome chrX_KI270880v1_alt not in the chromosome sizes: chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY Chromosome chrX_KI270913v1_alt not in the chromosome sizes: chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY

Valid ChIP reads: 24012551 (24017420 before out of bounds removal)

Score threshold: 27.632

Number of tags in a window: 5

Number of islands found: 6736

Empirical estimate of FDR is: 0.014845605700712588 (100/6736)

Hope this is helpful!

Best, Henry

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/biocore-ntnu/epic2/issues/36#issuecomment-665287584, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEHURUULS44QJPLXKM365ZLR5452HANCNFSM4OYOFEEQ .

endrebak commented 4 years ago

I have now started my sleuthing to fix this problem.

There is one thing which I did not think of communicating: if you include chromosomes in the chromsizes file that are not in the reads, the algorithm becomes more conservative. It computes a Poisson parameter based on the number of reads in the genome. With more chromosomes, the genome becomes longer, and hence less sparsely populated.

I have tried:

but have been unable to break epic2. If anyone can upload a breaking example I will do my best to fix this problem. But I have not been able to reproduce. Version: 0.0.41. If the data is semi-sensitive you can 7zip with a password and send the file by e-mail.

endrebak commented 4 years ago
epic2 -bin 200 -g 3 -fdr 0.05 -t /mnt/cargo/endrebak/control.sorted.filtered.bam -cs Bos_taurus.ARS-UCD1.2.dna.toplevel.chromsizes > /dev/null
Found a median readlength of 50.0

Using chromosome sizes found in Bos_taurus.ARS-UCD1.2.dna.toplevel.chromsizes.

Using an effective genome length of ~2344 * 1e6

Parsing ChIP file(s):
  /mnt/cargo/endrebak/control.sorted.filtered.bam

Valid ChIP reads: 1949548 (1949548 before out of bounds removal)

Score threshold: 22.615000000000002

Number of tags in a window: 1

Number of islands found: 16028

Empirical estimate of FDR is: 0.062390816071874224 (1000/16028)

I.e. was not able to reproduce. Tomorrow I will wrap the affected code in a try/catch and print the error with detailed information. Then you can report that info back to me and I will try my best to understand and fix it :)

friederhadlich commented 4 years ago

Hi Endre,

please use treatment and control argument to reproduce the error ... i.e.

epic2 -t treatment.sorted.bam -bin 200 -g 3 -fdr 0.05 -fs 300 -egf 0.8 -c control.sorted.filtered.bam -cs Bos_taurus.ARS-UCD1.2.dna.toplevel.chromsizes

Good luck, Frieder

endrebak commented 4 years ago
epic2 -fs 300 -bin 200 -g 3 -fdr 0.05 -t /mnt/cargo/endrebak/treatment.sorted.filtered.bam -c /mnt/cargo/endrebak/control.sorted.filtered.bam -cs Bos_taurus.ARS-UCD1.2.dna.toplevel.chromsizes > /dev/null
Found a median readlength of 50.0

Using chromosome sizes found in Bos_taurus.ARS-UCD1.2.dna.toplevel.chromsizes.

Using an effective genome length of ~2344 * 1e6

Parsing ChIP file(s):
  /mnt/cargo/endrebak/treatment.sorted.filtered.bam

Valid ChIP reads: 1185813 (1185813 before out of bounds removal)

Score threshold: 17.337

Number of tags in a window: 1

Number of islands found: 9343

Parsing Input file(s):
  /mnt/cargo/endrebak/control.sorted.filtered.bam

Valid Background reads: 1949548 (1949548 before out of bounds removal)

I still am unable to reproduce :) So I will stick to my plan of implementing good reporting in the case of errors.

I am using version 0.0.41 and linux. What about you people?

endrebak commented 4 years ago

Now a 0.0.42 is out which should give a better error message when it happens :).

friederhadlich commented 4 years ago

Strange! All my samples pass epic2 with version 0.0.42! Thx a lot for your support!

endrebak commented 4 years ago

Indexplicable weirdness is common 🙈 Thanks for reporting.

It might have been that I forgot to raise an exception if I caught one. Were any error messages?

endrebak commented 4 years ago

Look for an error message like this:

    except OverflowError as e:
        print(e)
        print("\nAdditional info:\n")
        print("Chromosome:", chromosome)
        print("Chromsize:", chromsize)
endrebak commented 4 years ago

0.0.43 (hotfix) should raise an error if OverflowErrors happen again. These were noisily ignored in 0.0.42

millerh1 commented 4 years ago

I am still getting this error with epic2 0.0.42... though it finishes and gives some additional info:

(base) millerh1@cbbi16:~/Bishop.lab/Projects/RSeq$ epic2 -t RSeq_out/tmp/pe_beds/SRX2481503_S96_DRIP_Seq_1.hg38.experiment.bed -e 100 -fdr .01 --effective-genome-fraction 0.6753009149366569 --mapq 30 --chromsizes RSeq_out/tmp/hg38.chrom.sizes --output RSeq_out/peaks_epic_unstranded/SRX2481503_S96_DRIP_Seq_1_hg38.bed
Found a median fragment size of 183.5

Using chromosome sizes found in RSeq_out/tmp/hg38.chrom.sizes.

Using an effective genome length of ~2167 * 1e6

Parsing ChIP file(s): 
  RSeq_out/tmp/pe_beds/SRX2481503_S96_DRIP_Seq_1.hg38.experiment.bed 
Python int too large to convert to C long

Additional info:

Chromosome: chr2_GL582966v2_alt
Chromsize: 96131

Valid ChIP reads: 24120376 (24125321 before out of bounds removal)

Score threshold: 45.687

Number of tags in a window: 4

Number of islands found: 5165

Empirical estimate of FDR is: 0.01936108422071636 (100/5165) 

The chrom.sizes are here: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes

I'm hosting the bed file here: https://hem-misc.s3-us-west-2.amazonaws.com/SRX2481503_S96_DRIP_Seq_1.hg38.experiment.bed

Thank you!

Henry

endrebak commented 4 years ago

I will look at it. Thanks!

On Fri, Aug 14, 2020 at 10:47 PM Henry Miller notifications@github.com wrote:

I am still getting this error with epic2 0.0.42... though it finishes and gives some additional info:

(base) millerh1@cbbi16:~/Bishop.lab/Projects/RSeq$ epic2 -t RSeq_out/tmp/pe_beds/SRX2481503_S96_DRIP_Seq_1.hg38.experiment.bed -e 100 -fdr .01 --effective-genome-fraction 0.6753009149366569 --mapq 30 --chromsizes RSeq_out/tmp/hg38.chrom.sizes --output RSeq_out/peaks_epic_unstranded/SRX2481503_S96_DRIP_Seq_1_hg38.bed Found a median fragment size of 183.5

Using chromosome sizes found in RSeq_out/tmp/hg38.chrom.sizes.

Using an effective genome length of ~2167 * 1e6

Parsing ChIP file(s): RSeq_out/tmp/pe_beds/SRX2481503_S96_DRIP_Seq_1.hg38.experiment.bed Python int too large to convert to C long

Additional info:

Chromosome: chr2_GL582966v2_alt Chromsize: 96131

Valid ChIP reads: 24120376 (24125321 before out of bounds removal)

Score threshold: 45.687

Number of tags in a window: 4

Number of islands found: 5165

Empirical estimate of FDR is: 0.01936108422071636 (100/5165)

The chrom.sizes are here: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes

I'm hosting the bed file here: https://hem-misc.s3-us-west-2.amazonaws.com/SRX2481503_S96_DRIP_Seq_1.hg38.experiment.bed

Thank you!

Henry

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/biocore-ntnu/epic2/issues/36#issuecomment-674265922, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEHURUWDMIUJYM2YXMY5GBLSAWPGDANCNFSM4OYOFEEQ .

endrebak commented 4 years ago

What about now? 0.0.44 is out :)

millerh1 commented 4 years ago

Well the number of reads and peaks does not change, but the error disappeared! I'm assuming that this is alright now. Thanks!

endrebak commented 4 years ago

No problem. There must have been some weirdness with the Cython and C++ interop is my guess :)