biocore-ntnu / epic2

Ultraperformant reimplementation of SICER
https://doi.org/10.1093/bioinformatics/btz232
MIT License

Epic2-df #14

Closed Victor21v closed 5 years ago

Victor21v commented 5 years ago

Hi,

I am trying to run epic2-df now that you have fixed it, but when I run it, it prints the following log, which ends in an error:

"PyRanges is needed to use epic2-df, but is not included in epic2 to make the package lightweight on requirements: pip install pyranges

Running epic2 on KO.

Found a median readlength of 161.0

Using genome sacCer3.

Using an effective genome length of ~11 * 1e6

Parsing ChIP file(s): WT_Sorted.bedpe

Valid ChIP reads: 15163188 (15172397 before out of bounds removal)

Traceback (most recent call last):
  File "/home/usuario/anaconda3/bin/epic2-df", line 307, in <module>
    bins_counts_ko = _main(args)
  File "/home/usuario/anaconda3/lib/python3.6/site-packages/epic2/main.py", line 46, in _main
    chip_count, args["bin_size"], effective_genome_length, args["gaps_allowed"] * args["bin_size"], args["e_value"])
  File "epic2/src/SICER_stats.pyx", line 385, in epic2.src.SICER_stats.compute_score_threshold
  File "epic2/src/SICER_stats.pyx", line 221, in epic2.src.SICER_stats.Background_island_probscore_statistics.find_island_threshold
  File "epic2/src/SICER_stats.pyx", line 168, in epic2.src.SICER_stats.Background_island_probscore_statistics.background_island_expectation
IndexError: list index out of range"

I have already installed pyranges, so I don't know what might be wrong.

Thanks in advance!

endrebak commented 5 years ago

Thanks for reporting.

0) That error message about pyranges is a red herring. Why it shows up, I do not know. It has nothing to do with this error.

1) This happens in the main epic2 algorithm, not in the epic2-df specific parts, so it should also fail if you run the WT and/or KO individually with the regular epic2 script, right?

2) If your file is similar to mine, the error happens here:

https://github.com/biocore-ntnu/epic2/blob/master/epic2/src/SICER_stats.pyx#L168

i = self.min_tags_in_window;
while ( int(round(index - self.window_score[i]/self.bin_size))>=0): ### this line 

Only the lookup self.window_score[i] could raise that error. Is that line in your traceback?

If so, the error happens in a part that is unchanged from SICER, so I doubt that the bug originates in the epic2-specific code.

3) Do you have the possibility of sharing that file with me? A private google drive or dropbox link? Then it would likely be easy to debug :)

endrebak commented 5 years ago

4) Also can you do

cut -f 1  WT_Sorted.bedpe | sort | uniq

and send the output?

Since your file is sorted, it likely originates from a bam file, and bam files often use different chromosome names than those found in UCSC (1 instead of chr1, etc.). Dunno if this might have something to do with it.

This is what the sacCer3 chromosomes are called in epic2:

chrIV   1531933
chrXV   1091291
chrVII  1090940
chrXII  1078177
chrXVI  948066
chrXIII 924431
chrII   813184
chrXIV  784333
chrX    745751
chrXI   666816
chrV    576874
chrVIII 562643
chrIX   439888
chrIII  316620
chrVI   270161
chrI    230218
chrM    85779
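For anyone who prefers to do the check from point 4 in Python, here is a minimal sketch. It collects the unique values of the first tab-separated column (roughly what the cut | sort | uniq pipeline does) and reports any name that is not in the sacCer3 list above. The function names and the tiny in-memory example are made up for illustration; they are not part of epic2.

```python
import csv
import io

# sacCer3 chromosome names as listed above (sizes omitted).
SACCER3_CHROMS = {
    "chrI", "chrII", "chrIII", "chrIV", "chrV", "chrVI", "chrVII",
    "chrVIII", "chrIX", "chrX", "chrXI", "chrXII", "chrXIII", "chrXIV",
    "chrXV", "chrXVI", "chrM",
}


def chromosomes_in_bedpe(handle):
    """Return the sorted unique values of the first tab-separated column."""
    names = set()
    for row in csv.reader(handle, delimiter="\t"):
        if row:  # skip completely empty lines
            names.add(row[0])
    return sorted(names)


def unexpected_chromosomes(handle, expected=SACCER3_CHROMS):
    """Return the names found in the file that are not in `expected`."""
    return [name for name in chromosomes_in_bedpe(handle) if name not in expected]


# Made-up two-line example standing in for the real WT_Sorted.bedpe.
example = io.StringIO("chrI\t10\t50\tchrI\t100\t150\n1\t10\t50\t1\t100\t150\n")
print(unexpected_chromosomes(example))
```

For the real file, pass an open file handle instead, for example with open("WT_Sorted.bedpe", newline="") as handle. An empty result would mean every name in the file matches the list above.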

Again, thanks for helping me debug this. I am going away on Easter vacation for a week, but this is very high on my list of priorities :)

endrebak commented 5 years ago

Actually, ignore 4. If you used different names epic2 would have errored before.

I think the bug is here:

https://github.com/biocore-ntnu/epic2/blob/master/epic2/src/SICER_stats.pyx#L19

# Precalculate the poisson, cumulative poisson values up to max (500, 2*self.average) .
self.max_index = max (500, int(2*self.average)) ;

This comes from the original SICER, written before datasets were huuuuge like today. It arbitrarily caps the number of precomputed Poisson and cumulative Poisson values at max(500, 2 * average), and if you need a higher index it just fails. I am hesitant to change this, as that whole piece of code is hard to understand. Will play around with it a bit :)
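To illustrate the failure mode, here is a minimal Python sketch, not the SICER or epic2 code. It assumes the precomputed table holds exactly max_index entries and that a lookup past the end is what raises the IndexError. The class names and the grow-on-demand variant are hypothetical; this is not a description of what any epic2 flag actually does.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability P(X = k), computed in log space to avoid overflow."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


class FixedPoissonTable:
    """Precomputes values for indices 0..max_index-1 only (assumed layout)."""

    def __init__(self, average):
        self.average = average
        # Same cap as in the snippet above.
        self.max_index = max(500, int(2 * average))
        self.values = [poisson_pmf(k, average) for k in range(self.max_index)]

    def __getitem__(self, k):
        # Plain list lookup: raises IndexError when k >= max_index.
        return self.values[k]


class GrowingPoissonTable(FixedPoissonTable):
    """Hypothetical alternative: extend the table when a larger index is requested."""

    def __getitem__(self, k):
        while k >= len(self.values):
            self.values.append(poisson_pmf(len(self.values), self.average))
        return self.values[k]


fixed = FixedPoissonTable(3.0)
try:
    fixed[600]
except IndexError as err:
    print("fixed table:", err)

growing = GrowingPoissonTable(3.0)
print("growing table size after lookup:", (growing[600], len(growing.values))[1])
```

With an average of 3.0 the cap is 500, so index 600 is out of range for the fixed table, while the growing variant simply computes the missing entries.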

endrebak commented 5 years ago

Now I added the feature

--experimental-statistics

to epic2. Try it and see if it works. If so, this fixes a bug in the original SICER algorithm :)

pip install epic2==0.0.28

Victor21v commented 5 years ago

Hi,

Thanks for the responses. I have updated epic2 and it still fails with the same error in the same functions (although the line numbers are different now):

Traceback (most recent call last):
  File "/home/usuario/anaconda3/bin/epic2-df", line 337, in <module>
    bins_counts_ko = _main(args)
  File "/home/usuario/anaconda3/lib/python3.6/site-packages/epic2/main.py", line 54, in _main
    args["gaps_allowed"] * args["bin_size"], args["e_value"])
  File "epic2/src/SICER_stats.pyx", line 401, in epic2.src.SICER_stats.compute_score_threshold
  File "epic2/src/SICER_stats.pyx", line 232, in epic2.src.SICER_stats.Background_island_probscore_statistics.find_island_threshold
  File "epic2/src/SICER_stats.pyx", line 176, in epic2.src.SICER_stats.Background_island_probscore_statistics.background_island_expectation
IndexError: list index out of range

I can share the files with you, but each of them is larger than 1 GB.

endrebak commented 5 years ago

You did not use the flag --experimental-statistics it seems. Am I right? :)

Victor21v commented 5 years ago

I didn't. Now I did, and it displays a new error message:

Traceback (most recent call last):
  File "/home/usuario/anaconda3/bin/epic2-df", line 353, in <module>
    ko = pd.read_csv(
NameError: name 'pd' is not defined

Victor21v commented 5 years ago

It displays that error, but the output files were created, so I'm not sure whether this error affects the results.
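A general Python note that may explain why files can exist despite the error: a missing name such as pd is only detected when the line that uses it actually runs, so everything before that line has already executed. The sketch below shows the pattern with a made-up function and a temporary file. It does not show which steps epic2-df had finished at that point, so it says nothing about whether the real output files are complete.

```python
import os
import tempfile


def write_then_summarise(out_path):
    """Write results first, then use a name that was never imported."""
    with open(out_path, "w") as handle:
        handle.write("chrI\t0\t200\n")  # output is written and closed here
    # 'pd' is deliberately not imported, mimicking the reported NameError.
    return pd.read_csv(out_path, sep="\t")  # fails only when this line runs


out_path = os.path.join(tempfile.mkdtemp(), "islands.bed")
try:
    write_then_summarise(out_path)
except NameError as err:
    error_message = str(err)

file_was_written = os.path.exists(out_path)
print(error_message, "| file written:", file_was_written)
```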

endrebak commented 5 years ago

Yay! This means I fixed a bug in the original SICER algorithm. The new weirdness is due to me and I will fix it later today :)

Thanks for the help 🙏🏻

Endre


endrebak commented 5 years ago

pip install epic2==0.0.29

You might also want to run pip install --upgrade pyranges.

Thanks for trying it out.