dieterich-lab / rp-bp

Rp-Bp is a Bayesian approach to predict, at base-pair resolution, ribosome occupancy and translation.
MIT License

ValueError: The periodic offsets file was found, but no periodic lengths were found. #152

Closed by jdcla 1 year ago

jdcla commented 1 year ago

Running the full Rp-Bp pipeline fails with this error for several of my datasets.

An example file for which the current error occurs:

length,highest_peak_peak,highest_peak_profile_sum,highest_peak_offset,highest_peak_bf_mean,highest_peak_bf_var
33.0,13.0,22.0,0.0,4.079312800000025,23.72223204436856
34.0,43.0,151.0,-18.0,12.269438,1.9872847081960805
35.0,12.0,61.0,-12.0,9.1025342,1.888710285887
36.0,16.0,52.0,0.0,6.7186293999999975,1.5314019861118
37.0,6.0,33.0,-8.0,0.1320677999999979,1.62079474609516
38.0,11.0,13.0,-9.0,30.894826000000023,25.176238308779997
39.0,1.0,2.0,-15.0,-60.746032,18.072419163520003
40.0,1.0,1.0,-16.0,31.887744,57.979003808032
41.0,1.0,2.0,-19.0,-60.746032,18.072419163520003
42.0,0.0,0.0,-20.0,-14.057254000000029,37.20296362789998
43.0,0.0,0.0,-20.0,-14.057254000000029,37.20296362789998
45.0,1.0,1.0,0.0,31.887744,57.979003808032
51.0,0.0,0.0,-20.0,-14.057254000000029,37.20296362789998

The command I run for the pipeline:

run-all-rpbp-instances ribo/${dataset}/out/rpbp/rpbp.yml --num-cpus 22 --logging-level INFO --mem 50G

I don't think this is caused by something specific I'm doing wrong; it could perhaps be fixed by adjusting a parameter that improves the xxx_-unique.periodic-offsets.csv file. However, I am at a loss right now.

eboileau commented 1 year ago

Hi @jdcla, if you want to report a bug, please use the template: on the bug tracker, go to "New issue" -> "Get started" (Bug report) and fill in the report accordingly. I cannot help you without the minimal information needed to trace the problem. Please also increase the logging level to DEBUG before opening an issue.

Looking at the periodic-offsets.csv file you pasted above, the highest peak profile sum you get is 151 (read length 34), so you do not even pass the min_metagene_profile_count = 1000 filter. The problem is not with Rp-Bp, but with your data. This could suggest that your library didn't work, i.e. there is no evidence to support periodicity, unless this is downsampled data or similar.
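To check this on a file before running the pipeline, here is a minimal sketch using only the Python standard library. It is not Rp-Bp's own code. It assumes the filter compares the highest_peak_profile_sum column against min_metagene_profile_count, and that a value equal to the threshold passes. The function name lengths_passing_count_filter is hypothetical. The rows are the ones pasted above.

```python
import csv
import io

# Rows copied from the periodic-offsets file pasted in this issue.
SAMPLE = """\
length,highest_peak_peak,highest_peak_profile_sum,highest_peak_offset,highest_peak_bf_mean,highest_peak_bf_var
33.0,13.0,22.0,0.0,4.079312800000025,23.72223204436856
34.0,43.0,151.0,-18.0,12.269438,1.9872847081960805
35.0,12.0,61.0,-12.0,9.1025342,1.888710285887
36.0,16.0,52.0,0.0,6.7186293999999975,1.5314019861118
37.0,6.0,33.0,-8.0,0.1320677999999979,1.62079474609516
38.0,11.0,13.0,-9.0,30.894826000000023,25.176238308779997
39.0,1.0,2.0,-15.0,-60.746032,18.072419163520003
40.0,1.0,1.0,-16.0,31.887744,57.979003808032
41.0,1.0,2.0,-19.0,-60.746032,18.072419163520003
42.0,0.0,0.0,-20.0,-14.057254000000029,37.20296362789998
43.0,0.0,0.0,-20.0,-14.057254000000029,37.20296362789998
45.0,1.0,1.0,0.0,31.887744,57.979003808032
51.0,0.0,0.0,-20.0,-14.057254000000029,37.20296362789998
"""

# Threshold quoted in this thread.
MIN_METAGENE_PROFILE_COUNT = 1000


def lengths_passing_count_filter(handle, min_count=MIN_METAGENE_PROFILE_COUNT):
    """Return (read lengths passing the count filter, largest profile sum).

    Assumption: the filter is applied to highest_peak_profile_sum and a
    value equal to min_count passes. This is a hypothetical helper, not
    part of Rp-Bp.
    """
    passing = []
    largest = 0.0
    for row in csv.DictReader(handle):
        profile_sum = float(row["highest_peak_profile_sum"])
        largest = max(largest, profile_sum)
        if profile_sum >= min_count:
            # Lengths are stored as floats such as "34.0" in the file.
            passing.append(int(float(row["length"])))
    return passing, largest


passing, largest = lengths_passing_count_filter(io.StringIO(SAMPLE))
print("lengths passing the count filter:", passing)
print("largest highest_peak_profile_sum:", largest)
```

For a real file, open it with open(path, newline="") and pass the handle instead of io.StringIO(SAMPLE). An empty list means no read length reaches the count threshold, which matches the situation described in this thread.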

Nevertheless, if you really want to use this data, or have good reasons to do so, there are a few options. You can change the default parameters in the config, or use fixed lengths and offsets; see Rp-Bp parameters.

jdcla commented 1 year ago

Thank you eboileau, the problem was indeed with the data.