smithlabcode / ribotricer

A tool for accurately detecting actively translating ORFs from Ribo-seq data
http://doi.org/djv4
GNU General Public License v3.0

[BUG] Weird results when using profile from Ribotricer output ? #138

Closed polklin closed 1 year ago

polklin commented 1 year ago

Description

I wanted to generate a plot of the reads assigned to each frame for several of my ORF candidates, using the profile column of the Ribotricer output file.

For example, I tried with this ORF candidate, found on the reverse strand: [image]

What I Did

I used the notebook example you kindly provide: https://github.com/smithlabcode/ribotricer/blob/master/notebooks/Plotting_ribotricer_profile.ipynb

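import pandas as pd

# Per-nucleotide read counts copied from the profile column for this ORF
profile = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 1, 0, 2, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 3, 0, 0, 6, 3, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0]
# plot_framewise_counts is defined in the notebook linked above
plot_framewise_counts(pd.Series(profile, index=range(1, len(profile) + 1)))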

I get this plot:

[image]

This ORF is predicted as translating (phase_score > human_threshold), but I do not understand how that is possible, since Frame 3 is predominant over Frame 1 at nearly all positions.

I am also wondering whether the profile should be read from right to left when the ORF is on the reverse strand?

Thanks for your help and for ribotricer! Best, Paul

saketkc commented 1 year ago

@polklin ribotricer looks for a continuous high-low-low pattern over the ORF. In your case, the high-low-low pattern seems to be present, with the majority of reads arising from Frame 3. Ribotricer is agnostic to the phase in which this pattern occurs.
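To see which frame carries most of the reads, the per-frame totals can be tallied directly from the profile. This is only a minimal sketch and not ribotricer's phase-score computation; `frame_totals` is a hypothetical helper, and it assumes the first profile position belongs to Frame 1 (the same convention as the 1-based index used in the plotting call above).

```python
# Illustrative sketch: per-frame read totals for the profile posted above.
# This is NOT how ribotricer computes its phase score.
profile = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 1, 0, 2, 0, 0, 1, 1,
           0, 0, 1, 0, 1, 0, 0, 3, 0, 0, 6, 3, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0]


def frame_totals(profile):
    """Sum read counts per frame; assumes position 0 of the list is Frame 1."""
    totals = [0, 0, 0]
    for position, count in enumerate(profile):
        totals[position % 3] += count
    return totals


totals = frame_totals(profile)
dominant_frame = totals.index(max(totals)) + 1  # report frames as 1, 2, 3

print(totals)          # [8, 0, 17]
print(dominant_frame)  # 3
```

For this ORF the totals are 8, 0, and 17 reads for Frames 1, 2, and 3, which matches the observation that Frame 3 dominates while one frame stays consistently high and at least one stays low.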

You can use other measures to subset good-quality ORFs: the total number of reads and whether or not there are reads at the start codon.
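Both measures can be computed from the profile itself. The sketch below is hedged in several ways: `orf_summary` and `passes_filters` are hypothetical helpers, the `min_total_reads` value is a placeholder rather than a recommended threshold, and the code assumes the first three profile positions correspond to the start codon, which may not hold if the profile of a reverse-strand ORF is stored in a different orientation.

```python
# Illustrative sketch: simple quality measures derived from a profile.
# The threshold below is a placeholder, not a recommendation from ribotricer.
profile = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 1, 0, 2, 0, 0, 1, 1,
           0, 0, 1, 0, 1, 0, 0, 3, 0, 0, 6, 3, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0]


def orf_summary(profile):
    """Return total reads and the read counts over the first three positions.

    Assumes the profile starts at the first nucleotide of the start codon.
    """
    return {
        "total_reads": sum(profile),
        "start_codon_reads": list(profile[:3]),
    }


def passes_filters(profile, min_total_reads, require_start_codon_reads=True):
    """Apply the two filters; both criteria are user-chosen assumptions."""
    summary = orf_summary(profile)
    if summary["total_reads"] < min_total_reads:
        return False
    if require_start_codon_reads and sum(summary["start_codon_reads"]) == 0:
        return False
    return True


summary = orf_summary(profile)
print(summary)
print(passes_filters(profile, min_total_reads=10))
```

Under these assumptions, the example ORF has 25 reads in total but none over its first three positions, so it would only pass if the start-codon requirement were switched off.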

polklin commented 1 year ago

Hello @saketkc, thank you very much for the explanation. From your experience, do you have any recommendations about the minimum thresholds that should be used to get good-quality ORFs?

For reads at the start codon, should only Frame 1 be considered, or should the other frames be taken into account too?