smithlabcode / ribotricer

A tool for accurately detecting actively translating ORFs from Ribo-seq data
http://doi.org/djv4
GNU General Public License v3.0

Ribotricer Output #143

Closed bshim181 closed 3 weeks ago

bshim181 commented 10 months ago

Hello,

I am trying to merge two lists of ORFs, one predicted by the RibORF algorithm and one by the Ribotricer algorithm. I am aware that RibORF predicts ORFs at the transcript level while Ribotricer predicts ORFs at the exon level. I would like to create a BED file that encompasses the predictions from both outputs. Is there a way to merge ORFs predicted at different levels (transcript vs. exon)?

Is there a feature that would let me transform the exon-level predictions by Ribotricer into transcript-level predictions? Also, does it make sense to use an offset-corrected BAM file (corrected by RibORF) as input for Ribotricer?

saketkc commented 10 months ago

My suggestion would be to use the same annotation file for both. That is, once you generate the ribotricer index, you should be able to modify it and use it as an index for RibORF. I haven't used RibORF for a while, but this is the strategy we used when we benchmarked.

bshim181 commented 10 months ago

This is the ORF annotation file for ribotricer. I was wondering what the coordinates of the last column stand for.

[Image: Screenshot 2023-08-31 at 9 20 33 AM]

This is the ORF annotation for the same transcript_ID by RibORF. I was wondering how we would know the start and end coordinates of the parent transcript (not each of its isoforms), which is what the RibORF annotation specifies.

[Image: Screenshot 2023-08-31 at 9 22 01 AM]

saketkc commented 9 months ago

The last column holds the coordinates of that ORF - the comma-separated values give the start and end points of each segment (these are exons and hence not contiguous).
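To make the layout of that column concrete, here is a minimal Python sketch that splits such a coordinate string into exon blocks and writes them as a single BED12 line. It assumes each comma-separated value has the form `start-end` and that the coordinates are 1-based and inclusive; both assumptions should be checked against an actual index file. The function names and the example values are hypothetical and not part of ribotricer.

```python
from typing import List, Tuple


def parse_coordinates(coord: str) -> List[Tuple[int, int]]:
    """Split 'start-end,start-end,...' into sorted (start, end) tuples."""
    intervals = []
    for segment in coord.split(","):
        start, end = segment.strip().split("-")
        intervals.append((int(start), int(end)))
    return sorted(intervals)


def to_bed12(chrom: str, strand: str, name: str,
             intervals: List[Tuple[int, int]]) -> str:
    """Build one BED12 line. ASSUMES intervals are 1-based and inclusive."""
    chrom_start = intervals[0][0] - 1   # BED starts are 0-based
    chrom_end = intervals[-1][1]        # BED ends are exclusive
    sizes = [end - start + 1 for start, end in intervals]
    starts = [start - 1 - chrom_start for start, _ in intervals]
    fields = [
        chrom, chrom_start, chrom_end, name, 0, strand,
        chrom_start, chrom_end,         # thickStart/thickEnd span the whole ORF
        "0", len(intervals),
        ",".join(map(str, sizes)),      # blockSizes
        ",".join(map(str, starts)),     # blockStarts, relative to chrom_start
    ]
    return "\t".join(map(str, fields))


# Hypothetical two-exon ORF, not taken from a real index.
blocks = parse_coordinates("100-108,200-205")
bed_line = to_bed12("chr1", "+", "orf1", blocks)
print(bed_line)
```

Writing both tools' predictions in one block-based format like this is one possible way to compare them on equal footing, since each exon keeps its own start and size.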

Hope that helps!

bshim181 commented 9 months ago

Hello,

I have been working on transforming the Ribotricer index into a RibORF index. I have noticed that the exon boundary coordinates are not multiples of 3, while the RibORF index gives transcript-level annotations whose transcript lengths are multiples of 3. Would this cause issues? I have generated predictions with the RibORF algorithm using the transformed index. It is, however, difficult to verify whether it was converted accurately.

If you have converted RibORF annotations, or the code used to convert one index into the other, would you be able to share them with us? Or would that be difficult?
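One way to sanity-check a conversion: the individual exon segments of a spliced ORF do not need lengths that are multiples of 3, because an intron can split a codon; only the summed length of all segments should be. Below is a rough Python sketch of that check. It assumes comma-separated `start-end` values and 1-based inclusive coordinates (switch `inclusive` off for half-open coordinates); the function names and the example strings are hypothetical.

```python
from typing import List


def segment_lengths(coord: str, inclusive: bool = True) -> List[int]:
    """Length of each 'start-end' segment in a comma-separated string."""
    lengths = []
    for segment in coord.split(","):
        start, end = (int(x) for x in segment.strip().split("-"))
        # ASSUMPTION: 1-based inclusive coordinates add one base per segment.
        lengths.append(end - start + (1 if inclusive else 0))
    return lengths


def is_in_frame(coord: str, inclusive: bool = True) -> bool:
    """True when the summed segment length is a multiple of 3."""
    return sum(segment_lengths(coord, inclusive)) % 3 == 0


# Hypothetical ORF whose exons are 8 and 7 bases long: neither exon is a
# multiple of 3, but the spliced ORF (15 bases) is.
example = "100-107,200-206"
print(segment_lengths(example), is_in_frame(example))
```

Running such a check over every row of the converted index would flag ORFs whose total length changed during conversion, which usually points to an off-by-one in the coordinate convention.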

andrewdavidsmith commented 3 weeks ago

@bshim181 I'm going to close this issue because it seems that your most recent question is a separate issue. Feel free to open a separate issue on that.