GenomeRIK / tama

Transcriptome Annotation by Modular Algorithms (for long read RNA sequencing data)
GNU General Public License v3.0

TAMA ORF seeker #96

Closed bijendrabio closed 1 year ago

bijendrabio commented 1 year ago

Hello, I am curious to know how TAMA ORF Seeker identifies sense (+) and antisense (-) ORFs. I was not able to retrieve the same predicted (-) ORFs using another ORF finder. Any suggestions as to why this may be the case?

Also, for the following ORF headers:

G1;G1.1::1:29-2665(+):F1:0:719:1:239:239:I
G1;G1.1::1:29-2665(+):F1:132:719:45:239:195:M
G1;G1.1::1:29-2665(+):F3:2:64:1:20:20:F

Do :I and :F refer to incomplete ORFs, i.e. ORFs without start codons? Please advise.

Regards, B

GenomeRIK commented 1 year ago

Hello B,

Thank you for using TAMA!

TAMA ORF Seeker only assesses the three sense frames. It assumes that you ran the full ORF/NMD pipeline, in which the strandedness of the sequence should already have been resolved from the transcript model. If you do not know the strandedness of the model and would like to assess both strands, you would need to create the reverse complement of the sequence to assess the antisense frames.
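A minimal sketch of the reverse-complement step in Python, in case it helps. The function name `reverse_complement` is illustrative and not part of TAMA, and the sketch assumes plain DNA sequences containing only A, C, G, T and N (other IUPAC codes would pass through unchanged).

```python
# Illustrative helper, not part of TAMA.
# Assumes DNA input using only A, C, G, T and N (upper or lower case).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    # Complement every base, then reverse the whole string.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense = "ATGCCN"
    antisense = reverse_complement(sense)
    # The antisense sequence can then be passed through the same
    # three-frame search to cover the other strand.
    print(antisense)
```

Applying the function twice returns the original sequence, which is a quick sanity check before running the reverse-complemented sequences through the ORF search.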

Regarding the last field in the header, you are correct. It represents the first codon in the frame, so :I and :F mark ORFs that do not begin with a start codon.
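As a rough sketch, the last field can be pulled out of each header to separate ORFs that begin with a start codon from those that do not. This assumes the last colon-separated field is a one-letter amino acid code for the first codon, as the I, M and F values in the headers above suggest. The function names are illustrative and not part of TAMA.

```python
# Illustrative helpers, not part of TAMA.
# Assumption: the last colon-separated field of the header is the
# one-letter amino acid code of the first codon in the ORF.


def first_codon_field(header: str) -> str:
    """Return the last colon-separated field of an ORF header."""
    return header.strip().rsplit(":", 1)[-1]


def starts_with_met(header: str) -> bool:
    """True if the ORF begins with methionine (M), i.e. a start codon."""
    return first_codon_field(header) == "M"


if __name__ == "__main__":
    headers = [
        "G1;G1.1::1:29-2665(+):F1:0:719:1:239:239:I",
        "G1;G1.1::1:29-2665(+):F1:132:719:45:239:195:M",
        "G1;G1.1::1:29-2665(+):F3:2:64:1:20:20:F",
    ]
    for header in headers:
        label = "has start codon" if starts_with_met(header) else "no start codon"
        print(first_codon_field(header), label)
```

Splitting from the right with `rsplit` avoids relying on the number or meaning of the earlier fields, which contain colons of their own.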

Thank you, Richard