Open Tang-pro opened 8 months ago
Hello, @apcamargo
I have the protein sequences predicted by RNAsamba, but no CDS sequences. I used TransDecoder to predict ORFs, using the RNAsamba protein sequences as the BLASTP database, integrated the hits into the prediction, and selected the most likely CDS from the results. Is this method feasible?
Looking forward to your reply. Thank you!
Yes, that makes sense. RNAsamba just takes the longest CDS in the transcript. TransDecoder will give you good results.
Hi, @apcamargo. Sorry to bother you again. With this method, RNAsamba predicted 174,084 protein sequences, but TransDecoder returned only 171,513 CDS sequences. How should I interpret this difference? Is the method still feasible?
This could be because RNAsamba is good at identifying truncated transcripts, which might not appear if you require complete ORFs in TransDecoder. Another possibility is that TransDecoder applies a few filters that remove some ORFs.
If you just want ORFs for these transcripts, you can use an ORF extraction tool, such as OrfM or seqkit.
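If it helps, here is a minimal Python sketch of how the CDS behind a given protein could be recovered from its transcript. It is not part of RNAsamba, OrfM, or seqkit: `translate` and `find_cds` are hypothetical helper names, and the sketch assumes the standard genetic code, forward-strand ORFs only, and that the protein is an exact translation of some region of the transcript.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt):
    """Translate a nucleotide string in frame 0; unknown codons become 'X'."""
    nt = nt.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3)
    )


def find_cds(transcript, protein):
    """Return the CDS in `transcript` that encodes `protein`, or None.

    Assumption: only the three forward frames are searched.
    """
    protein = protein.rstrip("*")
    for frame in range(3):
        pos = translate(transcript[frame:]).find(protein)
        if pos != -1:
            start = frame + 3 * pos
            end = start + 3 * len(protein)
            # Include the stop codon when one directly follows the protein.
            if translate(transcript[end:end + 3]) == "*":
                end += 3
            return transcript[start:end]
    return None
```

For example, `find_cds("GGATGGCTAAATGACC", "MAK")` returns `"ATGGCTAAATGA"`. Applied to each transcript and protein pair, this would give one CDS per protein, including the truncated ones that a complete-ORF filter would drop.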
Hi, @apcamargo
I got it, thank you!
Hi, @apcamargo
Hello, the results I got here only contain protein sequences and no CDS sequences. Can this software output the corresponding CDS sequences?