gpertea / stringtie

Transcript assembly and quantification for RNA-Seq
MIT License

Do StringTie2 transcripts contain sequence from the genome or from the mapped reads? #416

Closed: sanyalab closed this issue 2 months ago

sanyalab commented 5 months ago

Hello,

I am finding it hard to understand how StringTie constructs transcripts from RNA-Seq data. Here is my setup.

I am mapping RNA-Seq reads from GenomeA onto GenomeB using HISAT2. GenomeA and GenomeB are 85-90% identical. I then run StringTie with the GenomeB annotation (GenomeB.GTF) as a guide and obtain an output.GTF file of StringTie transcripts. This is where I am confused. The coordinates in output.GTF refer to GenomeB, because I DO NOT have a GenomeA assembly yet. So the transcript sequences I extract with gffread will contain GenomeB sequence. What I want is for the reads to be assembled into transcripts that contain GenomeA sequence. You describe StringTie as genome-guided, so I used a closely related genome as the guide, but I want GenomeA transcripts.
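
To make the concern concrete, here is a toy Python sketch of coordinate-based sequence extraction, which is my understanding of what happens when transcript sequences are pulled from a genome FASTA using GTF coordinates. It is not gffread's actual code; the function names, the contig, and the exon coordinates are all made up for illustration.

```python
# Toy illustration only: a simplified model of coordinate-based extraction,
# NOT the real gffread implementation. All names and data are hypothetical.

def revcomp(seq):
    """Reverse-complement a DNA string."""
    comp = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(comp)[::-1]


def extract_transcript(genome, chrom, exons, strand="+"):
    """Join exon slices cut from the reference sequence.

    exons: list of (start, end) pairs, 1-based and inclusive (GTF convention).
    The reads are never consulted here; only the reference is.
    """
    seq = "".join(genome[chrom][start - 1:end] for start, end in sorted(exons))
    return revcomp(seq) if strand == "-" else seq


# Hypothetical GenomeB contig and the exon coordinates of one assembled transcript
genome_b = {"chr1": "AAACCCGGGTTTAAACCC"}
exons = [(1, 3), (10, 15)]

transcript = extract_transcript(genome_b, "chr1", exons)
print(transcript)
```

Every base in the result is copied from `genome_b`, so any GenomeA-specific differences present in the reads would not appear in the extracted sequence.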

Maybe I am missing something here. Can you please clarify?

Thanks, Abhijit