wrf / genomeGTFtools

convert various features into a GFF-like file for use in genome browsers

Add the mRNA, cds features into the Stringtie outfiles? #16

Closed Huangyizhong closed 2 years ago

Huangyizhong commented 3 years ago

Hi there. I used StringTie's merge option to combine all of my GTF files. As the screenshot shows, the final file contains only transcript and exon features. [image] Is there a script that can add gene, mRNA and CDS features to this file, so that it looks like the following file? [image] Thanks in advance! Sincerely, Yizhong Huang

wrf commented 3 years ago

Hello, a "normal" GFF would have gene-mRNA-exon-CDS features, with column 9 in the form ID=something;Parent=somethingelse; You would only get CDS features if you know the protein. There is a script in misc, called stringtie_gtf_to_gff3.py, that can convert the "gene_id/transcript_id" format of StringTie into "ID=;Parent=". Otherwise, you should predict the proteins and frames using a program like TransDecoder, and follow all of their steps. Best, WRF
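To illustrate the attribute conversion described above, here is a minimal Python sketch. It is not the actual stringtie_gtf_to_gff3.py script from misc; the function name `gtf_line_to_gff3` and the exon ID scheme are assumptions made for this example. It only rewrites column 9 of transcript and exon lines, and it does not create the gene or CDS features, so the `Parent` of each mRNA points at a gene record that would still have to be added separately.

```python
import re

# Matches GTF attribute pairs such as: gene_id "MSTRG.1";
GTF_ATTR = re.compile(r'(\S+) "([^"]*)";')

def gtf_line_to_gff3(line):
    """Convert one StringTie GTF line to a GFF3-style line.

    Hypothetical helper: returns None for comments, blank lines,
    malformed lines, and feature types other than transcript/exon.
    """
    if line.startswith("#") or not line.strip():
        return None
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return None
    attrs = dict(GTF_ATTR.findall(cols[8]))
    gene_id = attrs.get("gene_id")
    tx_id = attrs.get("transcript_id")
    if cols[2] == "transcript":
        # A transcript becomes an mRNA whose parent is the gene
        cols[2] = "mRNA"
        cols[8] = f"ID={tx_id};Parent={gene_id}"
    elif cols[2] == "exon":
        # An exon's parent is its transcript; the ID scheme is an assumption
        exon_no = attrs.get("exon_number", "")
        cols[8] = f"ID={tx_id}.exon{exon_no};Parent={tx_id}"
    else:
        return None
    return "\t".join(cols)
```

Applied line by line to a merged GTF, this gives the mRNA and exon records in "ID=;Parent=" form. CDS records would still have to come from a protein prediction step such as TransDecoder.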

Huangyizhong commented 3 years ago

Hi, thanks so much for your kind help. Yes, you are right. I used StringTie to merge the EVM model results and the Liftoff results (both for genome annotation), and the EVM model results included the TransDecoder part. All I want to do is add the gene, mRNA and CDS features to the file, so that it looks like the second file.