I have ~2000 transcripts that are not assigned a class code when I run gffcompare. I am comparing gencode.gtf with my stringtie_merged.gtf.
I'm confused because when I produce a transcript count matrix (via StringTie/prepDE.py), there are transcripts that have lots of counts but supposedly aren't present in stringtie_merged.gtf.
E.g. ENST00000632434.1 has lots of read counts in all samples but has an 'NA' class code.
When I grep for ENST00000632434.1 in stringtie_merged.gtf, I get a record with an MSTRG gene ID:
chr5 StringTie transcript 116085026 116293286 1000 + . gene_id "MSTRG.32862"; transcript_id "ENST00000632434.1"; gene_name "COMMD10"; ref_gene_id "ENSG00000145781.8";
When I then look at the MSTRG.32862 transcript in my count matrix, it has very low counts.
If ENST00000632434.1 is present in all samples (as indicated by the transcript count matrix), then why isn't it being given a class code?
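
For anyone who wants to reproduce the lookup described above, here is a minimal Python sketch of the grep step. It pulls the `gene_id` and `transcript_id` out of GTF records so the two IDs can be checked against the count matrix and the gffcompare output. The function names `parse_gtf_attributes` and `find_transcript_records` are made up for illustration, the regex-based attribute parsing is a simplification of the GTF format, and the sample line is the one pasted above.

```python
import re

# Matches GTF attributes of the form: key "value";
ATTRIBUTE_PATTERN = re.compile(r'(\w+) "([^"]*)"')


def parse_gtf_attributes(line):
    """Return the quoted key/value attributes found on one GTF line as a dict."""
    return dict(ATTRIBUTE_PATTERN.findall(line))


def find_transcript_records(lines, transcript_id):
    """Collect the attributes of every record whose transcript_id matches."""
    matches = []
    for line in lines:
        if line.startswith("#"):
            continue  # skip GTF header/comment lines
        attributes = parse_gtf_attributes(line)
        if attributes.get("transcript_id") == transcript_id:
            matches.append(attributes)
    return matches


# The record from stringtie_merged.gtf quoted in the post.
sample_lines = [
    'chr5 StringTie transcript 116085026 116293286 1000 + . '
    'gene_id "MSTRG.32862"; transcript_id "ENST00000632434.1"; '
    'gene_name "COMMD10"; ref_gene_id "ENSG00000145781.8";'
]

records = find_transcript_records(sample_lines, "ENST00000632434.1")
for record in records:
    print(record["transcript_id"], record["gene_id"], record.get("ref_gene_id"))
```

To run it on the real file, pass an open file handle instead of `sample_lines`, e.g. `find_transcript_records(open("stringtie_merged.gtf"), "ENST00000632434.1")`.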