gpertea / stringtie

Transcript assembly and quantification for RNA-Seq
MIT License

Gene read count generated from prepDE.py for DESeq2 has duplicate genes #431

Open FarzanehRah opened 1 month ago

FarzanehRah commented 1 month ago

Hi,

I'm working on a project where the researcher provided me with two read count tables, one for transcripts and one for genes, prepared with HISAT, StringTie, and prepDE.py. I have completed the DESeq2 analysis for the transcript table, but when I ran the same analysis on the gene table, the results were unusual: only 3 DEGs were identified between the two conditions. On inspecting the gene read count table, I noticed duplicated gene names. This duplication is expected for transcripts, but is it also expected for genes? Should I sum the counts for these duplicated genes? Additionally, is it normal to have gene names in the transcript table? There I observed a mix of gene and transcript names in the identifier column.
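To make the question concrete, here is a minimal sketch of what "summing the duplicates" would mean. It assumes the gene table is a CSV whose first column is the gene identifier and whose remaining columns are integer counts per sample. The toy data, the sample names, and the helper `collapse_duplicate_genes` are hypothetical and only for illustration. Whether summing is actually the right treatment is exactly what I am unsure about.

```python
import csv
import io


def collapse_duplicate_genes(rows):
    """Sum the counts of rows that share the same identifier (first column).

    `rows` is a list of lists: a header row followed by data rows.
    Returns the collapsed table and a dict {identifier: number of rows seen}
    for every identifier that appeared more than once.
    """
    header, *records = rows
    totals = {}  # identifier -> summed counts (insertion order is preserved)
    seen = {}    # identifier -> how many rows carried it
    for record in records:
        gene_id = record[0]
        counts = [int(c) for c in record[1:]]
        if gene_id in totals:
            totals[gene_id] = [a + b for a, b in zip(totals[gene_id], counts)]
            seen[gene_id] += 1
        else:
            totals[gene_id] = counts
            seen[gene_id] = 1
    duplicates = {g: n for g, n in seen.items() if n > 1}
    collapsed = [header] + [[g, *c] for g, c in totals.items()]
    return collapsed, duplicates


# Hypothetical toy table standing in for the real gene count matrix.
toy_csv = """gene_id,sampleA,sampleB
GENE1,10,20
GENE2,5,0
GENE1,3,7
"""

rows = list(csv.reader(io.StringIO(toy_csv)))
collapsed, duplicates = collapse_duplicate_genes(rows)

print("duplicated identifiers:", duplicates)
for row in collapsed:
    print(row)
```

Listing the duplicated identifiers first, before collapsing anything, would also show whether the repeated rows are truly the same gene or different loci that happen to share a name.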

Thanks, Farzaneh