Closed by Buuntu 5 years ago
Hi Buuntu,
if you have no splices (i.e. all transcripts are single-exon), you do not need to use the annotations (GTF file) at all. Indeed, you could use a non-splice-aware aligner such as bwa or bowtie instead. If you want to count reads per gene, you can rename all features in your GTF to "exon".
Cheers Alex
How would I add annotations and get meaningful gene names when I get to differential expression analysis without a GTF file? Without one, the transcripts will only have arbitrary names. My FASTA file doesn't have the gene names, since it contains the genome rather than transcripts.
I think bacteria do have some splices (group II introns), just not very many, so maybe STAR would still give a better alignment?
Hi Gabriel,
how are you calculating gene expression? If you are using STAR's --quantMode GeneCounts option, you would need to provide the GTF file. In your GTF file, you need to replace the features in column 3 with "exon" and re-generate the genome.
Cheers Alex
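The column-3 rename suggested above could be scripted roughly as follows. This is only a sketch: the function name `rename_features_to_exon`, the default feature list (taken from the CDS, rRNA and tRNA types mentioned in this thread), and the sample record are illustrative assumptions, not part of STAR.

```python
def rename_features_to_exon(gtf_text, features=("CDS", "rRNA", "tRNA")):
    """Return gtf_text with column 3 set to "exon" for the given feature types."""
    out_lines = []
    for line in gtf_text.splitlines():
        fields = line.split("\t")
        # Leave comments and malformed lines untouched;
        # a GTF record has 9 tab-separated columns.
        if not line.startswith("#") and len(fields) == 9 and fields[2] in features:
            fields[2] = "exon"
            line = "\t".join(fields)
        out_lines.append(line)
    return "\n".join(out_lines) + "\n"


# Hypothetical Prokka-style record, for illustration only.
sample = 'contig_1\tProkka\tCDS\t100\t400\t.\t+\t0\tgene_id "gene_0001";\n'
print(rename_features_to_exon(sample), end="")
```

After writing the modified text to a new GTF file, the genome would need to be re-generated with that file, as noted above.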
I was going to use something like DESeq2 to get the actual gene counts from the alignments. I haven't gotten to that step yet.
For DESeq2, you will need to generate a table of read counts per gene.
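One possible way to build such a table, assuming the counts come from STAR's --quantMode GeneCounts output (one ReadsPerGene.out.tab per sample, with four summary rows starting with `N_` followed by one row per gene), is sketched below. The function name `merge_gene_counts`, the sample names, and the numbers are hypothetical; which count column is appropriate depends on the strandedness of the library.

```python
def merge_gene_counts(samples, column=1):
    """Combine per-sample count text into {gene_id: [count per sample]}.

    samples: dict mapping sample name -> text of a ReadsPerGene.out.tab file.
    column:  0-based column holding the counts (1 = unstranded counts).
    """
    names = list(samples)
    table = {}
    for i, name in enumerate(names):
        for line in samples[name].splitlines():
            fields = line.split("\t")
            # Skip the summary rows (N_unmapped, N_multimapping, ...) and short lines.
            if len(fields) < 4 or fields[0].startswith("N_"):
                continue
            row = table.setdefault(fields[0], [0] * len(names))
            row[i] = int(fields[column])
    return names, table


# Made-up counts for two hypothetical samples.
header = "N_unmapped\t1\t1\t1\nN_multimapping\t0\t0\t0\nN_noFeature\t2\t2\t2\nN_ambiguous\t0\t0\t0\n"
s1 = header + "geneA\t10\t6\t4\ngeneB\t3\t1\t2\n"
s2 = header + "geneA\t7\t5\t2\ngeneB\t0\t0\t0\n"
names, table = merge_gene_counts({"s1": s1, "s2": s2})
for gene, counts in table.items():
    print(gene, *counts, sep="\t")
```

The resulting gene-by-sample table of integer counts is the kind of input DESeq2 expects.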
Solution: check the formatting of the GTF file; it must contain some lines with exon in the 3rd column. Make sure the GTF file is unzipped. If exons are marked with a different word, use --sjdbGTFfeatureExon to specify that word.
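A quick way to see which feature types a GTF actually contains in its 3rd column is to tally them. The sketch below is illustrative only: `count_feature_types` and the sample records are made-up names and data, and it expects plain (unzipped) text lines.

```python
from collections import Counter


def count_feature_types(gtf_lines):
    """Tally the values found in column 3 of an iterable of GTF lines."""
    counts = Counter()
    for line in gtf_lines:
        # Skip comment and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


# Hypothetical records, for illustration only.
example = [
    "# produced by an annotation tool\n",
    'contig_1\tProkka\tCDS\t100\t400\t.\t+\t0\tgene_id "gene_0001";\n',
    'contig_1\tProkka\ttRNA\t500\t575\t.\t-\t.\tgene_id "gene_0002";\n',
]
print(count_feature_types(example))
```

The same function accepts an open file handle, since iterating over a text file yields its lines. If "exon" is missing from the tally, either the features need renaming or --sjdbGTFfeatureExon needs to point at the feature type that is present.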
I'm trying to use STAR to align RNA-seq data to a bacterial genome. Because the annotations were generated with Prokka (https://github.com/tseemann/prokka), there are no exons per se. Maybe the closest thing to an exon in bacteria is a CDS, but I don't want it to leave out the features that are annotated as rRNA or tRNA in my GTF. Maybe using STAR for a bacterial alignment is not even necessary, since bacteria don't have introns, and I could just use something like bowtie2?
I know I can change the --sjdbGTFfeatureExon option, but will I be missing transcripts if I only use the CDS features? This is the exact error I'm getting:
And I confirmed that my GTF file only has CDS, rRNA, and tRNA features.