gpertea / stringtie

Transcript assembly and quantification for RNA-Seq
MIT License
365 stars 76 forks

Can stringtie add UTR to guided transcripts? #333

Open krabapple opened 3 years ago

krabapple commented 3 years ago

I input an unpublished reference assembly to stringtie with the -G option. The reference GTF has no UTR data; in every case the transcript boundaries are coterminous with the 5' and 3' coordinates of the initial and terminal CDS, respectively. I was hoping stringtie could 'reveal' candidate leading and trailing UTR regions based on RNA-Seq read coverage, but this did not happen: in every case where stringtie assembled a transcript guided by a reference, the transcript boundary never extended beyond the reference boundaries, even when there was plentiful RNA-Seq read coverage there (visible with a genome browser).

Is there a way to allow guided stringtie predictions to extend beyond the reference GTF boundaries?
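
For anyone wanting to check this on their own files, here is a minimal sketch (not part of stringtie) that measures how far each transcript extends beyond its CDS span in a GTF. It assumes the file has `transcript` and `CDS` feature rows and a `transcript_id "..."` attribute; the function name `utr_gaps` is hypothetical. Running it on the reference GTF and on the stringtie output would show whether any assembled transcript gained flanking sequence. Note that some GTFs exclude the stop codon from the CDS, which would show up as a 3-base gap.

```python
import re

# Matches the transcript_id attribute in column 9 of a GTF row.
_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def utr_gaps(gtf_lines):
    """Return {transcript_id: (left_gap, right_gap)} in genomic coordinates.

    left_gap  = CDS start - transcript start
    right_gap = transcript end - CDS end
    A value of (0, 0) means the transcript is coterminous with its CDS.
    Transcripts without any CDS rows are skipped.
    """
    tx_span = {}
    cds_span = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF row
        feature, start, end = fields[2], int(fields[3]), int(fields[4])
        match = _TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        tid = match.group(1)
        if feature == "transcript":
            tx_span[tid] = (start, end)
        elif feature == "CDS":
            lo, hi = cds_span.get(tid, (start, end))
            cds_span[tid] = (min(lo, start), max(hi, end))

    gaps = {}
    for tid, (t_start, t_end) in tx_span.items():
        if tid in cds_span:
            c_start, c_end = cds_span[tid]
            gaps[tid] = (c_start - t_start, t_end - c_end)
    return gaps
```

Typical use would be something like `utr_gaps(open("output_stringtie.gtf"))`, then listing the transcripts whose gaps are not `(0, 0)`.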

andreaswallberg commented 2 years ago

Hi @krabapple!

I am interested in the same problem. Did you solve it?

dmworstell commented 1 year ago

I am running into a similar issue, where areas with plenty of RNAseq coverage in IGV are not being properly included in the assembly, even when I make the command parameters hyper-sensitive (realistically, far too sensitive to be useful).

Here's an example where an exon that likely should be in the 5' UTR of the SERPING1 gene isn't being included:

[Image: Screenshot 2022-11-08 at 4 54 25 PM]

This is obtained using the following command: stringtie -o output_stringtie.gtf -G hg38_genes.gtf -v -m 30 -a 5 -t -j 0.01 -c 0.01 -s 1 -M 1 -f 0 -g 0 -u -p 8 filename.bam

This is with stringtie version 2.2.1. The reads were mapped to hg38 with Hisat2 version 2.2.1, and specifically the reads mapping across the junction that is aberrantly not included in the stringtie output include an XS tag ("+" in this case).
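
In case it helps others rule out the strand-tag question on their own data, here is a rough sketch of how the XS check could be done over SAM text. It is only a diagnostic, not a fix, and it assumes the input is SAM-formatted lines (for example, the text output of `samtools view` on the region of interest). The function name `count_spliced_strand_tags` is hypothetical. It counts spliced alignments (CIGAR containing `N`) by their `XS:A:` value.

```python
def count_spliced_strand_tags(sam_lines):
    """Count spliced alignments by XS strand tag.

    Returns a dict with keys "+", "-" and "missing". Only alignments whose
    CIGAR string contains an N operation (a skipped region, i.e. an intron)
    are counted; unspliced alignments are ignored.
    """
    counts = {"+": 0, "-": 0, "missing": 0}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # fewer than the 11 mandatory SAM columns
        cigar = fields[5]
        if "N" not in cigar:
            continue  # not a spliced alignment
        strand = "missing"
        for tag in fields[11:]:  # optional TAG:TYPE:VALUE fields
            if tag.startswith("XS:A:"):
                value = tag[len("XS:A:"):]
                if value in ("+", "-"):
                    strand = value
                break
        counts[strand] += 1
    return counts
```

A non-zero "missing" count for the junction in question would point at the alignments rather than the assembler; if all spliced reads carry the tag, as described above, the cause lies elsewhere.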

Unfortunately, unguided predictions using stringtie have the same problem:

[Image]

Were you able to solve this problem?

ddifraia commented 3 weeks ago

I'm following up on this topic. I know it's a couple of years late, but has anyone actually solved this?