gpertea / stringtie

Transcript assembly and quantification for RNA-Seq
MIT License

Concatenated lines in "-C" output GTF: stringtie v1.3.3b #134

Open ajw2329 opened 7 years ago

ajw2329 commented 7 years ago

Hello,

Thanks for writing such a great tool!

I just encountered a minor issue in which the GTF file output by the "-C" option has a few concatenated lines, e.g.:

chr1 HAVANA exon 98890972 98893104 . - . Parent=ENST00000263177.4chr1 HAVANA mRNA 100266207 100292769 . + . ID=ENST00000370128.8;CDS=100266376-100291502;geneID=ENSG00000137996.12;gene_name=RTCA

(i.e. the entry for the final exon in a transcript is concatenated to the entry for the subsequent transcript)

I am running stringtie with the following command:

stringtie "$i" -o "$outdir"/"$i"_stringtie_ref_only.gtf -G "$gencode_gtf" --rf -p 10 -C "$outdir"/"$i"_stringtie_ref_only_complete_coverage.gtf -e

The GENCODE v26 primary annotation file (gencode.v26.annotation.gtf) is provided to "-G".

Thanks again! Best, Andrew

cbp44 commented 5 years ago

I can confirm this bug, and would really love for it to be fixed since it prevents me from converting the GTF file to bigBed for display in the UCSC Genome Browser. This exact same thing happens to me in v1.3.2 and v1.3.3.

Example command:

stringtie ce10/sample-passing_reads/ko_mutant_rep3.bam -b ce10/sample-stringtie_merged/ko_mutant_rep3 -p 1 -o ce10/sample-stringtie_merged/ko_mutant_rep3.gtf -C ce10/sample-stringtie_merged/ko_mutant_rep3_cov_refs.gtf -G ce10/sample-stringtie_merged/all-samples.gtf -A ce10/sample-stringtie_merged/ko_mutant_rep3_gene_abundance.tab -l ko_mutant_rep3 -e &> logs/ce10/sample-stringtie_merged/ko_mutant_rep3.log

Then, the file ko_mutant_rep3_cov_refs.gtf has several lines like this (retrieved using cat ko_mutant_rep3_cov_refs.gtf | grep -n -C 1 "^;" -)...

9556-chrI       StringTie       transcript      5566281 5570080 1000.00 -       .       ID=NOVEL.1005.1chrI     StringTie       exon    5300850 5301908 1000.00 -       .       Parent=NOVEL.930.1
9557:;geneID=NOVEL.1005
9558-chrI       StringTie       exon    5302304 5302650 1000.00 -       .       Parent=NOVEL.930.1
--
9612-chrI       StringTie       exon    5302862 5303171 1000.00 -       .       Parent=NOVEL.930.4
9613:;geneID=NOVEL.1005;gene_name=pqn-52
9614-chrI       StringTie       exon    5303288 5304351 1000.00 -       .       Parent=NOVEL.930.4chrI  StringTie       exon    5568637 5568747 1000.00 -       .       Parent=NM_059228
--
16769-chrI      StringTie       exon    8431246 8431349 1000.00 -       .       Parent=NOVEL.1789.1
16770:;geneID=NOVEL.1791chrI    StringTie       exon    8431393 8431465 1000.00 -       .       Parent=NOVEL.1789.1
16771-chrI      StringTie       exon    8431529 8431650 1000.00 -       .       Parent=NOVEL.1789.1;gene_name=F10D11.2
--
26381-chrI      StringTie       transcript      11935386        11939629        1000.00 +       .       ID=NOVEL.2880.1chrI     StringTie       exon    11942946        11945762        1000.00 .       .       Parent=NOVEL.2881.1
26382:;geneID=NOVEL.2880
26383-chrI      StringTie       exon    11935386        11939629        1000.00 +       .       Parent=NOVEL.2880.1
--
27757-chrI      StringTie       transcript      13082610        13083110        1000.00 -       .       ID=NOVEL.3108.1;geneID=NOVEL.3108chrI   StringTie       transcript      13101345        13101879        1000.00 +       .       ID=NM_001129064
27758:;geneID=NOVEL.3112;gene_name=Y26D4A.21
27759-chrI      StringTie       exon    13082610        13082896        1000.00 -       .       Parent=NOVEL.3108.1
--
39446-chrII     StringTie       transcript      6517379 6519629 1000.00 -       .       ID=NM_001026871chrII    StringTie       transcript      6527242 6528686 1000.00 -       .       ID=NOVEL.4917.2;geneID=NOVEL.4917
39447:;geneID=NOVEL.4896chrII   StringTie       exon    6527242 6528686 1000.00 -       .       Parent=NOVEL.4917.2
39448:;gene_name=toe-2
39449-chrII     StringTie       exon    6517379 6517489 1000.00 -       .       Parent=NM_001026871
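
For anyone who wants to check their own "-C" output for this, here is a minimal Python sketch that flags suspicious records. It assumes the only thing we can rely on is that a well-formed GTF/GFF record has exactly nine tab-separated columns, so a fused record (17 columns in the examples above) or a stray attribute fragment such as ";geneID=NOVEL.1005" (1 column) stands out. The function name find_malformed_lines is hypothetical and not part of stringtie. This only detects the damage; it does not repair the file.

```python
EXPECTED_COLUMNS = 9  # a GTF/GFF record has nine tab-separated columns


def find_malformed_lines(lines):
    """Return (line_number, column_count) for every record that does not
    have exactly nine tab-separated columns.

    Blank lines and '#' comment lines are skipped.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text or text.startswith("#"):
            continue
        column_count = len(text.split("\t"))
        if column_count != EXPECTED_COLUMNS:
            problems.append((number, column_count))
    return problems
```

To use it, pass an open file handle for the coverage GTF (for example the ko_mutant_rep3_cov_refs.gtf file above) and print the returned pairs. Note the limitation: a record that still has nine columns but carries an attribute belonging to another record (like the ";gene_name=F10D11.2" appended to the exon line above) will not be flagged by a column count alone.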