Open ajw2329 opened 7 years ago
I can confirm this bug, and would really love for it to be fixed since it prevents me from converting the GTF file to bigBed for display in the UCSC Genome Browser. This exact same thing happens to me in v1.3.2 and v1.3.3.
Example command:
stringtie ce10/sample-passing_reads/ko_mutant_rep3.bam -b ce10/sample-stringtie_merged/ko_mutant_rep3 -p 1 -o ce10/sample-stringtie_merged/ko_mutant_rep3.gtf -C ce10/sample-stringtie_merged/ko_mutant_rep3_cov_refs.gtf -G ce10/sample-stringtie_merged/all-samples.gtf -A ce10/sample-stringtie_merged/ko_mutant_rep3_gene_abundance.tab -l ko_mutant_rep3 -e &> logs/ce10/sample-stringtie_merged/ko_mutant_rep3.log
Then the file ko_mutant_rep3_cov_refs.gtf has several lines like this (retrieved using cat ko_mutant_rep3_cov_refs.gtf | grep -n -C 1 "^;" -):
9556-chrI StringTie transcript 5566281 5570080 1000.00 - . ID=NOVEL.1005.1chrI StringTie exon 5300850 5301908 1000.00 - . Parent=NOVEL.930.1
9557:;geneID=NOVEL.1005
9558-chrI StringTie exon 5302304 5302650 1000.00 - . Parent=NOVEL.930.1
--
9612-chrI StringTie exon 5302862 5303171 1000.00 - . Parent=NOVEL.930.4
9613:;geneID=NOVEL.1005;gene_name=pqn-52
9614-chrI StringTie exon 5303288 5304351 1000.00 - . Parent=NOVEL.930.4chrI StringTie exon 5568637 5568747 1000.00 - . Parent=NM_059228
--
16769-chrI StringTie exon 8431246 8431349 1000.00 - . Parent=NOVEL.1789.1
16770:;geneID=NOVEL.1791chrI StringTie exon 8431393 8431465 1000.00 - . Parent=NOVEL.1789.1
16771-chrI StringTie exon 8431529 8431650 1000.00 - . Parent=NOVEL.1789.1;gene_name=F10D11.2
--
26381-chrI StringTie transcript 11935386 11939629 1000.00 + . ID=NOVEL.2880.1chrI StringTie exon 11942946 11945762 1000.00 . . Parent=NOVEL.2881.1
26382:;geneID=NOVEL.2880
26383-chrI StringTie exon 11935386 11939629 1000.00 + . Parent=NOVEL.2880.1
--
27757-chrI StringTie transcript 13082610 13083110 1000.00 - . ID=NOVEL.3108.1;geneID=NOVEL.3108chrI StringTie transcript 13101345 13101879 1000.00 + . ID=NM_001129064
27758:;geneID=NOVEL.3112;gene_name=Y26D4A.21
27759-chrI StringTie exon 13082610 13082896 1000.00 - . Parent=NOVEL.3108.1
--
39446-chrII StringTie transcript 6517379 6519629 1000.00 - . ID=NM_001026871chrII StringTie transcript 6527242 6528686 1000.00 - . ID=NOVEL.4917.2;geneID=NOVEL.4917
39447:;geneID=NOVEL.4896chrII StringTie exon 6527242 6528686 1000.00 - . Parent=NOVEL.4917.2
39448:;gene_name=toe-2
39449-chrII StringTie exon 6517379 6517489 1000.00 - . Parent=NM_001026871
Hello,
Thanks for writing such a great tool!
I just encountered a minor issue in which the GTF file output by the "-C" option has a few concatenated lines, e.g.:
chr1 HAVANA exon 98890972 98893104 . - . Parent=ENST00000263177.4chr1 HAVANA mRNA 100266207 100292769 . + . ID=ENST00000370128.8;CDS=100266376-100291502;geneID=ENSG00000137996.12;gene_name=RTCA
(i.e. the entry for the final exon in a transcript is concatenated to the entry for the subsequent transcript)
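For anyone who wants to check their own "-C" output for this, here is a minimal sketch of a field-count check. It assumes the file is tab-delimited with nine columns per record, as GFF-style files normally are. The function name `find_malformed_lines` is made up for illustration and is not part of StringTie. A fused record ends up with more than nine columns and an orphaned fragment such as ";geneID=..." ends up with fewer, so both kinds get flagged.

```python
def find_malformed_lines(lines, expected_fields=9):
    """Return (line_number, text) pairs for records that do not have
    exactly `expected_fields` tab-separated columns.

    Assumption: the input is tab-delimited GFF-style text, one record
    per line. Blank lines and '#' comment lines are skipped.
    """
    malformed = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\n")
        if not text or text.startswith("#"):
            continue
        # A well-formed record splits into exactly nine columns.
        if len(text.split("\t")) != expected_fields:
            malformed.append((number, text))
    return malformed
```

To use it on a file, something like `with open(path) as handle: print(find_malformed_lines(handle))` should work, where `path` is whatever was passed to "-C". This only reports the damaged lines; it does not try to repair them.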
I am running stringtie with the following command:
stringtie "$i" -o "$outdir"/"$i"_stringtie_ref_only.gtf -G "$gencode_gtf" --rf -p 10 -C "$outdir"/"$i"_stringtie_ref_only_complete_coverage.gtf -e
Where the gencode.v26 primary annotation file (gencode.v26.annotation.gtf) is provided to "-G".
Thanks again! Best, Andrew