Open AmrSaadeldin opened 2 years ago
I updated StringTie to v2.2.1, but now I get new isoforms in -e mode.
Indeed, v2.2.0 had a bug where not all the input transcripts were shown, as you observed. v2.2.1 should have fixed that, but you now seem to report that new isoforms are produced when the -e option is used, which should not happen.
Can you please provide more information about your findings? Are you getting that with long-read data (-L option) or hybrid data (--mix)?
An example dataset to reproduce the issue would be greatly appreciated.
Thank you for sharing the example data. Apparently, with -L -e, StringTie sometimes writes out multiple abundance estimates for the same transcript ID.
[EDIT: the outputs are not duplicated, but rather independent "predictions"]
Debugging note: in one such case with a duplicate single-exon transcript estimate, I see in print_predcluster() three instances of the same transcript ID passed in the pred list. Two of them have 0 coverage and different exon coordinates (a shorter exon, contained within the real input exon); the third instance is the real one, with non-zero coverage and correct exon coordinates.
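For anyone who wants to check their own output for this symptom, here is a minimal sketch that scans a StringTie output GTF for transcript IDs that appear on more than one "transcript" line and lists the coordinates and coverage of each instance. It assumes the usual StringTie output layout (tab-separated GTF with transcript_id and cov attributes on transcript lines); the function names are made up for illustration and are not part of StringTie.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def parse_attributes(field):
    """Turn the 9th GTF column into a {key: value} dict."""
    return dict(ATTR_RE.findall(field))


def find_duplicate_transcripts(gtf_lines):
    """Return {transcript_id: [(start, end, cov), ...]} for every ID
    that occurs on more than one 'transcript' feature line."""
    records = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "transcript":
            continue
        attrs = parse_attributes(cols[8])
        tid = attrs.get("transcript_id")
        if tid is None:
            continue
        # 'cov' is assumed to be present on StringTie transcript lines;
        # fall back to NaN if it is missing.
        cov = float(attrs.get("cov", "nan"))
        records[tid].append((int(cols[3]), int(cols[4]), cov))
    return {tid: recs for tid, recs in records.items() if len(recs) > 1}


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python find_dups.py sample.gtf
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for tid, recs in sorted(find_duplicate_transcripts(handle).items()):
                for start, end, cov in recs:
                    print(f"{tid}\t{start}\t{end}\t{cov}")
```

In the case described above, one would expect three rows for the affected ID: two with zero coverage and shorter coordinates, and one with non-zero coverage. This only reports the duplicates; it does not decide which one is correct.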
Hello, is the duplicate transcript ID bug the reason that some of the samples are showing an abundance value of zero? Please see below. This is not a single-exon transcript.
 [1] 2500.7836 1425.0903    0.0000 2431.1512  480.4571    0.0000  933.6469    0.0000
 [9]    0.0000    0.0000  813.9988 3317.5881  762.0982  778.2828  682.3964    0.0000
[17] 1742.9656  306.2472  654.1184    0.0000  434.4180    0.0000    0.0000  612.1603
[25]    0.0000  350.2606  325.1392    0.0000 1381.3899 1218.0039    0.0000 1082.9444
[33]    0.0000 1149.3413    0.0000    0.0000 1750.8320    0.0000 1201.1724 2905.0471
[41]    0.0000  833.6149
This is using tximport, but prepDE.py3 shows the same pattern: some samples have a value of zero, even when the treatment is the same. This was run with the -L and -e options.
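To see how widespread the pattern is, something like the following could flag treatment groups where a transcript is zero in some samples but not in others. This is only a sketch: the function name, the group labels, and the toy numbers in the demo are all hypothetical, and a mixed zero/non-zero group is merely a hint that something is off, not proof of the bug.

```python
from collections import defaultdict


def mixed_zero_groups(abundances, groups):
    """Return {group: (n_zero, n_total)} for groups in which some,
    but not all, samples have an abundance of exactly zero.

    abundances -- per-sample values for one transcript
    groups     -- treatment label for each sample, same order
    """
    if len(abundances) != len(groups):
        raise ValueError("abundances and groups must have the same length")
    by_group = defaultdict(list)
    for value, group in zip(abundances, groups):
        by_group[group].append(value)
    flagged = {}
    for group, values in by_group.items():
        n_zero = sum(1 for v in values if v == 0.0)
        if 0 < n_zero < len(values):
            flagged[group] = (n_zero, len(values))
    return flagged


if __name__ == "__main__":
    # Toy values and labels, invented for illustration only.
    values = [2500.8, 0.0, 2431.2, 480.5, 0.0, 0.0]
    labels = ["ctrl", "ctrl", "ctrl", "treated", "treated", "treated"]
    for group, (n_zero, n_total) in mixed_zero_groups(values, labels).items():
        print(f"{group}: {n_zero} of {n_total} samples are zero")
```

Running this per transcript over the tximport or prepDE.py3 matrix would give a list of transcripts worth inspecting in the per-sample GTF files.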
Hi,
I am using the latest version (2.2) of StringTie. In expression estimation mode (-e) I lost ~1k transcripts compared to the de novo assembled GTF file. Also, the number of transcripts differs between the -e output files, even though I used the same -G GTF file!
I used the same files with an older version of stringtie and didn't face this problem!
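A quick way to quantify both symptoms reported in this thread (transcripts missing from the -e output, and transcripts that were not in the -G file) is to compare the sets of transcript IDs in the two GTF files. Below is a rough sketch; the function names and file names are placeholders, and it simply collects every transcript_id attribute it finds, so it works whether or not the reference GTF has explicit "transcript" lines.

```python
import re

TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def transcript_ids(gtf_lines):
    """Collect the set of transcript IDs mentioned in a GTF."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        match = TRANSCRIPT_ID_RE.search(line)
        if match:
            ids.add(match.group(1))
    return ids


def compare_transcript_ids(reference_lines, output_lines):
    """Return (missing, extra): IDs only in the reference (-G) GTF,
    and IDs only in the -e output GTF."""
    ref_ids = transcript_ids(reference_lines)
    out_ids = transcript_ids(output_lines)
    return ref_ids - out_ids, out_ids - ref_ids


if __name__ == "__main__":
    import sys

    # Usage (placeholder names): python compare_ids.py reference.gtf sample_e.gtf
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as ref, open(sys.argv[2]) as out:
            missing, extra = compare_transcript_ids(ref, out)
        print(f"missing from -e output: {len(missing)}")
        print(f"not in reference:       {len(extra)}")
```

With -e, both counts would be expected to be zero, so any non-zero count, together with the two files, would make a useful reproduction case.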