Open cryptic0 opened 5 years ago
Hi cryptic0, did you solve your question? I have the same question.
You can additionally use the `-e` and `-B` flags to restrict transcript assembly to those known and present in the reference GTF. However, that still won't get rid of the MSTRG IDs. AFAIK, these internal IDs are assigned when a discovered transcript only partially overlaps the known transcript. If `stringtie` is not 100% certain (by way of full overlap) of the identity of a transcript, it is going to assign the `MSTRG` tag, and these tags are always numbered serially.
The only workaround I know of is to do this manually with Unix `sed` or `awk`. If you search on Biostars, there is some explanation by Geo of this internal ID assignment, and he might chime in here as well.
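For anyone who wants to script the manual workaround, here is a rough sketch in Python rather than `sed`/`awk`. It assumes the merged GTF carries a `ref_gene_id` attribute on transcripts that matched the reference (worth checking in your own output), and the function names `build_id_map` and `restore_gene_ids` are made up for illustration. It only rewrites a `gene_id` when exactly one reference gene maps to it, so novel or ambiguous loci keep their MSTRG IDs.

```python
import re
from collections import defaultdict

# Match gene_id "..." but not the gene_id inside ref_gene_id "...".
GENE_ID = re.compile(r'(?<!ref_)gene_id "([^"]+)"')
# Assumption: matched transcripts in the merged GTF carry this attribute.
REF_GENE_ID = re.compile(r'ref_gene_id "([^"]+)"')


def build_id_map(gtf_lines):
    """Map each merged gene_id to the set of ref_gene_id values seen with it."""
    mapping = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        gene = GENE_ID.search(line)
        ref = REF_GENE_ID.search(line)
        if gene and ref:
            mapping[gene.group(1)].add(ref.group(1))
    return mapping


def restore_gene_ids(gtf_lines):
    """Replace gene_id with the reference ID when the mapping is unambiguous."""
    lines = list(gtf_lines)
    mapping = build_id_map(lines)
    restored = []
    for line in lines:
        gene = None if line.startswith("#") else GENE_ID.search(line)
        if gene:
            refs = mapping.get(gene.group(1), set())
            if len(refs) == 1:
                ref_id = next(iter(refs))
                # Splice the reference ID in place of the MSTRG ID.
                line = line[:gene.start(1)] + ref_id + line[gene.end(1):]
        restored.append(line)
    return restored
```

To use it, read the merged GTF into a list of lines, pass it to `restore_gene_ids`, and write the result to a new file. Loci where several reference genes were merged under one MSTRG ID are left untouched on purpose, since picking one of them automatically would be a guess.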
I have been running through the protocol described in Pertea et al. 2016. The `stringtie` function properly imports the gene and transcript IDs from the reference annotation. However, during the `stringtie --merge` step, it converts them both to MSTRG IDs. Is there a way to avoid this conversion and keep the original reference IDs? I provided the reference GTF during this step using the `-G` flag. Also, I saw some threads on Biostars indicating that one also needs to use the `-l` flag, but all that does is substitute a custom prefix for MSTRG, which is a non-solution. Here is my command line: