Closed Klim314 closed 6 years ago
RATs does not process or interpret the IDs in any way. Any string is used 'as is'. As such, the IDs in your annotation must match exactly those in the quantification files. It is your responsibility to ensure that the same IDs are used across all the analysis steps. Note the section about Annotation Discrepancies in the input vignette. RATs will use the provided annotation as its guide. Any IDs in the annotation that are not matched exactly in the quantifications will be assumed to have 0 expression. Any IDs in your quantifications that do not match the annotation will be ignored completely.
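The matching rule described above can be checked before running the analysis. The following is a minimal sketch in Python, not part of RATs: the function name `compare_ids` and the example IDs are made up for illustration, and it assumes you have already read the ID columns of your annotation and quantification files into lists of strings.

```python
def compare_ids(annotation_ids, quantification_ids):
    """Compare two ID collections by exact string match (hypothetical helper)."""
    annot = set(annotation_ids)
    quant = set(quantification_ids)
    return {
        # IDs present on both sides, character for character.
        "matched": annot & quant,
        # IDs only in the annotation: per the rule above, treated as 0 expression.
        "annotation_only": annot - quant,
        # IDs only in the quantifications: per the rule above, ignored.
        "quantification_only": quant - annot,
    }


# Illustrative, made-up IDs: the annotation keeps ".N" versions, the quantifications do not.
annotation = ["ENST00000000001.3", "ENST00000000002.1"]
quantified = ["ENST00000000001", "ENST00000000002"]

report = compare_ids(annotation, quantified)
for name, ids in report.items():
    print(name, len(ids))
```

If `matched` comes back empty, as it does here, the two files use different ID forms and every annotated transcript would be treated as unexpressed.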
Yes, it is the same "problem" as the one reported for sleuth. From my perspective this is a user error, not a program error. I consider as a liability any code "magic" that assumes a certain ID format and changes the provided IDs to conform to that presumed format. I don't think a program should be taking such initiative, because if the presumption is wrong, then the result will be worthless and the error may go unnoticed. I want RATs to work with any format of ID, including non-official formats, so automatically messing with the provided IDs is not a good idea.
If, however, you did use the same annotation, but Kallisto chopped off the version numbers and thus created the mismatch in the IDs, then I may need to consider adding some optional ID "magic", as it is not really a user error if a third-party program edits the IDs.
It wasn't clear from your question what form of IDs is in your annotation, what form is in your quantifications, and whether the same annotation file was used for quantification and DTU.
Hi! Do you have anything to add to this issue? Did you resolve the problem?
Thanks! Kimon
With Ensembl annotations for kallisto quantifications, RATs will produce solely NA results due to the Ensembl ".N" version numbers.
Looking at the Genes, all genes/transcripts fail to be detected by RATS as follows
Examining the raw data reveals this to be due to the Ensembl gene/transcript version numbers. Stripping the .N suffix resolves this issue.
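For reference, the stripping step can be sketched as below. This is a Python sketch done outside RATs, not a RATs feature. The helper name `strip_version` and the example IDs are made up for illustration, and it assumes that a trailing dot followed only by digits is always a version number, which holds for Ensembl-style IDs but may not hold for other naming schemes.

```python
import re

# Assumption: a version suffix is a final "." followed only by digits, e.g. ".3".
_VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_version(identifier):
    """Return the ID without a trailing ".N" suffix; other IDs pass through unchanged."""
    return _VERSION_SUFFIX.sub("", identifier)


# Illustrative, made-up IDs.
versioned = ["ENST00000000001.3", "ENSG00000000001.12", "custom_id"]
print([strip_version(i) for i in versioned])
```

The same function would have to be applied to every file that carries the IDs, so that the annotation and the quantifications end up in the same form.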
The issue seems similar to the one faced by Pachter's sleuth here: https://github.com/pachterlab/sleuth/issues/58