Closed TomSmithCGAT closed 2 years ago
Hi Tom,
Thanks for that. Indeed, the other way around was fixed recently. I did not think about or implement a fix for the case where the tRX gene is the parent! Thanks for pointing this out.
Hi Tom,
Finally got around to fixing this! I just pushed some new changes to this repo that should fix the issue. I couldn't find any datasets of ours that could recreate the unsplit Gly-CCC example you mentioned above, so if you could pull the latest changes from here and test them, that would be much appreciated. If everything looks in order, I will release the changes in mimseq v1.1.8 next week. I did find another example, in which tRX-Gln-CTG-3 is the parent and is unsplit from tRNA-Gln-CTG-10. With the new fix, the cluster is correctly named tRX-Gln-CTG-3/tRNA10, so hopefully it works for you too.
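For anyone following along, the naming rule described here can be sketched roughly as below. This is only an illustration and not the actual code in mmQuant.py: the function `cluster_name`, the regular expression `GENE_RE`, and the assumption that a same-type child is written as a bare number while a different-type child carries its type prefix are all hypothetical, inferred from the tRX-Gln-CTG-3/tRNA10 example.

```python
import re

# Hypothetical pattern for gene names such as "Homo_sapiens_tRX-Gly-CCC-2":
# optional prefix, gene type (tRNA or tRX), isotype-anticodon, gene number.
GENE_RE = re.compile(
    r"^(?P<prefix>.*?)(?P<type>tRNA|tRX)-(?P<iso>[A-Za-z]+-[A-Z]{3})-(?P<num>\d+)$"
)


def cluster_name(parent, unsplit_children):
    """Build a name for a parent and the children that cannot be split from it.

    Assumption (not taken from the mimseq source): a child of the same gene
    type as the parent is appended as its number only, and a child of a
    different type keeps its type, e.g. "/tRNA10".
    """
    p = GENE_RE.match(parent)
    if p is None:
        raise ValueError(f"unrecognised gene name: {parent}")

    suffixes = []
    for child in unsplit_children:
        c = GENE_RE.match(child)
        if c is None:
            raise ValueError(f"unrecognised gene name: {child}")
        if c["type"] == p["type"]:
            suffixes.append(c["num"])              # same type: number only
        else:
            suffixes.append(c["type"] + c["num"])  # different type: keep type
    return "/".join([parent] + suffixes)


print(cluster_name("tRX-Gln-CTG-3", ["tRNA-Gln-CTG-10"]))
print(cluster_name("Homo_sapiens_tRX-Gly-CCC-2", ["Homo_sapiens_tRNA-Gly-CCC-8"]))
```

Under these assumptions the first call gives tRX-Gln-CTG-3/tRNA10, matching the cluster name above, and the second gives Homo_sapiens_tRX-Gly-CCC-2/tRNA8 instead of the ambiguous Homo_sapiens_tRX-Gly-CCC-2/8.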
All seems to be fixed now. Thanks!
When the 'parent' is a tRX gene and the isodecoders cannot be separately quantified, the isodecoder name is incorrect.
See below for an example ('Homo_sapiens_tRX-Gly-CCC-2' & 'Homo_sapiens_tRNA-Gly-CCC-8') from 3 separate runs of mimseq on different sets of simulated data. I'm just showing the 1st, 2nd and last columns of the respective Isodecoder_counts_raw.txt files.
The first two rows are from two separate runs of mimseq and show what happens when Homo_sapiens_tRX-Gly-CCC-2 and Homo_sapiens_tRNA-Gly-CCC-8 cannot be separately quantified. The isodecoder name is erroneously given as Homo_sapiens_tRX-Gly-CCC-2/8, which appears to denote the isodecoders as Homo_sapiens_tRX-Gly-CCC-2 and Homo_sapiens_tRX-Gly-CCC-8, rather than Homo_sapiens_tRX-Gly-CCC-2 and Homo_sapiens_tRNA-Gly-CCC-8.
The final two lines are from a mimseq run where the two tRNAs are separately quantified, so this issue doesn't arise.
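To make the ambiguity concrete, here is a rough sketch of how a downstream script might expand a cluster name back into its member genes. It is not part of mimseq: `expand_cluster` and `GENE_RE` are hypothetical names, and the rule that a bare numeric suffix inherits the parent's gene type is my assumption about how a reader (or parser) would interpret the name.

```python
import re

# Hypothetical pattern for gene names such as "Homo_sapiens_tRX-Gly-CCC-2".
GENE_RE = re.compile(
    r"^(?P<prefix>.*?)(?P<type>tRNA|tRX)-(?P<iso>[A-Za-z]+-[A-Z]{3})-(?P<num>\d+)$"
)


def expand_cluster(name):
    """Expand a cluster name such as "X-2/8" into a list of full gene names.

    Assumption: a suffix without a gene type ("8") is read as having the same
    type as the parent; a suffix with a type ("tRNA8") overrides it.
    """
    parent, *suffixes = name.split("/")
    p = GENE_RE.match(parent)
    if p is None:
        raise ValueError(f"unrecognised gene name: {parent}")

    members = [parent]
    for suffix in suffixes:
        s = re.fullmatch(r"(tRNA|tRX)?(\d+)", suffix)
        if s is None:
            raise ValueError(f"unrecognised suffix: {suffix}")
        gene_type = s.group(1) or p["type"]  # bare number inherits parent type
        members.append(f"{p['prefix']}{gene_type}-{p['iso']}-{s.group(2)}")
    return members


print(expand_cluster("Homo_sapiens_tRX-Gly-CCC-2/8"))
print(expand_cluster("Homo_sapiens_tRX-Gly-CCC-2/tRNA8"))
```

With this reading, Homo_sapiens_tRX-Gly-CCC-2/8 expands to two tRX genes, which is the misinterpretation described above, whereas a suffix that keeps the gene type recovers Homo_sapiens_tRNA-Gly-CCC-8.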
When this occurs in the opposite direction (when the parent is a tRNA and the child is a tRX), it's handled in https://github.com/nedialkova-lab/mim-tRNAseq/blob/7c1cd62bf3a30ea45425d5cccb6ce274697364c3/mimseq/mmQuant.py#L667 so I guess that's where this needs to be rectified too?