Closed arq5x closed 7 years ago
While the rsId could be used to grab genome coordinates, only 277 of 321 have an rsId assigned. Adding the explicit genome coordinate would be very helpful!
I have attached a table with the transcript IDs used throughout this manuscript (tab-delimited text). I'll also add it to the git repo. Sorry for this oversight, James
In line 283 of LQTS_variants.txt, you are missing the transcript HGVS notation for the variant. You only provide the protein coordinate. Any chance of a fix? To clarify, it would be in the field you call "Coding".
Good spot. That one looks like an odd case. Much of the data in these analyses came from HGMD, but this variant does not have an HGMD accession, so we think it came from an LSDB. We have a citation to this paper (http://emboj.embopress.org/content/18/15/4137.long), but that appears to be a functional study rather than a report of an observation in humans. I apologise that we can't retrieve the primary source for the association with disease in humans.
There doesn't appear to be a single-nucleotide substitution that produces p.E43N. If it is a real variant, it would have to be an indel replacing the complete codon, GAG to AAT or AAC: c.127_129delGAGinsAAT or c.127_129delGAGinsAAC.
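The codon reasoning above can be checked with a rough sketch like the one below. It assumes the standard genetic code (Glu = GAA/GAG, Asn = AAT/AAC) and that codon 43 corresponds to c.127_129, as stated in the comment. The names `hamming`, `changes` and `single_step` are made up for illustration and are not part of the repo.

```python
# Standard genetic code: codons for glutamate (E) and asparagine (N).
GLU_CODONS = ["GAA", "GAG"]
ASN_CODONS = ["AAT", "AAC"]


def hamming(a, b):
    """Count the positions at which two equal-length codons differ."""
    return sum(x != y for x, y in zip(a, b))


# Number of nucleotide changes needed for every Glu -> Asn codon pair.
changes = {(src, dst): hamming(src, dst)
           for src in GLU_CODONS
           for dst in ASN_CODONS}

# Pairs reachable by a single-nucleotide substitution (expected: none).
single_step = [pair for pair, n in changes.items() if n == 1]

for (src, dst), n in sorted(changes.items()):
    print(f"{src} -> {dst}: {n} nucleotide change(s)")
print("single-substitution routes:", single_step)
```

Every pair differs at the first and third codon positions, so no single substitution gives E to N, which is consistent with describing the change across the whole codon.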
Given the ambiguity I won't commit this data to the repo, but hope that this helps clarify.
Do you happen to have the genome coordinates (and genome build) for the LQTS benign and pathogenic variants you used? Currently, all that is reported in Table S1 is a partial HGVS description of the coding position of each variant. However, this can only be placed unambiguously on the reference genome if one also knows the transcript to which the coding position refers. This information is missing from Table S1. Updating the file with reference genome coordinates would make your work much more accessible and allow others to use your results more broadly. Could you do this?