Closed 47KW closed 3 years ago
Agree - and so does the transcript table, actually. That variant overview box pulls from ENSEMBL's mapping to RefSeq, which is imperfect (especially for hg19).
I had a call from @47KW, so can provide some more detail.
Part of the issue is the usual one with the ENSEMBL-to-RefSeq mapping. This will resolve when we switch to hg38, with higher-quality transcript correlation, MANE, etc.
But the other part is a couple of variant and case general report readability issues.
[x] General report formatting: we should be clearer about the origin of the "RefSeq" transcript number in there. As it is now, the variant that is pinned in this case looks good in the transcript table. It also gets the canonical coordinate on the case page when pinned, since we use the hgvs there. In the report we use what VEP and ENSEMBL think is the best RefSeq mapping, but we just show the ENSEMBL-to-RefSeq transcript number without mentioning that this is what it is.
Suggestions include that in the report we could 1) use the actual RefSeq transcript from the annotation, 2) note that the transcript name is really a best guess from ENSEMBL, or 3) use the gene symbol + hgvs like in `pretty_link_variant`.
[ ] In the transcript overview we could consider 1) using the one from the "real" RefSeq transcript, or flagging if they differ, or 2) at least showing the ENSEMBL ids in solid black and the NM ones as gray guesstimates.
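To make the labelling and "flag if they differ" suggestions above a bit more concrete, here is a minimal sketch. It is not Scout's actual code: the `Transcript` record, its field names, and both helper functions are hypothetical and only illustrate keeping the hgvs tied to the ENSEMBL transcript while marking the RefSeq ID as a best guess.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transcript:
    """Hypothetical, simplified record; not Scout's actual transcript model."""
    gene_symbol: str
    ensembl_id: str
    coding_hgvs: str                      # relative to the ENSEMBL transcript
    mapped_refseq: Optional[str] = None   # best guess from the ENSEMBL-to-RefSeq mapping
    native_refseq: Optional[str] = None   # ID from a native RefSeq annotation, if available


def _unversioned(refseq_id: str) -> str:
    """Drop a trailing version suffix, e.g. 'NM_000000.1' -> 'NM_000000'."""
    return refseq_id.split(".")[0]


def refseq_differs(tx: Transcript) -> bool:
    """True only when both IDs are known and disagree (ignoring versions)."""
    if tx.mapped_refseq is None or tx.native_refseq is None:
        return False
    return _unversioned(tx.mapped_refseq) != _unversioned(tx.native_refseq)


def report_label(tx: Transcript) -> str:
    """Build a report label that states where the RefSeq ID comes from."""
    label = f"{tx.gene_symbol} {tx.ensembl_id}:{tx.coding_hgvs}"
    if tx.mapped_refseq:
        label += f" (RefSeq best guess via ENSEMBL: {tx.mapped_refseq})"
    if refseq_differs(tx):
        label += " [differs from native RefSeq annotation]"
    return label
```

The coding hgvs stays attached to the ENSEMBL ID on purpose, since a coordinate computed on one transcript is not guaranteed to hold on a differently mapped one.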
A third option, potentially doing a lot of work for something that will anyway likely resolve with changing reference:
Working on this atm, I might have some questions as I go
It might take some time, as I want to test it locally, but the EBI FTP service that we use to download the gene definitions in Scout is down: ftp://ftp.ebi.ac.uk/pub/databases/genenames/new/tsv/hgnc_complete_set.txt
OK, so, @northwestwitch made a great start to this. The report will now clearly indicate that the transcript is actually ENSEMBL-derived. The transcript overview will also be much neater. We have a tentative plan for overruling VEP and having the native RefSeq transcript flagged as the primary one. But that will require another level of work, and be better done after some well-earned holiday rest! To remember it, I'll reopen this, but reflag it from "bug" - as the most dangerous part is fixed now - to "enhancement", as getting the native RefSeq transcript, with the proper hgvs aa change, flagged primary will be lovely.
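A rough sketch of what that tentative plan could look like, purely as an illustration. The dictionary keys `is_native_refseq` and `is_vep_canonical` and the function `pick_primary` are made up for this example and are not fields or functions in Scout or VEP.

```python
from typing import Optional


def pick_primary(transcripts: list[dict]) -> Optional[dict]:
    """Pick the transcript to flag as primary.

    Assumed priority (hypothetical): a transcript carrying a native RefSeq
    annotation wins over VEP's own pick; otherwise fall back to the first
    transcript, or None if the list is empty.
    """
    for key in ("is_native_refseq", "is_vep_canonical"):
        for tx in transcripts:
            if tx.get(key):
                return tx
    return transcripts[0] if transcripts else None


# Usage with the two IDs from this issue; the flags are invented for the example.
candidates = [
    {"id": "ENST00000452699", "is_vep_canonical": True},
    {"id": "NM_022089", "is_native_refseq": True},
]
primary = pick_primary(candidates)
```

With these invented flags, the natively annotated RefSeq entry is chosen even though VEP marked the ENSEMBL transcript as canonical.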
I'm going to close this since we reached an agreement on the transcripts that are shown on variants pages and the report. Otherwise let's reopen it.
https://scout.scilifelab.se/cust002/F0037125/cce4d3ad01f80405137e886f6b26f519
The position in HGNC transcript NM_022089 should be c.2978, not the one below, which is what Scout shows.
| ATP13A2 | NM_022089 (ENST00000452699) (hgnc-primary) | c.2963C>T