Clinical-Genomics / scout

VCF visualization interface
https://clinical-genomics.github.io/scout
BSD 3-Clause "New" or "Revised" License

wrong transcript/connection on case report #2259

Status: Closed (47KW closed this 3 years ago)

47KW commented 3 years ago

https://scout.scilifelab.se/cust002/F0037125/cce4d3ad01f80405137e886f6b26f519

The position in the HGNC primary transcript NM_022089 should be c.2978, not c.2963 as Scout shows below.

  | ATP13A2 | NM_022089 (ENST00000452699) (hgnc-primary) | c.2963C>T

dnil commented 3 years ago

Agreed, and so does the transcript table, actually. That variant overview box pulls from ENSEMBL's mapping to RefSeq, which is imperfect (especially for hg19).
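To make the mismatch concrete, here is a minimal sketch of how a coding position reported in one view could be compared against the position expected from another. The helper names `coding_position` and `positions_agree` are hypothetical and not part of Scout. The regular expression only handles simple positive exonic positions such as `c.2963C>T`, not intronic offsets or UTR coordinates.

```python
import re

# Captures the leading coding position of a simple HGVS c. description,
# e.g. "c.2963C>T" -> 2963. Intronic offsets ("c.88+2T>G") and UTR
# positions ("c.-14G>C", "c.*32A>G") are deliberately not handled here.
HGVS_C_POSITION = re.compile(r"c\.(\d+)")


def coding_position(hgvs_c: str) -> int:
    """Return the integer coding position from a simple HGVS c. string."""
    match = HGVS_C_POSITION.match(hgvs_c)
    if match is None:
        raise ValueError(f"Unsupported HGVS c. description: {hgvs_c!r}")
    return int(match.group(1))


def positions_agree(overview_hgvs_c: str, expected_position: int) -> bool:
    """Check whether the overview's position matches the expected one."""
    return coding_position(overview_hgvs_c) == expected_position


# Values taken from this issue: the overview box shows c.2963C>T,
# while the reporter expects position 2978 on NM_022089.
overview_value = "c.2963C>T"
expected = 2978
mismatch = not positions_agree(overview_value, expected)
```

With the values from this issue, `mismatch` is true, which is the discrepancy being reported.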

[Screenshot, 2020-12-15 at 12:57:43]

dnil commented 3 years ago

I had a call from @47KW, so can provide some more detail.

Part of the issue is the usual one with the ENSEMBL to RefSeq mapping. This will resolve when we switch to hg38, with higher-quality transcript correlation, MANE, etc.

The other part is a couple of readability issues in the variant view and the general case report.

dnil commented 3 years ago

A third option, potentially a lot of work for something that will likely resolve anyway once we change reference:

northwestwitch commented 3 years ago

Working on this atm, I might have some questions as I go

northwestwitch commented 3 years ago

It might take some time, as I want to test it locally, but the EBI FTP service that we use to download the gene definitions in Scout is down: ftp://ftp.ebi.ac.uk/pub/databases/genenames/new/tsv/hgnc_complete_set.txt
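For local testing while the download is unavailable, a small hand-made stand-in for the file can be parsed the same way. The sketch below is not Scout's loader. It assumes the HGNC file is tab-separated with a header row, that it has `symbol` and `refseq_accession` columns, and that multi-valued fields are separated by `|`. The sample row is built only from values mentioned in this issue, not copied from the real file.

```python
import csv
import io

# Illustrative stand-in for hgnc_complete_set.txt, NOT real file content.
# Only two of the (assumed) columns are included.
SAMPLE_TSV = "symbol\trefseq_accession\nATP13A2\tNM_022089\n"


def refseq_by_symbol(handle):
    """Map each gene symbol to its list of RefSeq accessions."""
    reader = csv.DictReader(handle, delimiter="\t")
    mapping = {}
    for row in reader:
        # Missing or empty cells become an empty list of accessions.
        raw = row.get("refseq_accession") or ""
        mapping[row["symbol"]] = [acc for acc in raw.split("|") if acc]
    return mapping


mapping = refseq_by_symbol(io.StringIO(SAMPLE_TSV))
```

Pointing `refseq_by_symbol` at an open handle of a previously downloaded copy should work the same way, provided the column names match the assumptions above.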

dnil commented 3 years ago

OK, so @northwestwitch made a great start on this. The report will now clearly indicate that the transcript is actually ENSEMBL-derived, and the transcript overview will be much neater. We have a tentative plan for overruling VEP and flagging the native RefSeq transcript as the primary one, but that will require another level of work and is better done after some well-earned holiday rest! So that we remember it, I'll reopen this but reflag it from "bug" (the most dangerous part is fixed now) to "enhancement", since getting the native RefSeq transcript, with the proper HGVS aa change, flagged as primary would be lovely.
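As a rough illustration of the tentative plan, the sketch below prefers a native RefSeq transcript over an ENSEMBL-mapped one when choosing which to flag as primary. It is not Scout's implementation. The `Transcript` class, the `origin` labels, and `pick_primary` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Transcript:
    transcript_id: str
    # Hypothetical labels: "refseq_native" for a transcript annotated
    # directly against RefSeq, "ensembl_mapped" for one reached through
    # the ENSEMBL to RefSeq mapping.
    origin: str
    hgvs_c: Optional[str] = None


# Earlier entries win: native RefSeq overrules the ENSEMBL-derived mapping.
ORIGIN_PRIORITY = ("refseq_native", "ensembl_mapped")


def pick_primary(transcripts: List[Transcript]) -> Optional[Transcript]:
    """Return the first transcript with the most preferred origin, if any."""
    for wanted in ORIGIN_PRIORITY:
        for transcript in transcripts:
            if transcript.origin == wanted:
                return transcript
    return None


# Identifiers from this issue. The native entry carries no HGVS string
# here because the thread only states the expected position.
candidates = [
    Transcript("ENST00000452699", "ensembl_mapped", "c.2963C>T"),
    Transcript("NM_022089", "refseq_native"),
]
primary = pick_primary(candidates)
```

In this example `primary` is the `NM_022089` entry even though the ENSEMBL-mapped transcript is listed first.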

northwestwitch commented 3 years ago

I'm going to close this since we reached an agreement on the transcripts that are shown on variants pages and the report. Otherwise let's reopen it.