varfish-org / mehari

VEP-like tool for sequence ontology and HGVS annotation of VCF files
MIT License

unannotated MT variants for RefSeq #441

Open stolpeo opened 2 months ago

stolpeo commented 2 months ago

Describe the bug

We noticed that some MT variants are annotated with Ensembl information while their RefSeq information is empty. We need to double-check why this is. For example:

chrMT:m.11778G>A

holtgrewe commented 2 months ago

RefSeq does not have chrMT transcripts. In this case we should also write out the Ensembl transcripts to the RefSeq column...

Further, our transcript (tx) data is not yet based on the latest cdot release.

This is the Reev output.

https://reev.cubi.bihealth.org/seqvar/grch37-MT-11778-G-A?orig=GRCh37-MT-11778-G-A

We need an update to the tx data and an update to mehari so that it also writes out Ensembl transcripts for RefSeq.

holtgrewe commented 2 months ago

It may be counterintuitive, but chrMT variants are annotated on m. coordinates only. Note that in Jannovar we annotated these with the gene name as the symbol.

https://hgvs-nomenclature.org/stable/background/refseq/

ClinVar agrees:

https://www.ncbi.nlm.nih.gov/clinvar/RCV000144018/
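To illustrate the m. notation mentioned above, here is a minimal Python sketch that formats a mitochondrial single-nucleotide substitution the way the example in this issue is written. The function name `hgvs_mito` is hypothetical and is not part of mehari (which is written in Rust); the sketch only covers simple substitutions and omits the reference sequence accession.

```python
def hgvs_mito(pos: int, ref: str, alt: str) -> str:
    """Format a single-nucleotide substitution in HGVS m. notation.

    Hypothetical helper for illustration only; insertions, deletions
    and the reference sequence prefix are deliberately left out.
    """
    if len(ref) != 1 or len(alt) != 1:
        raise ValueError("this sketch only handles single-nucleotide substitutions")
    return f"m.{pos}{ref}>{alt}"


# The variant from this issue: chrMT, position 11778, G to A.
print(hgvs_mito(11778, "G", "A"))  # m.11778G>A
```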

We can try to get SACGF/cdot to annotate this, or hand-curate chrMT transcripts for RefSeq.

A first solution would be to write out all transcripts to the RefSeq column, even ENST ones. This can be implemented as a bug fix, by @tedil for example. It should be almost trivial.
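A rough Python sketch of one narrow variant of this fallback (fall back to Ensembl only when no RefSeq transcript overlaps), to make the idea concrete. This is not mehari's actual code: mehari is written in Rust, and the names `Transcript`, `is_refseq` and `transcripts_for_refseq_column` are hypothetical. It also assumes that RefSeq transcripts can be recognized by their accession prefix, and the ENST accession below is a placeholder, not a real transcript ID.

```python
from dataclasses import dataclass
from typing import List

# Assumption: RefSeq transcript accessions start with one of these prefixes.
REFSEQ_PREFIXES = ("NM_", "NR_", "XM_", "XR_")


@dataclass(frozen=True)
class Transcript:
    accession: str    # e.g. "NM_..." (RefSeq) or "ENST..." (Ensembl)
    gene_symbol: str


def is_refseq(tx: Transcript) -> bool:
    return tx.accession.startswith(REFSEQ_PREFIXES)


def transcripts_for_refseq_column(overlapping: List[Transcript]) -> List[Transcript]:
    """Pick the transcripts to report in the RefSeq column.

    Prefer RefSeq transcripts; if none overlap the variant (as on chrMT),
    fall back to whatever transcripts are available, including ENST ones.
    """
    refseq = [tx for tx in overlapping if is_refseq(tx)]
    return refseq if refseq else list(overlapping)


# chrMT case: only an Ensembl transcript overlaps (placeholder accession).
mt_hits = [Transcript("ENST00000000000", "MT-ND4")]
print([tx.accession for tx in transcripts_for_refseq_column(mt_hits)])
```

With this rule, non-MT variants that do have RefSeq transcripts are reported unchanged, and only the otherwise empty RefSeq column is filled from Ensembl.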

holtgrewe commented 2 months ago

https://github.com/varfish-org/mehari-data-tx/pull/37

That's the best I can do on mobile.

holtgrewe commented 2 months ago

OK, I put the data release on automerge.

https://github.com/varfish-org/mehari-data-tx/pull/36

This should hopefully enable prediction for all chrMT transcripts. Please try it out.

@gromdiom could you deploy the new mehari-data-tx release to reev?