Closed by shizipo 1 month ago
Hey @shizipo!
I don't think this is possible out of the box with HGVS. However, the VRS-Python library (which is biocommons-adjacent) includes a translator
module that can ingest gnomAD-style variant descriptions and output them as HGVS strings (it uses the HGVS library under the hood). See https://github.com/ga4gh/vrs-python/blob/main/notebooks/getting_started/4_Exploring_the_AlleleTranslator.ipynb for more.
Code snippet from @korikuzma:
from biocommons.seqrepo import SeqRepo
from ga4gh.vrs.extras.translator import AlleleTranslator
from ga4gh.vrs.dataproxy import SeqRepoDataProxy
sr = SeqRepo(root_dir="/usr/local/share/seqrepo/latest")
seqrepo_dataproxy = SeqRepoDataProxy(sr)
allele_translator = AlleleTranslator(data_proxy=seqrepo_dataproxy)
gnomad_vcf = "15-49716528-A-G"
vo = allele_translator.translate_from(gnomad_vcf)
print(allele_translator.translate_to(vo, "hgvs"))
# ['NC_000015.10:g.49716528A>G']
Hi, it's reasonably straightforward to convert this into a genomic (g.) HGVS string:
from hgvs.dataproviders import uta
from hgvs.extras.babelfish import Babelfish
hdp = uta.connect()
bf = Babelfish(hdp, 'GRCh37')
gnomad_vcf = "15-49716528-A-G"
chrom, position, ref, alt = gnomad_vcf.split("-")
hgvs_g = bf.vcf_to_g_hgvs(chrom, int(position), ref, alt)
print(hgvs_g)
Output:
NC_000015.9:g.49716528A>G
The next step requires knowing which transcript is the MANE Select. This is not currently available in HGVS, but I have raised an issue for it; see #747.
You could do this yourself fairly easily, though, by downloading the MANE CSV and comparing transcripts against it.
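A minimal sketch of that comparison, in case it helps. It assumes the MANE summary file is tab-separated and has `RefSeq_nuc` and `MANE_status` columns; check the header of the release you download, since these names are my assumption and not something the HGVS library defines. The helper names and the sample rows are made up for illustration.

```python
import csv
import io

# Assumed column names for the MANE summary file; verify against the real header.
REFSEQ_COL = "RefSeq_nuc"
STATUS_COL = "MANE_status"


def load_mane_select(handle):
    """Return the set of RefSeq transcript accessions flagged as MANE Select."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row[REFSEQ_COL]
        for row in reader
        if row[STATUS_COL] == "MANE Select"
    }


def pick_mane_select(transcripts, mane_select, ignore_version=False):
    """Keep only the transcripts that appear in the MANE Select set."""
    if ignore_version:
        # Compare accessions without the ".N" version suffix.
        bases = {acc.split(".")[0] for acc in mane_select}
        return [tx for tx in transcripts if tx.split(".")[0] in bases]
    return [tx for tx in transcripts if tx in mane_select]


# Made-up rows for illustration only; these are not real MANE records.
sample = io.StringIO(
    "symbol\tRefSeq_nuc\tMANE_status\n"
    "GENE_A\tNM_900001.2\tMANE Select\n"
    "GENE_A\tNM_900002.1\tMANE Plus Clinical\n"
    "GENE_B\tNM_900003.4\tMANE Select\n"
)
mane = load_mane_select(sample)

# Hypothetical candidate transcripts overlapping a variant.
candidates = ["NM_900001.2", "NM_900002.1", "NM_900003.3"]
print(pick_mane_select(candidates, mane))
print(pick_mane_select(candidates, mane, ignore_version=True))
```

With a real file you would pass an open file handle (for example from `gzip.open(path, "rt")`) instead of the `StringIO` sample. The candidate list could come from something like `AssemblyMapper.relevant_transcripts` in HGVS. The `ignore_version` switch is there because the transcript versions in your data source may lag behind the MANE release.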
Hope this answers your question.
Something like "https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg19/15-49716528-A-G/mane_select/"?
That returns "NC_000015.9(NM_001330293.1):c.911-1617T>C".
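For anyone who wants to script that request, here is a rough sketch using only the standard library. The URL layout is copied from the link above. The function names are hypothetical, and sending an `Accept: application/json` header is my assumption about how to ask the service for JSON. I have not assumed anything about the response layout, so the sketch only parses the body.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://rest.variantvalidator.org/VariantValidator/variantvalidator"


def build_url(genome_build, variant, select_transcripts="mane_select"):
    """Assemble the endpoint URL in the same shape as the link quoted above."""
    variant = urllib.parse.quote(variant, safe="")
    return f"{BASE_URL}/{genome_build}/{variant}/{select_transcripts}/"


def fetch(url, timeout=30):
    """GET the URL and parse the body as JSON.

    Assumes the service honours the Accept header; adjust if it does not.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


url = build_url("hg19", "15-49716528-A-G")
print(url)
# Call fetch(url) to perform the request; it is not called here to avoid
# a network round trip when the snippet is merely imported or pasted.
```

Calling `fetch(url)` returns the parsed response. Inspect its keys before relying on any particular field.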