Closed Mchhu closed 9 months ago
Hi, this is due to a bug in pyhgvs, see https://github.com/counsyl/hgvs/issues/50
I made a pull request in March 2020 but it hasn't been accepted, there is no activity on the project and I think it has been abandoned. Using my code, I get:
In [4]: pyhgvs.parse_hgvs_name('NM_021619.3:c.1071_1076dup', genome, get_transcript=factory.get_transcript_grch37)
Out[4]: ('NC_000009.11', 133556992, 'T', 'TCGCCGC')
In [5]: pyhgvs.parse_hgvs_name('NM_181523.3:c.1299+80_1425+64dup', genome, get_transcript=factory.get_transcript_grch37)
Out[5]:
('NC_000005.9',
67589388,
'A',
'AGATGAATTACATGTAATCAAGTCTAAAAAACTTGACACTCGTAATTACATAATTGCAATTTTAAAGATGTTTCCATGTCAGCTATTTTGTTAAACAATTGTTATTTGATTAAATACCTTATCCATTGAATTTATTTTAATCTTTCTAGGATCAAGTTGTCAAAGAAGATAATATTGAAGCTGTAGGGAAAAAATTACATGAATATAACACTCAGTTTCAAGAAAAAAGTCGAGAATATGATAGATTATATGAAGAATATACCCGCACATCCCAGGTGAGTTTTCTATGAAAATCAGATTAAAAAATAAGAGTTCTAAACTTTTAAAGACTAACATG')
I think your options are:
python3 -m pip install git+https://github.com/SACGF/hgvs#egg=pyhgvs
- warning: I have abandoned my fork as I have moved to Biocommons HGVS
Using the example code to convert HGVS to VCF, when I put in a HGVS transcript with a duplication, the output for the ref and alt is incorrect. I have tested this with other HGVS nomenclature and this seems to only be affect variants that has a duplication. Below are a couple of examples with the input and outputs. I'm not sure what the problem is here.
Input: NM_021619.3:c.1071_1076dup Output: ('NC_000009.11', 133557022, 'C', 'C')
Input: NM_181523.3:c.1299+80_1425+64dup Output: ('NC_000005.9', 67589390, 'A', 'A')