ClinGen / gene-and-variant-curation-tools

ClinGen's gene and variant curation interfaces (GCI & VCI). Developed by Stanford ClinGen team.
https://curation.clinicalgenome.org/
MIT License

Support individual entries for ClinVar haplotypes in the VCI #264

Open cgpreston opened 1 year ago

cgpreston commented 1 year ago

There are a few haplotypes in ClinVar that VCEPs (generally monogenic ones) wish to curate. The current ClinVar model points equally to the ID for the haplotype and the ID for the first variant. In the VCI, this manifests as the haplotype being merged with the individual variants, which blocks the VCEPs from curating those variants.

Discussed solution on the Variant Curation Working Group Call 11/4/22:

  1. Develop an SOP for curation strategies for these variants (Jenny G is looking into this).
  2. If a separate entry is desired for the haplotype (from the individual variants) in the VCI based on the finalized SOP, talk to ClinVar to see if the ClinVar data model could be modified such that they could provide distinct variant IDs so the VCI could have 3 entries for a diploid in cis variant (one for the haplotype, one for each variant).
    • The SOP will need to note that the VCI will NOT be creating a 'haplotype curation' experience; rather, it will only allow the haplotype to be an individual entry on the dashboard, with no other curation support.
  3. Check with the ERepo to see if they will have any issues with publishing these.
  4. Given that there are very few haplotypes that can be curated using the VCI and following the standard ACMG guidelines used for individual variants, it may make the most sense to have groups do these manually.

Notes from Gloria on the technical issues with the ClinVar response:

Example: Haplotype HGVS is NM_000152.5(GAA):c.[752C>T;761C>T].

| Variant    | 752C>T    | 761C>T    | [752C>T;761C>T] |
|------------|-----------|-----------|-----------------|
| ClinVar ID | 325781    | 325782    | 1321358         |
| CAR ID     | CA8815007 | CA8815009 | CA2573102892    |

When a user enters two different CAIDs in the VCI, CA8815007 and CA2573102892, the Allele Registry returns a different ClinVar ID for each. But when ClinVar data is fetched using those ClinVar IDs, the same CAID, CA8815007, is returned for both. Since the VCI takes ClinVar data over Allele Registry data, CA8815007 is the ID that gets used and saved.
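The precedence problem described above can be sketched in a few lines. This is only an illustration, not the VCI's actual code: the two dictionaries hold the IDs reported in this issue, and `resolve_saved_caid` is a hypothetical stand-in for the lookup, assuming the ClinVar cross-reference always overrides the CAID the user entered.

```python
# IDs copied from this issue; the lookup logic is an assumed simplification.

# CAID -> ClinVar Variation ID, as reported by the Allele Registry
ALLELE_REGISTRY = {
    "CA8815007": "325781",
    "CA2573102892": "1321358",
}

# ClinVar Variation ID -> ClinGen CAID found in the ClinVar record
CLINVAR_XREF = {
    "325781": "CA8815007",
    "1321358": "CA8815007",
}


def resolve_saved_caid(entered_caid):
    """Hypothetical lookup: ClinVar data takes precedence over Allele Registry data."""
    clinvar_id = ALLELE_REGISTRY[entered_caid]
    # If ClinVar supplies a CAID, it replaces the one the user entered.
    return CLINVAR_XREF.get(clinvar_id, entered_caid)


saved = {caid: resolve_saved_caid(caid) for caid in ALLELE_REGISTRY}
for entered, stored in saved.items():
    print(f"entered {entered} -> saved {stored}")
```

Under these assumptions both entries are saved under CA8815007, which is how the haplotype ends up merged with the first variant.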

https://reg.genome.network/redmine/projects/registry/genboree_registry/by_caid?caid=CA8815007 has ClinVar Id 325781

https://reg.genome.network/redmine/projects/registry/genboree_registry/by_caid?caid=CA2573102892 has ClinVar Id 1321358

Fetching data by ClinVar ID with the requests below returns the same CAID in both responses:

<XRefList> <XRef ID="CA8815007" DB="ClinGen"/>

https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?api_key=986bd80f43a3ba6ec1bc7a50e7bda60c1b09&db=clinvar&rettype=vcv&is_variationid&from_esearch=true&id=325781

https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?api_key=986bd80f43a3ba6ec1bc7a50e7bda60c1b09&db=clinvar&rettype=vcv&is_variationid&from_esearch=true&id=1321358

So for some reason, the same CAID, CA8815007, is associated with both ClinVar entries (the first variant and the haplotype).
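A check along these lines could flag such collisions before a record is saved. It is a minimal sketch under stated assumptions: the XML strings are simplified, hypothetical fragments (only the `XRef` element with `DB="ClinGen"` is taken from this issue; real ClinVar responses are much larger), and `clingen_caids` and `find_shared_caids` are illustrative names, not existing VCI functions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Simplified, hypothetical response fragments keyed by ClinVar Variation ID.
# Only the XRef element itself is taken from this issue.
RESPONSES = {
    "325781": '<XRefList><XRef ID="CA8815007" DB="ClinGen"/></XRefList>',
    "1321358": '<XRefList><XRef ID="CA8815007" DB="ClinGen"/></XRefList>',
}


def clingen_caids(xml_text):
    """Return the IDs of all XRef elements whose DB attribute is ClinGen."""
    root = ET.fromstring(xml_text)
    return [x.get("ID") for x in root.iter("XRef") if x.get("DB") == "ClinGen"]


def find_shared_caids(responses):
    """Map each CAID to the ClinVar IDs that reference it, keeping only shared ones."""
    by_caid = defaultdict(list)
    for clinvar_id, xml_text in responses.items():
        for caid in clingen_caids(xml_text):
            by_caid[caid].append(clinvar_id)
    return {caid: ids for caid, ids in by_caid.items() if len(ids) > 1}


shared = find_shared_caids(RESPONSES)
for caid, clinvar_ids in shared.items():
    print(f"{caid} is referenced by ClinVar IDs {', '.join(clinvar_ids)}")
```

Any CAID reported here is attached to more than one ClinVar entry, so it cannot be used on its own to tell the haplotype apart from its component variant.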