biothings / biomedical_id_resolver.js

js library for converting biological ids in batch
Apache License 2.0
1 stars 2 forks source link

discuss: change CLINVAR id from RCV to Variant ID #69

Closed colleenXu closed 3 years ago

coveralls commented 3 years ago

Pull Request Test Coverage Report for Build 1140665613


Totals Coverage Status
Change from base Build 1137355909: 0.0%
Covered Lines: 439
Relevant Lines: 444

💛 - Coveralls
newgene commented 3 years ago

@colleenXu what's the justification to switch to using ClinVar variant_id instead of RCV id? Is that the primary ID for Clinvar in Biolink model? Just need a description to document why we made this change.

I also suggest to remove dbnsfp.clinvar.clinvar_id, clinvar field should be the canonical source for ClinVar.

colleenXu commented 3 years ago

@newgene There are many identifiers in Clinvar, and it's not clear to me which ID the biolink-model is referring to with the CLINVAR prefix. See the id_prefixes here (wait...there's CLINVAR and ClinVarVariant? What are these?).

Based on my reading on Clinvar identifiers and this FAQ on RCV, it seemed that RCV were more like an association's ID (phenotype/disease + variant).

So I thought maybe a different ID from clinvar refers to the SequenceVariant itself and should be used...

According to the linked Clinvar doc on identifiers, the closest to a "unique variant ID" is maybe a "VariationID" or "AlleleID"?

newgene commented 3 years ago

OK, if this does not link to a known issue or expected use case, I would hold off this one.

We are making changes to ClinVar source at MyVariant.info now: https://github.com/biothings/myvariant.info/issues/127. Will reevaluate this after this myvariant issue is closed.

colleenXu commented 3 years ago

Based on the conversation in NCATSTranslator Slack, the CLINVAR ID prefix for SequenceVariant refers to the "Variation ID" in Clinvar.

My recommendation is to go forward with this PR, once I add a commit to remove the dbnsfp field mapping and resolve merge conflicts...

I've confirmed that the clinvar.variant_id field in MyVariant corresponds to the Clinvar Variation ID. For example, see how the clinvar.variant_id:596318 corresponds to the record here that states that the Clinvar Variation ID is also 596318