TranslatorSRI / Babel

Babel creates cliques of equivalent identifiers across many biomedical vocabularies.
MIT License

Add PharmGKB mappings for genes, diseases, variants, and chemicals #116

Open gaurav opened 1 year ago

gaurav commented 1 year ago

As requested at https://github.com/TranslatorSRI/NodeNormalization/issues/171:

To incorporate the PharmGKB API as described in https://github.com/biothings/BioThings_Explorer_TRAPI/issues/556, we need node normalizer to resolve PharmGKB identifiers for genes, diseases, variants, and chemicals. Examples of each:

Gene: BRAF - https://www.pharmgkb.org/gene/PA25408
Disease: leukemia - https://www.pharmgkb.org/disease/PA444750
Variant: rs113488022 - https://www.pharmgkb.org/variant/PA166157522
Chemical: imatinib - https://www.pharmgkb.org/chemical/PA10804
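For illustration, here is a minimal sketch of how the page URLs above could be turned into CURIEs. The `PHARMGKB.*` prefixes are the ones used in the example NodeNorm queries later in this thread; whether they end up being the registered Biolink prefixes is an assumption, and `pharmgkb_url_to_curie` is a hypothetical helper, not something that exists in Babel.

```python
from urllib.parse import urlparse

# Assumed prefixes, taken from the example queries in this thread.
# The final Biolink prefixes may differ.
PREFIX_BY_PATH = {
    "gene": "PHARMGKB.GENE",
    "disease": "PHARMGKB.DISEASE",
    "variant": "PHARMGKB.VARIANT",
    "chemical": "PHARMGKB.CHEMICAL",
}


def pharmgkb_url_to_curie(url):
    """Convert a PharmGKB page URL such as .../gene/PA25408 into a CURIE."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 2 or parts[0] not in PREFIX_BY_PATH:
        raise ValueError(f"Not a recognised PharmGKB entity URL: {url}")
    entity_type, accession = parts
    return f"{PREFIX_BY_PATH[entity_type]}:{accession}"


if __name__ == "__main__":
    print(pharmgkb_url_to_curie("https://www.pharmgkb.org/gene/PA25408"))
```

Keying the prefix on the URL path segment keeps the four entity types separate, since the `PA...` accessions on their own do not say which type they belong to.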

Mappings can be found in the "Primary Data" section of https://www.pharmgkb.org/downloads

Related issue to add prefixes to biolink model: https://github.com/biolink/biolink-model/issues/1236

Order of work:

andrewsu commented 10 months ago

Just wanted to check on the status of this request. Any update? (For convenience, just noting some example tests...)

https://nodenorm.transltr.io/1.3/get_normalized_nodes?curie=PHARMGKB.GENE:PA25408
https://nodenorm.transltr.io/1.3/get_normalized_nodes?curie=PHARMGKB.CHEMICAL:PA10804
https://nodenorm.transltr.io/1.3/get_normalized_nodes?curie=PHARMGKB.DISEASE:PA444750
https://nodenorm.transltr.io/1.3/get_normalized_nodes?curie=PHARMGKB.PATHWAYS:PA2023
https://nodenorm.transltr.io/1.3/get_normalized_nodes?curie=PHARMGKB.VARIANT:PA166157522
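These checks could also be scripted. The sketch below rebuilds the same query URLs and reports which CURIEs do not resolve. It assumes that `get_normalized_nodes` returns a JSON object keyed by the input CURIE, with `null` for identifiers it cannot normalize; the helper names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://nodenorm.transltr.io/1.3/get_normalized_nodes"

# The example CURIEs from this comment.
EXAMPLE_CURIES = [
    "PHARMGKB.GENE:PA25408",
    "PHARMGKB.CHEMICAL:PA10804",
    "PHARMGKB.DISEASE:PA444750",
    "PHARMGKB.PATHWAYS:PA2023",
    "PHARMGKB.VARIANT:PA166157522",
]


def nodenorm_url(curie):
    """Build the query URL for one CURIE, leaving the colon unescaped."""
    return f"{BASE_URL}?{urlencode({'curie': curie}, safe=':')}"


def unresolved_curies(response):
    """Return the CURIEs that came back as null (assumed response shape)."""
    return [curie for curie, node in response.items() if node is None]


def check(curies=EXAMPLE_CURIES):
    """Query every CURIE (needs network access) and list the unresolved ones."""
    missing = []
    for curie in curies:
        with urlopen(nodenorm_url(curie)) as resp:
            missing.extend(unresolved_curies(json.load(resp)))
    return missing
```

Calling `check()` with no arguments would run all five examples; an empty list would mean every PharmGKB identifier normalized.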

gaurav commented 8 months ago

Looking at their downloads, genes and chemicals (and drugs) have cross-mappings with identifiers already in NodeNorm, so those should be relatively easy to ingest. Variants and diseases are going to be trickier: variants don’t have cross-mappings with OMIM, which is the only source of variants currently in NodeNorm (I think), and diseases/phenotypes don’t have cross-mappings with MONDO and HP, which is what we most rely on (particularly for autocomplete). So I’m concerned that ingesting those without knowing how to merge them into the existing variant and disease/phenotype cliques could get tricky.
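As a rough sketch of what the "easy" case might look like: read a PharmGKB download, keep only the cross-references whose prefix NodeNorm already knows, and emit pairs that could feed a concordance. The column names, the comma-separated cross-reference format, and the `KNOWN_PREFIXES` set are all assumptions made for this example and have not been checked against the actual files; the sample row is illustrative rather than copied from PharmGKB.

```python
import csv
import io

# Hypothetical set of prefixes that already have cliques in NodeNorm.
KNOWN_PREFIXES = {"HGNC", "NCBIGene"}


def concordance_pairs(tsv_text, pharmgkb_prefix,
                      id_column="PharmGKB Accession Id",
                      xref_column="Cross-references"):
    """Yield (PharmGKB CURIE, cross-reference CURIE) pairs from a TSV.

    Column names and the comma-separated cross-reference format are
    assumptions about the download, not a documented layout.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        subject = f"{pharmgkb_prefix}:{row[id_column]}"
        for xref in (row.get(xref_column) or "").split(","):
            xref = xref.strip()
            if ":" not in xref:
                continue
            # Skip identifiers from vocabularies we could not merge anyway.
            if xref.split(":", 1)[0] in KNOWN_PREFIXES:
                yield subject, xref


# Illustrative sample row only.
SAMPLE = (
    "PharmGKB Accession Id\tSymbol\tCross-references\n"
    "PA25408\tBRAF\tHGNC:1097, UnknownDB:42\n"
)

if __name__ == "__main__":
    for pair in concordance_pairs(SAMPLE, "PHARMGKB.GENE"):
        print(pair)
```

Filtering on known prefixes is what makes genes and chemicals straightforward. For variants and diseases the filter would leave few or no usable pairs, which is the merging problem described above.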

gaurav commented 8 months ago

A lot of the MESH mappings they have connect to cliques already in Babel (even if they don't always connect with MONDO), so adding these mappings shouldn't make things worse.