RTXteam / RTX

Software repo for Team Expander Agent (Oregon State U., Institute for Systems Biology, and Penn State U.)
https://arax.ncats.io/
MIT License

Synonymization error for UniProtKB:P18509 (PACAP) #1165

Closed amykglen closed 3 years ago

amykglen commented 3 years ago

There seems to be conflation of at least two concepts for UniProtKB:P18509: pituitary adenylate cyclase-activating polypeptide on the one hand, and pancreatic and cerebellar agenesis / pancreas transcription factor 1, alpha subunit on the other.

[Screenshot, 2020-12-14 9:34:10 AM]

Info on its coalesced nodes from KG2.3.4:

match (n) where n.id in ["HGNC:241", "UMLS:C1428235", "UMLS:C0084057", "UMLS:C1314954", "UMLS:C0071163", "UMLS:C1836780", "MESH:C563796", "OMIM:609069", "OMIM:607194", "PR:000003761", "ORPHANET:65288", "MONDO:0012192", "PR:P18509", "UniProtKB:P18509", "ENSEMBL:ENSG00000141433", "CHEMBL.TARGET:CHEMBL5692", "NCBIGene:116"] return n.id, n.name, n.iri
n.id n.name n.iri
"HGNC:241" "adenylate cyclase activating polypeptide 1" "https://identifiers.org/hgnc:241"
"UMLS:C1428235" "Pancreas transcription factor 1, alpha subunit" "https://identifiers.org/umls:C1428235"
"UMLS:C0084057" "Pituitary Adenylate Cyclase-Activating Polypeptide" "https://identifiers.org/umls:C0084057"
"UMLS:C1314954" "Pituitary Adenylate Cyclase-Activating Polypeptide" "https://identifiers.org/umls:C1314954"
"UMLS:C0071163" "Pituitary Adenylate Cyclase-Activating Polypeptide" "https://identifiers.org/umls:C0071163"
"UMLS:C1836780" "Pancreas transcription factor 1, alpha subunit" "https://identifiers.org/umls:C1836780"
"MESH:C563796" "Diabetes Mellitus, Permanent Neonatal, with Cerebellar Agenesis" "http://purl.bioontology.org/ontology/MESH/C563796"
"OMIM:609069" "Pancreatic and cerebellar agenesis" "http://purl.obolibrary.org/obo/OMIM_609069"
"OMIM:607194" "Pancreas transcription factor 1, alpha subunit" "http://purl.obolibrary.org/obo/OMIM_607194"
"PR:000003761" "pituitary adenylate cyclase-activating polypeptide" "http://purl.obolibrary.org/obo/PR_000003761"
"ORPHANET:65288" "Permanent neonatal diabetes mellitus - pancreatic and cerebellar agenesis" "http://purl.bioontology.org/ontology/ORDO/65288"
"MONDO:0012192" "permanent neonatal diabetes mellitus-pancreatic and cerebellar agenesis syndrome" "http://purl.obolibrary.org/obo/MONDO_0012192"
"PR:P18509" "pituitary adenylate cyclase-activating polypeptide (human)" "http://purl.obolibrary.org/obo/PR_P18509"
"UniProtKB:P18509" "PACAP" "https://identifiers.org/uniprot:P18509"
"ENSEMBL:ENSG00000141433" "adenylate cyclase activating polypeptide 1 [Source:HGNC Symbol;Acc:HGNC:241]" "https://identifiers.org/ensembl:ENSG00000141433"
"CHEMBL.TARGET:CHEMBL5692" "Pituitary adenylate cyclase-activating polypeptide" "https://identifiers.org/chembl.target:CHEMBL5692"
"NCBIGene:116" "adenylate cyclase activating polypeptide 1" "https://identifiers.org/ncbigene:116"
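
One way to make the conflation in a listing like the one above easier to spot is to group the coalesced nodes by a normalized form of their names. The following is only a rough sketch: the `normalize` and `group_by_name` helpers are hypothetical and not part of the RTX code base, and the rows are a hand-copied subset of the query results.

```python
import re
from collections import defaultdict


def normalize(name):
    """Lowercase a node name and drop trailing qualifiers such as "(human)" or "[Source:...]"."""
    name = re.sub(r"\s*[\(\[].*?[\)\]]", "", name)
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def group_by_name(rows):
    """Map each normalized name to the list of node ids that carry it."""
    groups = defaultdict(list)
    for node_id, name in rows:
        groups[normalize(name)].append(node_id)
    return dict(groups)


# Subset of the (n.id, n.name) pairs returned by the query above
rows = [
    ("HGNC:241", "adenylate cyclase activating polypeptide 1"),
    ("UMLS:C1428235", "Pancreas transcription factor 1, alpha subunit"),
    ("UMLS:C0084057", "Pituitary Adenylate Cyclase-Activating Polypeptide"),
    ("OMIM:609069", "Pancreatic and cerebellar agenesis"),
    ("PR:P18509", "pituitary adenylate cyclase-activating polypeptide (human)"),
    ("ENSEMBL:ENSG00000141433",
     "adenylate cyclase activating polypeptide 1 [Source:HGNC Symbol;Acc:HGNC:241]"),
]

groups = group_by_name(rows)
for name, ids in sorted(groups.items()):
    print(f"{name}: {', '.join(ids)}")
```

Name grouping is only a heuristic: several distinct names can legitimately belong to one concept, so a group count above one is a prompt for manual review rather than proof of an error.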

edeutsch commented 3 years ago

This is also mildly affected by #1164, but not in a major way.

The primary cause is my practice of recording OMIM abbreviations as well as the names, but these are clashing with gene symbols too much. So I think I will solve this by no longer indexing OMIM abbreviations. There's probably a better way to do this, but I will do this for now.

OMIM:609069 PANCREATIC AND CEREBELLAR AGENESIS; PACA    disease
OMIM:135900 COFFIN-SIRIS SYNDROME 1; CSS1   disease
OMIM:136120 FISH-EYE DISEASE; FED   disease
OMIM:136520 FOVEAL HYPOPLASIA 1; FVH1   disease
OMIM:615697 EPILEPSY, FAMILIAL TEMPORAL LOBE, 6; ETL6   disease
OMIM:104310 ALZHEIMER DISEASE 2; AD2    disease
etc.
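
A minimal sketch of that workaround, assuming each OMIM title has the form "NAME; ABBREVIATION" as in the lines above. The function names and the `include_abbreviations` flag are hypothetical and do not reflect the actual synonymizer code.

```python
def split_omim_title(title):
    """Split an OMIM title of the assumed form 'NAME; ABBREVIATION'."""
    name, _, abbreviation = title.partition(";")
    return name.strip(), (abbreviation.strip() or None)


def index_terms(title, include_abbreviations=False):
    """Return the lowercase terms to index for one OMIM title."""
    name, abbreviation = split_omim_title(title)
    terms = [name.lower()]
    # Abbreviations are skipped by default because they tend to clash with gene symbols
    if include_abbreviations and abbreviation:
        terms.append(abbreviation.lower())
    return terms


omim_titles = {
    "OMIM:609069": "PANCREATIC AND CEREBELLAR AGENESIS; PACA",
    "OMIM:136120": "FISH-EYE DISEASE; FED",
    "OMIM:615697": "EPILEPSY, FAMILIAL TEMPORAL LOBE, 6; ETL6",
}

index = {curie: index_terms(title) for curie, title in omim_titles.items()}
for curie, terms in index.items():
    print(curie, terms)
```

Only the first semicolon is treated as the separator, so commas inside a name such as "EPILEPSY, FAMILIAL TEMPORAL LOBE, 6" are left intact.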

amykglen commented 3 years ago

Appears resolved in KG2.6.7.1:

pancreatic and cerebellar agenesis: https://arax.ncats.io/?term=MESH:C563796
PACAP/ADCYAP1: https://arax.ncats.io/?term=UniProtKB:P18509

(I think PACAP and ADCYAP1 are equivalent for our purposes)