Closed by chunyuma 2 years ago
Some more examples for @edeutsch: here are the 16 nodes in KG2.5.1c that have both `biolink:Drug` and `biolink:Disease` in their `all_categories`:
`match (n) where "biolink:Drug" in n.all_categories and "biolink:Disease" in n.all_categories return n.id, n.name, n.all_names`
n.id | n.name | n.all_names |
---|---|---|
"OMIM:MTHU003378" | "Polyuria" | ["Osmotic diuresis", "diuretic", "Polyuria", "Polyuric state", "polyuria", "Diuretics", "Diuresis", "Diuretic"] |
"MONDO:0012860" | "thrombophilia due to protein C deficiency, autosomal recessive" | ["Thrombophilia, Hereditary, Due To Protein C Deficiency, Autosomal Dominant", "Thrombophilia, Hereditary, Due To Protein C Deficiency, Autosomal Recessive", "Protein C", "protein C", "Protein c", "THROMBOPHILIA DUE TO PROTEIN C DEFICIENCY, AUTOSOMAL RECESSIVE; THPH4", "thrombophilia due to protein C deficiency, autosomal recessive"] |
"DRUGBANK:DB10572" | "Corn" | ["corn", "Corn"] |
"UniProtKB:P22301" | "IL-10" | ["Seronegative rheumatoid arthritis", "Interleukin-10", "Rheumatoid arthritis flare up", "Rheumatoid arthropathies", "Monoarthritic rheumatoid arthritis", "interleukin-10 (human)", "Rheumatoid Arthritis", "Arthritis, Rheumatoid", "RHEUMATOID ARTHRITIS; RA", "Interleukin 10", "interleukin-10", "Seropositive rheumatoid arthritis", "Rheumatoid arthritis aggravated", "Spinal rheumatoid arthritis", "Rheumatoid arthropathy", "Rheumatoid arthritis", "rheumatoid arthritis", "interleukin 10", "IL-10", "interleukin 10 [Source:HGNC Symbol;Acc:HGNC:5962]"] |
"UniProtKB:P41159" | "LEP" | ["LEP", "Leptin", "Obesity due to congenital leptin deficiency", "leptin (human)", "leptin", "leptin [Source:HGNC Symbol;Acc:HGNC:6553]", "Leptin deficiency or dysfunction", "obesity due to congenital leptin deficiency"] |
"UniProtKB:P05305" | "ET-1" | ["AURICULOCONDYLAR SYNDROME 3; ARCND3", "auriculocondylar syndrome 3", "Endothelin 1", "endothelin 1 [Source:HGNC Symbol;Acc:HGNC:3176]", "endothelin-1 (human)", "endothelin 1", "ET-1", "endothelin-1", "Endothelin-1"] |
"UniProtKB:P15502" | "ELN" | ["elastin", "elastin (human)", "cutis laxa, autosomal dominant 1", "Elastin", "elastin [Source:HGNC Symbol;Acc:HGNC:3327]", "CUTIS LAXA, AUTOSOMAL DOMINANT 1; ADCL1", "autosomal dominant cutis laxa 1", "ELN"] |
"UniProtKB:P07225" | "PROS1" | ["Protein S", "Protein s", "vitamin K-dependent protein S", "protein S [Source:HGNC Symbol;Acc:HGNC:9456]", "thrombophilia due to protein S deficiency, autosomal recessive", "THROMBOPHILIA DUE TO PROTEIN S DEFICIENCY, AUTOSOMAL RECESSIVE; THPH6", "protein S", "PROS1", "vitamin K-dependent protein S (human)", "Vitamin K-Dependent Protein S"] |
"UniProtKB:P04040" | "CAT" | ["catalase", "acatalasia", "Acatalasemia", "catalase [Source:HGNC Symbol;Acc:HGNC:1516]", "catalase (human)", "Cat", "Acatalasia", "Catalase", "CAT"] |
"MONDO:0020169" | "rare disorder with ptosis" | ["Ophthalmic Solution", "Mechanical ptosis", "Ptosis", "Eye solution", "Myogenic ptosis", "Aponeurotic ptosis", "ptosis", "Eyelid ptosis", "rare disorder with ptosis", "Prolapse", "Ptosis of eyelid", "Levator dehiscence", "ptosis (disease)", "Blepharoptosis", "Ophthalmic Solutions", "Eye drop", "Paralytic ptosis"] |
"MONDO:0005800" | "hordeolum" | ["hordeolum", "Hordeolum externum", "hordeolum externum", "Hordeolum", "External hordeolum", "Stye"] |
"MONDO:0005683" | "brucellosis" | ["Brucella melitensis", "Brucellosis", "Infection due to Brucella abortus", "Brucella suis", "brucellosis", "Brucella abortus", "Brucella canis"] |
"MONDO:0018076" | "tuberculosis" | ["Tuberculous", "Primary tuberculous complex, unspecified examination", "tuberculosis", "Sequelae of tuberculosis", "Other primary progressive tuberculosis, confirmed by animal inoculation", "Tuberculosis NOS aggravated", "Tuberculous Abscess", "Other primary progressive tuberculosis, confirmed by bacterial culture", "Primary tuberculous complex, confirmed by bacterial culture", "Late effects of tuberculosis of other specified organs", "Mycobacterium tuberculosis", "Tuberculosis, postpartum", "Primary tuberculous infection, unspecified type, confirmed histologically", "Primary tuberculous complex", "Primary tuberculous complex confirmed histologically", "Mycobacterium tuberculosis infection reactivated", "Primary tuberculous infection, unspecified type, unspecified examination", "Chronic tuberculosis", "Active tuberculosis", "Tuberculosis", "Old tuberculosis", "Tuberculosis, antepartum", "Other primary progressive tuberculosis, confirmed histologically", "Primary tuberculous infection, unspecified type", "Primary tuberculous infection, unspecified type, confirmed by bacterial culture", "Primary tuberculous complex, tubercle bacilli found (in sputum) by microscopy", "Primary tuberculous complex, confirmed by animal inoculation", "Tuberculous abscess"] |
"UMLS:C0302361" | "Shigella sonnei" | ["Shigella sonnei"] |
"MESH:D000432" | "Methanol" | ["Methanol", "Methyl Alcohol", "METHYL ALCOHOL", "methanol"] |
"UMLS:C0393947" | "Cholinergic crisis" | ["Increased gastric tonus", "Cholinergic Agents", "Cholinergic syndrome", "cholinergic drug", "Cholinergic crisis"] |
@edeutsch @amykglen @chunyuma is this still relevant?
I checked some of them and this problem seems to be fixed in the latest version of kg2c now.
Awesome, closing then.
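For anyone who wants to re-check a newer build without eyeballing rows, here is a minimal Python sketch of the same Drug/Disease overlap test applied to rows exported from the query above. The function name `has_conflicting_categories` and the record layout (dicts with `id`, `name`, `all_categories`) are assumptions for illustration. The first two records reuse ids from the table with only the two categories the query guarantees (any other categories are omitted), and `EXAMPLE:0001` is a made-up clean node.

```python
def has_conflicting_categories(node, group_a, group_b):
    """Return True if the node's all_categories hits both category groups."""
    cats = set(node["all_categories"])
    return bool(cats & set(group_a)) and bool(cats & set(group_b))


# Illustrative records only: ids/names come from the table above, but the
# all_categories lists are trimmed to the two categories the query matched on.
nodes = [
    {"id": "DRUGBANK:DB10572", "name": "Corn",
     "all_categories": ["biolink:Drug", "biolink:Disease"]},
    {"id": "MESH:D000432", "name": "Methanol",
     "all_categories": ["biolink:Drug", "biolink:Disease"]},
    # Hypothetical node that should NOT be flagged.
    {"id": "EXAMPLE:0001", "name": "hypothetical clean node",
     "all_categories": ["biolink:Drug"]},
]

flagged = [
    n["id"]
    for n in nodes
    if has_conflicting_categories(n, {"biolink:Drug"}, {"biolink:Disease"})
]
print(flagged)
```

An empty `flagged` list on a fresh export would be consistent with the problem being fixed, though it only covers the rows that were exported.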
Hi @edeutsch, here are several examples to help improve the NodeSynonymizer:
1. Some nodes have a `category` that is not one of `biolink:Disease`/`biolink:PhenotypicFeature`/`biolink:DiseaseOrPhenotypicFeature`, but their `all_categories` attribute contains these categories. Some of them should not belong to any of `biolink:Disease`/`biolink:PhenotypicFeature`/`biolink:DiseaseOrPhenotypicFeature`. These nodes can be queried with the DSL command: `match (n) where ((n.category<>'biolink:Disease' and n.category<>'biolink:PhenotypicFeature' and n.category<>'biolink:DiseaseOrPhenotypicFeature') and ('biolink:Disease' in n.all_categories or 'biolink:PhenotypicFeature' in n.all_categories or 'biolink:DiseaseOrPhenotypicFeature' in n.all_categories)) return n.id, n.category, n.all_categories limit 25`. I list a few of them below for reference:
2. Some nodes have a `category` that is not one of `biolink:Drug`/`biolink:ChemicalSubstance`, but their `all_categories` attribute contains these categories. These nodes can be queried with the DSL command: `match (n) where ((n.category<>'biolink:Drug' and n.category<>'biolink:ChemicalSubstance') and ('biolink:Drug' in n.all_categories or 'biolink:ChemicalSubstance' in n.all_categories)) return n.id, n.category, n.all_categories limit 25`. I list a few of them below for reference:
3. Some nodes have an `all_categories` attribute that contains both one of `biolink:Disease`/`biolink:PhenotypicFeature`/`biolink:DiseaseOrPhenotypicFeature` and one of `biolink:Drug`/`biolink:ChemicalSubstance`. These nodes can be queried with the DSL command: `match (n) where (('biolink:Disease' in n.all_categories or 'biolink:PhenotypicFeature' in n.all_categories or 'biolink:DiseaseOrPhenotypicFeature' in n.all_categories) and ('biolink:Drug' in n.all_categories or 'biolink:ChemicalSubstance' in n.all_categories)) return n.id, n.category, n.all_categories limit 25`. I list a few of them below for reference:
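Separately, since the queries above all share one structure, a small Python helper can generate them from the category groups instead of hand-editing each string. This is only a sketch: the names `DISEASE_LIKE`, `DRUG_LIKE`, `mismatch_query`, and `overlap_query` are hypothetical and not part of any existing tool, and the category values are interpolated directly into the string, which is fine for these fixed constants but not for untrusted input.

```python
DISEASE_LIKE = [
    "biolink:Disease",
    "biolink:PhenotypicFeature",
    "biolink:DiseaseOrPhenotypicFeature",
]
DRUG_LIKE = ["biolink:Drug", "biolink:ChemicalSubstance"]

# Same return clause and row limit as the hand-written queries.
RETURN_CLAUSE = "return n.id, n.category, n.all_categories limit 25"


def not_category(group):
    """Clause: n.category is none of the categories in the group."""
    return "(" + " and ".join(f"n.category<>'{c}'" for c in group) + ")"


def in_all_categories(group):
    """Clause: n.all_categories contains at least one category in the group."""
    return "(" + " or ".join(f"'{c}' in n.all_categories" for c in group) + ")"


def mismatch_query(group):
    """Nodes whose category is outside the group but whose all_categories hits it."""
    return (
        f"match (n) where ({not_category(group)} and "
        f"{in_all_categories(group)}) {RETURN_CLAUSE}"
    )


def overlap_query(group_a, group_b):
    """Nodes whose all_categories hits both groups."""
    return (
        f"match (n) where ({in_all_categories(group_a)} and "
        f"{in_all_categories(group_b)}) {RETURN_CLAUSE}"
    )


print(mismatch_query(DISEASE_LIKE))
print(mismatch_query(DRUG_LIKE))
print(overlap_query(DISEASE_LIKE, DRUG_LIKE))
```

Keeping the groups in one place means a renamed or added category only needs to change in a single list.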