amykglen closed this issue 1 year ago
ok, so in the new synonymizer, it's become clear that this conflation (type 2 diabetes with coronary artery disease) is coming from the SRI:
Cluster for MONDO:0005148 has 26 nodes:
id | category | name | in_SRI | in_KG2pre | is_cluster_rep |
---|---|---|---|---|---|
DOID:9352 | Disease | type 2 diabetes mellitus | X | X | |
EFO:0001360 | Disease | obsolete_type II diabetes mellitus | X | X | |
HP:0005978 | Disease | Type II diabetes mellitus | X | X | |
ICD10:E11 | Disease | | X | | |
KEGG.DISEASE:04930 | Disease | | X | | |
MEDDRA:10012611 | Disease | | X | | |
MEDDRA:10012613 | Disease | | X | | |
MEDDRA:10026947 | Disease | | X | | |
MEDDRA:10029402 | Disease | | X | | |
MEDDRA:10029505 | Disease | | X | | |
MEDDRA:10045242 | Disease | | X | | |
MEDDRA:10067585 | Disease | | X | | |
MESH:D003924 | Disease | Diabetes Mellitus, Type 2 | X | X | |
MONDO:0005148 | Disease | type 2 diabetes mellitus | X | X | X |
NCIT:C26747 | Disease | Type 2 Diabetes Mellitus | X | X | |
OMIM:125853 | Disease | Type 2 diabetes mellitus related phenotypic feature | X | X | |
SNOMEDCT:44054006 | Disease | | X | | |
UMLS:C0011860 | Disease | Diabetes Mellitus, Non-Insulin-Dependent | X | X | |
UMLS:C1840169 | Disease | CORONARY ARTERY DISEASE, SUSCEPTIBILITY TO | X | X | |
UMLS:C1852091 | Disease | INSULIN RESISTANCE, SUSCEPTIBILITY TO | X | X | |
UMLS:C2674662 | Disease | PON1 ENZYME ACTIVITY, VARIATION IN | X | X | |
UMLS:C2674663 | Disease | ORGANOPHOSPHATE POISONING, SUSCEPTIBILITY TO | X | X | |
UMLS:C2674665 | Disease | MICROVASCULAR COMPLICATIONS OF DIABETES, SUSCEPTIBILITY TO, 5 (finding) | X | X | |
UMLS:C3149706 | Disease | CORONARY ARTERY SPASM 2, SUSCEPTIBILITY TO | X | X | |
UMLS:C4017238 | Disease | TYPE 2 DIABETES MELLITUS, PROTECTION AGAINST | X | X | |
UMLS:CN244395 | Disease | | X | | |
(we don't do any re-clustering of SRI nodes, so the fact that these nodes are 'in_SRI' means the SRI assigned them to this cluster, along with the other listed SRI nodes)
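for anyone who wants to eyeball a cluster like this quickly, here's a rough sketch of a name-based sanity check. It is not part of the synonymizer: `flag_suspicious` and the `keyword` heuristic are hypothetical helpers made up for illustration, and the names are just copied from the named rows of the table above.

```python
# Names copied from the cluster table (only rows that have a name).
CLUSTER_NAMES = {
    "DOID:9352": "type 2 diabetes mellitus",
    "EFO:0001360": "obsolete_type II diabetes mellitus",
    "HP:0005978": "Type II diabetes mellitus",
    "MESH:D003924": "Diabetes Mellitus, Type 2",
    "MONDO:0005148": "type 2 diabetes mellitus",
    "NCIT:C26747": "Type 2 Diabetes Mellitus",
    "OMIM:125853": "Type 2 diabetes mellitus related phenotypic feature",
    "UMLS:C0011860": "Diabetes Mellitus, Non-Insulin-Dependent",
    "UMLS:C1840169": "CORONARY ARTERY DISEASE, SUSCEPTIBILITY TO",
    "UMLS:C1852091": "INSULIN RESISTANCE, SUSCEPTIBILITY TO",
    "UMLS:C2674662": "PON1 ENZYME ACTIVITY, VARIATION IN",
    "UMLS:C2674663": "ORGANOPHOSPHATE POISONING, SUSCEPTIBILITY TO",
    "UMLS:C2674665": "MICROVASCULAR COMPLICATIONS OF DIABETES, "
                     "SUSCEPTIBILITY TO, 5 (finding)",
    "UMLS:C3149706": "CORONARY ARTERY SPASM 2, SUSCEPTIBILITY TO",
    "UMLS:C4017238": "TYPE 2 DIABETES MELLITUS, PROTECTION AGAINST",
}


def flag_suspicious(names, keyword="diabetes"):
    """Return the CURIEs whose name lacks the keyword (case-insensitive).

    Hypothetical heuristic for illustration only: a missing keyword is a
    hint that a node may not belong, not proof of it.
    """
    return sorted(
        curie for curie, name in names.items()
        if keyword not in name.lower()
    )


suspicious = flag_suspicious(CLUSTER_NAMES)
for curie in suspicious:
    print(curie, "|", CLUSTER_NAMES[curie])
```

a keyword match like this is crude: it won't flag entries such as `TYPE 2 DIABETES MELLITUS, PROTECTION AGAINST` that mention diabetes but are still questionable, so it's only a first pass before looking at the cluster by hand.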
I wrote this up in: https://github.com/TranslatorSRI/NodeNormalization/issues/189
I'm gonna close this issue since it's in the SRI's hands
noticed this conflation while working on RTXteam/RTX-KG2#210:
https://arax.ncats.io/?term=DOID:9352
in particular, these nodes from the above page don't seem like they belong in the type 2 diabetes synonym cluster:
there are other very questionable concepts in this cluster as well (like `MICROVASCULAR COMPLICATIONS OF DIABETES, SUSCEPTIBILITY TO, 5 (finding)`), but the four in the above table are clearly incorrect.

the conflation doesn't seem to be due to `same_as` edges based on a quick look at the KG2pre neo4j... so I'm not sure where it's coming from. maybe the SRI normalizer?
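one way to test the SRI hypothesis would be to ask the node normalizer directly whether it puts two CURIEs in the same cluster. The sketch below assumes the service exposes a `get_normalized_nodes` endpoint taking a `curie` parameter and returns, per CURIE, an `equivalent_identifiers` list of `{"identifier": ...}` objects. The base URL is an assumption that should be verified, and the helper names and the `STUB` response (with placeholder `EX:` CURIEs) are made up for illustration, not real output.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL for the SRI Node Normalization service; verify before use.
NODE_NORM_URL = "https://nodenormalization-sri.renci.org/get_normalized_nodes"


def fetch_normalized(curie, base_url=NODE_NORM_URL):
    """Fetch the normalizer's response for one CURIE (makes a network call)."""
    query = urllib.parse.urlencode({"curie": curie})
    with urllib.request.urlopen(f"{base_url}?{query}", timeout=30) as resp:
        return json.load(resp)


def cluster_members(response, curie):
    """Return the set of identifiers the response lists as equivalent to curie."""
    entry = response.get(curie) or {}  # unknown CURIEs may map to null
    return {e["identifier"] for e in entry.get("equivalent_identifiers", [])}


def shares_cluster(response, curie, other):
    """True if `other` appears in the cluster reported for `curie`."""
    return other in cluster_members(response, curie)


# Hand-written stub with placeholder CURIEs, only to show the assumed shape.
STUB = {
    "EX:1": {
        "id": {"identifier": "EX:1"},
        "equivalent_identifiers": [
            {"identifier": "EX:1"},
            {"identifier": "EX:2"},
        ],
    }
}
print(shares_cluster(STUB, "EX:1", "EX:2"))
```

against the live service, the equivalent call would be something like `shares_cluster(fetch_normalized("MONDO:0005148"), "MONDO:0005148", "UMLS:C1840169")`; if that comes back true, the conflation is already present in the normalizer's own cluster.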