erikyao closed this issue 1 year ago
Andrew's comment in Dec 20 Meeting:
Yao's TODO: report the number of removed predications.
Total number of predications: $90,703,432$. This number is obtained after:
C0056207|3075
will be counted as two predications.

N.B. Node normalizer lookups have not been performed yet.

Now consider predications with replaced CUIs:
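
The counting rule above can be sketched as follows. This is only an illustration, assuming that a pipe-delimited identifier such as `C0056207|3075` is split into one predication per identifier; the function name `expand_predication`, the predicate used in the example, and the placeholder object CUI are hypothetical and not taken from the actual pipeline.

```python
from itertools import product
from typing import List, Tuple

Triple = Tuple[str, str, str]


def expand_predication(subject: str, predicate: str, obj: str, sep: str = "|") -> List[Triple]:
    """Return one (subject, predicate, object) triple per identifier combination.

    Assumption: subject and object fields may hold several identifiers
    joined by `sep`, and each combination counts as its own predication.
    """
    subjects = subject.split(sep)
    objects = obj.split(sep)
    return [(s, predicate, o) for s, o in product(subjects, objects)]


# "C0000000" is a placeholder object CUI, not a value from the data.
expanded = expand_predication("C0056207|3075", "AFFECTS", "C0000000")
print(len(expanded))  # 2: one triple for C0056207 and one for 3075
for triple in expanded:
    print(triple)
```

Under this assumption the total is a count of expanded triples rather than of raw rows.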
For posterity (and only if it's easy to generate), can you provide some summary of the 669,924 predications that would be discarded? For example, are most of those because `aapp` got changed to `gngm` or vice versa?
But regardless, I'm comfortable moving forward with the exact semtype matching...

SUBJECT_SEMTYPE | SUBJECT_SEMTYPE_NAME | count |
---|---|---|
hcro | Health Care Related Organization | 116904 |
fndg | Finding | 27473 |
dsyn | Disease or Syndrome | 24603 |
tisu | Tissue | 23088 |
genf | Genetic Function | 21068 |
lbpr | Laboratory Procedure | 18223 |
patf | Pathologic Function | 17526 |
gngm | Gene or Genome | 16133 |
bpoc | Body Part, Organ, or Organ Component | 10744 |
ortf | Organ or Tissue Function | 10507 |

OBJECT_SEMTYPE | OBJECT_SEMTYPE_NAME | count |
---|---|---|
patf | Pathologic Function | 37985 |
genf | Genetic Function | 36366 |
dsyn | Disease or Syndrome | 32800 |
ortf | Organ or Tissue Function | 26308 |
fndg | Finding | 26290 |
sosy | Sign or Symptom | 17063 |
gngm | Gene or Genome | 12314 |
aapp | Amino Acid, Peptide, or Protein | 12219 |
anab | Anatomical Abnormality | 11565 |
lbpr | Laboratory Procedure | 7657 |

predicate | count |
---|---|
LOCATION_OF | 221965 |
PROCESS_OF | 83473 |
AFFECTS | 61902 |
COEXISTS_WITH | 37521 |
TREATS | 33710 |
USES | 28081 |
CAUSES | 25475 |
PART_OF | 23851 |
AUGMENTS | 20763 |
ASSOCIATED_WITH | 17609 |

Counts by `(subject_semtype, predicate, object_semtype)`:

SUBJECT_SEMTYPE | SUBJECT_SEMTYPE_NAME | PREDICATE | OBJECT_SEMTYPE | OBJECT_SEMTYPE_NAME | count |
---|---|---|---|---|---|
hcro | Health Care Related Organization | LOCATION_OF | resa | Research Activity | 53532 |
hcro | Health Care Related Organization | LOCATION_OF | lbpr | Laboratory Procedure | 31750 |
hcro | Health Care Related Organization | LOCATION_OF | diap | Diagnostic Procedure | 23061 |
fndg | Finding | PROCESS_OF | humn | Human | 20781 |
dsyn | Disease or Syndrome | PROCESS_OF | humn | Human | 11115 |
bpoc | Body Part, Organ, or Organ Component | LOCATION_OF | fndg | Finding | 9506 |
tisu | Tissue | LOCATION_OF | aapp | Amino Acid, Peptide, or Protein | 9332 |
gngm | Gene or Genome | LOCATION_OF | genf | Genetic Function | 9061 |
genf | Genetic Function | PROCESS_OF | gngm | Gene or Genome | 8734 |
mobd | Mental or Behavioral Dysfunction | PROCESS_OF | humn | Human | 7135 |
bpoc | Body Part, Organ, or Organ Component | LOCATION_OF | patf | Pathologic Function | 6732 |
bpoc | Body Part, Organ, or Organ Component | LOCATION_OF | anab | Anatomical Abnormality | 6377 |
lbpr | Laboratory Procedure | USES | lbpr | Laboratory Procedure | 6187 |
aapp | Amino Acid, Peptide, or Protein | AUGMENTS | ortf | Organ or Tissue Function | 6134 |
lbpr | Laboratory Procedure | USES | aapp | Amino Acid, Peptide, or Protein | 5792 |
bpoc | Body Part, Organ, or Organ Component | LOCATION_OF | aapp | Amino Acid, Peptide, or Protein | 5454 |
blor | Body Location or Region | LOCATION_OF | fndg | Finding | 5253 |
topp | Therapeutic or Preventive Procedure | TREATS | dsyn | Disease or Syndrome | 4388 |
dsyn | Disease or Syndrome | PROCESS_OF | mamm | Mammal | 4315 |
bpoc | Body Part, Organ, or Organ Component | LOCATION_OF | dsyn | Disease or Syndrome | 4149 |
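
Summary tables like the ones above could be produced with a simple group-and-count. The sketch below uses only the standard library; the field names (`subject_semtype`, `predicate`, `object_semtype`), the helper `top_counts`, and the three toy rows are assumptions for illustration, not the actual query or data.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

Predication = Dict[str, str]


def top_counts(
    predications: Iterable[Predication],
    key: Callable[[Predication], Hashable],
    n: int = 10,
) -> List[Tuple[Hashable, int]]:
    """Count predications per key and return the n most frequent keys."""
    return Counter(key(p) for p in predications).most_common(n)


# Toy rows standing in for the discarded predications (made up for the example).
discarded = [
    {"subject_semtype": "hcro", "predicate": "LOCATION_OF", "object_semtype": "resa"},
    {"subject_semtype": "hcro", "predicate": "LOCATION_OF", "object_semtype": "lbpr"},
    {"subject_semtype": "fndg", "predicate": "PROCESS_OF", "object_semtype": "humn"},
]

by_subject = top_counts(discarded, lambda p: p["subject_semtype"])
by_predicate = top_counts(discarded, lambda p: p["predicate"])
by_triple = top_counts(
    discarded,
    lambda p: (p["subject_semtype"], p["predicate"], p["object_semtype"]),
    n=20,
)

print(by_subject)
print(by_predicate)
print(by_triple)
```

The same helper covers the per-subject, per-object, per-predicate, and per-triple views by changing only the `key` function.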

`hcro` replacements (w/o semtype matching)

CUI1 | concept_name1 | semantic_type_abbreviation1 | CUI2 | concept_name2 | semantic_type_abbreviation2 |
---|---|---|---|---|---|
C1552516 | Specialty Group | hcro | C0220961 | UMLS Metathesaurus | inpr |
C1516172 | Cancer Center | hcro | C1513817 | NCI-Designated Cancer Center | hcro |
C0237680 | Residential Care Institutions | hcro | C0035186 | Residential Facilities | hcro |
C0237680 | Residential Care Institutions | hcro | C0035186 | Residential Facilities | mnob |
C1609437 | Primary care clinic | hcro | C1552443 | Clinic / Center - Primary Care | mnob |
C1609437 | Primary care clinic | hcro | C1552443 | Clinic / Center - Primary Care | hcro |
C0872261 | repository | hcro | C3847505 | Repository | mnob |
C1306377 | Postoperative anesthesia care unit | hcro | C0034871 | Recovery Room | hcro |
C1306377 | Postoperative anesthesia care unit | hcro | C0034871 | Recovery Room | mnob |
C1552447 | radiology facility | hcro | C1610162 | Radiology Clinic/Center | mnob |
C1552447 | radiology facility | hcro | C1610162 | Radiology Clinic/Center | hcro |
C0013967 | Emergency Service, Hospital | hcro | C0562508 | Accident and Emergency department | hcro |
C0013967 | Emergency Service, Hospital | hcro | C0562508 | Accident and Emergency department | mnob |
C0338036 | Doctor's office | hcro | C0031834 | Physicians' Offices | hcro |
C0338036 | Doctor's office | hcro | C0031834 | Physicians' Offices | mnob |
C1546895 | GlaxoSmithKline | hcro | C1552903 | SmithKline Beecham | hcro |
C1619637 | Hospital Psychiatric Units | hcro | C0870667 | Psychiatric hospital unit | hcro |
C1619637 | Hospital Psychiatric Units | hcro | C0870667 | Psychiatric hospital unit | mnob |
C1546882 | NABI | hcro | C1552896 | NABI | hcro |
C1546858 | Abbott Laboratories | hcro | C1552881 | Abbott Laboratories | hcro |
C1546873 | Merieux | hcro | C1552891 | Merieux | hcro |
C1512798 | Institute for Cancer Prevention | hcro | C1140168 | NCI Thesaurus | inpr |
C1546884 | Novartis Pharmaceutical Corporation | hcro | C1552897 | Novartis Pharmaceutical Corporation | hcro |
C4699045 | Stroke Center | hcro | C1136323 | Logical Observation Identifiers Names and Codes | inpr |
C3845566 | Rehabilitation facility | hcro | C0034993 | Rehabilitation Centers | hcro |
C3845566 | Rehabilitation facility | hcro | C0034993 | Rehabilitation Centers | mnob |
Great, this all looks fine to move forward. Please proceed with the update!
Thank you for the confirmation!
A CUI can have multiple semantic types. E.g.
When a retired CUI is mapped to a multi-semantic-typed CUI, it's reasonable to match the semantic types for precise replacement. E.g.
However, Colleen also mentioned that some highly related semantic types should be considered as matched. E.g.
The `aapp` :arrow_right: `gngm` match is also worth consideration.

Colleen and I came to the idea that:

`(C0021740,aapp)` :arrow_right: `(C3539881,aapp)`.

`(C1705981,aapp)` :arrow_right: `(C3539881,gngm)`

@newgene @andrewsu do you have any thoughts on the matching conditions? Or shall we carry out exact matching for all replacements? How about explicit whitelisting? Appreciate your thoughts!
Note that a replacement can apply to either subjects or objects, so the matching conditions may affect the semantic meaning of the predicates involved.
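
To make the options concrete, here is a minimal sketch of exact matching with an optional whitelist of related semantic types. The function `pick_replacement_semtype` and the `RELATED` whitelist are hypothetical; which pairs (if any) belong on a whitelist is exactly the open question above.

```python
from typing import AbstractSet, FrozenSet, Optional

# Hypothetical whitelist of semantic-type pairs treated as matching.
RELATED: AbstractSet[FrozenSet[str]] = {frozenset({"aapp", "gngm"})}


def pick_replacement_semtype(
    retired_semtype: str,
    new_semtypes: AbstractSet[str],
    related: AbstractSet[FrozenSet[str]] = frozenset(),
) -> Optional[str]:
    """Return the semtype to use on the replacement CUI, or None to discard.

    Exact matches win; otherwise a whitelisted related type is accepted.
    With an empty `related`, this is pure exact matching.
    """
    if retired_semtype in new_semtypes:
        return retired_semtype
    for semtype in sorted(new_semtypes):  # sorted for a deterministic choice
        if frozenset({retired_semtype, semtype}) in related:
            return semtype
    return None


# Replacement CUI typed as both aapp and gngm: exact match on aapp.
print(pick_replacement_semtype("aapp", {"aapp", "gngm"}))        # aapp
# Replacement only typed gngm: discarded under exact matching ...
print(pick_replacement_semtype("aapp", {"gngm"}))                # None
# ... but kept if aapp/gngm is whitelisted.
print(pick_replacement_semtype("aapp", {"gngm"}, RELATED))       # gngm
# hcro -> inpr (as in C1552516 -> C0220961) has no match either way.
print(pick_replacement_semtype("hcro", {"inpr"}, RELATED))       # None
# hcro -> {hcro, mnob} (as in C0237680 -> C0035186) matches exactly.
print(pick_replacement_semtype("hcro", {"hcro", "mnob"}))        # hcro
```

Because the check is independent of whether the CUI sits in the subject or object position, the same function would be applied to both sides of a predication.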