Georgetown-IR-Lab / QuickUMLS

System for Medical Concept Extraction and Linking
MIT License

Reducing RAM footprint + Adding preferred term output #67

Open fschlatt opened 3 years ago

fschlatt commented 3 years ago

The preferred term for every match is also returned (useful for normalizing terms in a text).

The RAM footprint is reduced by removing the sets in which the terms are accumulated. Instead, only a set of already saved terms is kept per concept. As a consequence, duplicate terms can be inserted into the simstring database when two identical terms belong to different UMLS concepts. As a fix, duplicates from the simstring database are removed when matching.
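A minimal sketch of the idea described above, not the actual code from this change. It assumes the input rows arrive grouped by concept (as they would if read in CUI order), so only one small set needs to be alive at a time. The function names `iter_unique_terms` and `dedupe_matches` are hypothetical and are not part of QuickUMLS.

```python
def iter_unique_terms(rows):
    """Yield (term, cui) pairs, skipping repeats of a term within one concept.

    Assumption: rows are grouped by CUI, so a single set for the current
    concept is enough and no global set of all terms is held in memory.
    """
    current_cui = None
    seen = set()
    for term, cui in rows:
        if cui != current_cui:
            # New concept: drop the previous concept's terms.
            current_cui = cui
            seen = set()
        if term in seen:
            continue
        seen.add(term)
        yield term, cui


def dedupe_matches(candidates):
    """Remove duplicate strings returned by the string index, keeping order.

    The same term can be stored once per concept, so a lookup may return
    it several times.
    """
    return list(dict.fromkeys(candidates))
```

The trade-off is that the same string can still be written to the string index once for each concept that contains it, which is why the second helper is applied at match time.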

fschlatt commented 3 years ago

Hmm, after finding a bug in my code which excluded a good portion of terms from the simstring DB, the RAM reduction isn't as much as I had hoped. A couple of things are fixed, but large UMLS sets still can't be processed.

CatalinaZ16 commented 3 years ago

Hi! Thanks for your help, but I still have the RAM problem (with 20 GB). Can you please tell me which UMLS configuration you used at the first step of installation? Which of these did you choose?

Active Subset: excludes "legacy" sources that have not been updated for several years in the UMLS Metathesaurus.

Level 0: contains vocabulary sources for which no additional license agreements are necessary beyond the UMLS license.

Level 0 + SNOMED CT: contains all Level 0 sources and SNOMED CT.

SNOMEDCT + SCTUSX: includes only SNOMED CT and the US Extension to SNOMED CT.

om35 commented 2 years ago

Hello, thank you for your commit. How can we get synonyms and sources (SNOMED, MSH, etc.)? For example, if we search for acromegaly, the UMLS API gives this output:

{'pageSize': 25, 'pageNumber': 1, 'result': {'classType': 'searchResults', 'results': [
{'ui': 'C0001206', 'rootSource': 'MTH', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C0001206', 'name': 'Acromegaly'},
{'ui': 'C0405578', 'rootSource': 'SNOMEDCT_US', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C0405578', 'name': 'Gigantism and acromegaly'},
{'ui': 'C0393839', 'rootSource': 'SNOMEDCT_US', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C0393839', 'name': 'Neuropathy in acromegaly'},
{'ui': 'C4038941', 'rootSource': 'SNOMEDCT_US', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C4038941', 'name': 'History of acromegaly'},
{'ui': 'C4552171', 'rootSource': 'MDR', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C4552171', 'name': 'Familial acromegaly'},
{'ui': 'C0410211', 'rootSource': 'SNOMEDCT_US', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C0410211', 'name': 'Myopathy in acromegaly'},
{'ui': 'C1386090', 'rootSource': 'ICPC2ICD10ENG', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C1386090', 'name': 'acromegaly; Marie'},
{'ui': 'C2048485', 'rootSource': 'MEDCIN', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C2048485', 'name': 'inactive acromegaly'},
{'ui': 'C1386089', 'rootSource': 'ICPC2ICD10ENG', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C1386089', 'name': 'acromegaly; arthropathy (manifestation)'},
{'ui': 'C0271548', 'rootSource': 'SNOMEDCT_US', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C0271548', 'name': 'Arthropathy associated with acromegaly'},
{'ui': 'C1274642', 'rootSource': 'SNOMEDCT_US', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C1274642', 'name': 'Hypermelanosis due to acromegaly'},
{'ui': 'C1386088', 'rootSource': 'ICPC2ICD10ENG', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C1386088', 'name': 'arthritis; acromegaly (manifestation)'},
{'ui': 'C0948390', 'rootSource': 'MDR', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C0948390', 'name': 'Pre-surgical treatment of acromegaly'},
{'ui': 'C0411963', 'rootSource': 'SNOMEDCT_US', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C0411963', 'name': 'Skeletal survey - acromegaly'},
{'ui': 'C0342356', 'rootSource': 'SNOMEDCT_US', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C0342356', 'name': 'Ectopic GHRH secretion causing acromegaly'},
{'ui': 'C4304407', 'rootSource': 'SNOMEDCT_US', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C4304407', 'name': 'X-linked intellectual disability with acromegaly and hyperactivity syndrome'},
{'ui': 'C1543054', 'rootSource': 'LNC', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C1543054', 'name': 'VA C&P exam.acromegaly note:Find:Pt:{Setting}:Doc:{Role}'},
{'ui': 'C4521132', 'rootSource': 'OMIM', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C4521132', 'name': 'ACROMEGALY DUE TO PITUITARY ADENOMA 1'},
{'ui': 'C0346302', 'rootSource': 'MTH', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C0346302', 'name': 'Growth Hormone-Secreting Pituitary Adenoma'},
{'ui': 'C0271549', 'rootSource': 'ICPC2ICD10ENG', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C0271549', 'name': 'Renon-Delille'},
{'ui': 'C4012409', 'rootSource': 'OMIM', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C4012409', 'name': 'PITUITARY ADENOMA 2, GROWTH HORMONE-SECRETING'},
{'ui': 'C1545795', 'rootSource': 'LNC', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C1545795', 'name': 'VA C&P exam.acromegaly note'},
{'ui': 'C5214248', 'rootSource': 'LNC', 'uri': 'https://uts-ws.nlm.nih.gov/rest/content/2021AB/CUI/C5214248', 'name': 'VA C and P exam.acromegaly | {Setting} | Document ontology'}
], 'recCount': 23}}

How can we do this for the Georgetown-IR-Lab project, please?
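One possible way to get synonyms and source vocabularies offline, sketched under assumptions: it reads a local copy of the UMLS `MRCONSO.RRF` file directly and does not use any QuickUMLS API. The helper name `load_synonyms` is hypothetical. The column positions follow the pipe-delimited MRCONSO layout (CUI, LAT, TS, LUI, STT, SUI, ISPREF, AUI, SAUI, SCUI, SDUI, SAB, TTY, CODE, STR, ...). Treating the row with `TS == "P"`, `STT == "PF"` and `ISPREF == "Y"` as the preferred term is a common convention and is used here as an assumption.

```python
from collections import defaultdict

# Zero-based column positions in MRCONSO.RRF (pipe-delimited).
CUI, LAT, TS, STT, ISPREF, SAB, STR = 0, 1, 2, 4, 6, 11, 14


def load_synonyms(lines, language="ENG"):
    """Collect (term, source) pairs and one preferred term per CUI.

    `lines` is any iterable of MRCONSO.RRF rows, e.g. an open file handle.
    """
    synonyms = defaultdict(set)
    preferred = {}
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 15 or fields[LAT] != language:
            continue
        cui = fields[CUI]
        synonyms[cui].add((fields[STR], fields[SAB]))
        # Assumption: the first row flagged P / PF / Y is the preferred term.
        if fields[TS] == "P" and fields[STT] == "PF" and fields[ISPREF] == "Y":
            preferred.setdefault(cui, fields[STR])
    return synonyms, preferred
```

With the file opened as `open("MRCONSO.RRF", encoding="utf-8")` (the path is an assumption), the CUIs returned by a match can be looked up in `synonyms` to list each synonym together with its source abbreviation, such as `SNOMEDCT_US` or `MSH`. Note that this holds every selected row in memory, so on a large UMLS subset it may be worth filtering to the CUIs of interest while reading.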