cancerDHC / umls-rrf-scala

A very basic library for parsing files in the UMLS RRF format
MIT License

Investigate an instance where we didn't match a SNOMED term correctly through UMLS #18

Open gaurav opened 3 years ago

gaurav commented 3 years ago

David Clunie noted an instance where we didn't make a match through the UMLS but were able to find it through the EBI OLS:

I was interested to see that some of the SNOMED FMA anatomy matches were not apparently found in UMLS, even though the EBI Ontology Lookup Service matched them. Not sure if this has anything to do with DICOM's choice of "structure of" rather than "entire" SNOMED codes for anatomy (although I think UMLS matches the "structure of" rather than "entire" usually also). Or it may be that your UMLS search was for some reason incomplete. E.g., SCT:34625003 (structure of) "medial common iliac lymph node" maps to UMLS:C0229808, which maps to FMA:16641, which your matching did not pick up from UMLS, only EBI. This may be confounded by the node "group" concept in SNOMED and FMA (which DICOM does not use); i.e., SCT:245298008 "Medial common iliac lymph node group (body structure)" is also mapped to UMLS:C0229808 in the UMLS metathesaurus (and FMA:71820).

I suspect that one of these mappings might have been missing in the 2019 UMLS release that we've been using, so it might work correctly once we start using the 2020 UMLS release (#22). If not, this issue is for digging deeper into whether this is a result of the structure-vs-group distinction in mapping or caused by something else.
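
To check whether a given UMLS release actually contains both halves of this mapping, something like the sketch below could be run against MRCONSO.RRF. It is only an illustration and not this library's API: `MrconsoLookup`, `Atom`, `parseLine`, `crossMap`, `sampleRow` and `demo` are hypothetical names. It assumes the standard MRCONSO.RRF column order (CUI in column 0, SAB in column 11, CODE in column 13, STR in column 14), that SNOMED CT atoms use the SAB `SNOMEDCT_US`, and that FMA atoms use the SAB `FMA` with the bare numeric ID as CODE. The sample rows are made up from the identifiers quoted above, not copied from a UMLS release; in particular, placing FMA:71820 under the same CUI is an assumption.

```scala
// Hypothetical sketch; not part of umls-rrf-scala.
object MrconsoLookup {
  // One MRCONSO.RRF row, keeping only the columns needed for cross-mapping.
  final case class Atom(cui: String, sab: String, code: String, str: String)

  // MRCONSO.RRF is pipe-delimited with a trailing pipe. The -1 limit keeps
  // empty trailing fields so that column positions stay stable.
  def parseLine(line: String): Option[Atom] = {
    val f = line.split("\\|", -1)
    if (f.length < 15) None
    else Some(Atom(cui = f(0), sab = f(11), code = f(13), str = f(14)))
  }

  // Find every (CUI, code) pair in `toSab` that shares a CUI with the
  // atom identified by (`fromSab`, `fromCode`).
  def crossMap(atoms: Seq[Atom], fromSab: String, fromCode: String,
               toSab: String): Set[(String, String)] = {
    val cuis = atoms.collect {
      case a if a.sab == fromSab && a.code == fromCode => a.cui
    }.toSet
    atoms.collect {
      case a if cuis(a.cui) && a.sab == toSab => (a.cui, a.code)
    }.toSet
  }

  // Build an illustrative 18-column row; all fields other than CUI, SAB,
  // CODE and STR are placeholders, not real UMLS identifiers.
  def sampleRow(cui: String, sab: String, code: String, str: String): String =
    Seq(cui, "ENG", "P", "L0", "PF", "S0", "Y", "A0", "", "", "",
        sab, "PT", code, str, "0", "N", "").mkString("|") + "|"

  // Made-up rows based on the identifiers mentioned in this issue.
  def demo(): Set[(String, String)] = {
    val rows = Seq(
      sampleRow("C0229808", "SNOMEDCT_US", "34625003", "medial common iliac lymph node"),
      sampleRow("C0229808", "SNOMEDCT_US", "245298008", "Medial common iliac lymph node group"),
      sampleRow("C0229808", "FMA", "16641", "medial common iliac lymph node"),
      sampleRow("C0229808", "FMA", "71820", "medial common iliac lymph node group")
    )
    val atoms = rows.flatMap(parseLine)
    crossMap(atoms, "SNOMEDCT_US", "34625003", "FMA")
  }
}
```

If the same lookup on the real 2019 file returns an empty set for `34625003`, the missing-mapping theory holds. If it returns more than one FMA code for the same CUI, as in the made-up rows here, the structure-vs-group ambiguity is the more likely culprit.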