callahantiff / OMOP2OBO

OMOP2OBO: A Python Library for mapping OMOP standardized clinical terminologies to Open Biomedical Ontologies
http://tiffanycallahan.com/OMOP2OBO_Dashboard
MIT License

Harmonise OMOP mappings with boomer or a boomer-like approach? #72

Open matentzn opened 1 year ago

matentzn commented 1 year ago

Currently, the mappings generated by omop2obo do not respect the semantic constraints of all participating ontologies (which makes some sense, since enforcing them would have a significant negative impact on performance).

For example, "Malignant melanoma of skin of external auditory canal (disorder)" in OMOP is mapped to "benign connective and soft tissue neoplasm" in MONDO (among more than 1000 others), which is not ideal (unless I made a mistake when reading the omop2obo data). Mappings like this could be weeded out using approaches from the "ontology merging" community, such as https://github.com/INCATools/boomer.

134294 SubClassOf Nothing

Is there any way to guarantee for the OMOP2OBO mappings that:

  1. applying the mappings does not lead to equivalence cycles involving more than 1 ID from any given ID space in OBO,
  2. merging the mappings with the ontologies does not lead to unsatisfiable classes (like the one above), and
  3. there is a 1:1 mapping table that contains only the "best" mapping between each OMOP ID and each OBO ontology?
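
To make points 1 and 3 concrete, here is a minimal Python sketch of the two checks. It is not how boomer or omop2obo work: boomer uses probabilistic reasoning over the ontology axioms, while this only groups IDs with a union-find and picks the top-scoring row per target. The function names, the `(left, right)` and `(omop_id, obo_id, score)` tuple layouts, the placeholder IDs, and the existence of a numeric score per mapping are all assumptions made for illustration. Point 2 (unsatisfiable classes) needs an OWL reasoner and is not covered here.

```python
from collections import defaultdict


def find_conflicting_cliques(equivalences):
    """Group IDs linked by equivalence mappings and report every group that
    contains more than one ID from the same ID space (CURIE prefix)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Treat each mapping as an undirected equivalence edge.
    for left, right in equivalences:
        parent[find(left)] = find(right)

    cliques = defaultdict(set)
    for node in list(parent):
        cliques[find(node)].add(node)

    conflicts = []
    for members in cliques.values():
        by_prefix = defaultdict(set)
        for curie in members:
            by_prefix[curie.split(":", 1)[0]].add(curie)
        clashing = {p: sorted(ids) for p, ids in by_prefix.items() if len(ids) > 1}
        if clashing:
            conflicts.append(clashing)
    return conflicts


def best_mapping_per_ontology(mappings):
    """Keep only the highest-scoring row per (OMOP ID, ontology prefix)."""
    best = {}
    for omop_id, obo_id, score in mappings:
        key = (omop_id, obo_id.split(":", 1)[0])
        if key not in best or score > best[key][1]:
            best[key] = (obo_id, score)
    return best


# Placeholder IDs, not real OMOP or MONDO identifiers.
equivalences = [("OMOP:1", "MONDO:A"), ("OMOP:1", "MONDO:B"), ("OMOP:2", "HP:X")]
conflicts = find_conflicting_cliques(equivalences)

scored = [("OMOP:1", "MONDO:A", 0.9), ("OMOP:1", "MONDO:B", 0.4), ("OMOP:1", "HP:X", 0.7)]
best = best_mapping_per_ontology(scored)
```

In this toy input, `OMOP:1` pulls two MONDO IDs into one equivalence group, so that group is reported as a conflict, and the 1:1 table keeps a single MONDO row and a single HP row for `OMOP:1`.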

This is a hugely difficult issue.

callahantiff commented 1 year ago

Hey @matentzn -- confirming that I saw this and agree that it's a very important issue. Will need a few weeks to properly respond, but it's on my list!