EnvironmentOntology / environmental-exposure-ontology

Modular environmental exposures ontology

Issues created by ExO and NCIT OBO imports #161

Open dillerm opened 3 years ago

dillerm commented 3 years ago

I was looking at the overall taxonomy of ECTO, and the reuse of ExO and NCIT classes seems unnecessary; it may even detract from the quality of the ontology more than it adds to it.

The two main issues that are introduced by ExO are a rather disjointed class hierarchy (since ExO does not use BFO for its top level hierarchy) and circularity (since the four top-level ExO classes are all defined in terms of each other). I suspect the former is why 'exposure event or process' isn't simply 'exposure process'. Another potential problem that arises from this disjointedness is that related-via-exposure-to and each of its subproperties conflict with the textual definition for causally-related-to, which is supposed to act more narrowly as a relationship between occurrents/processes or between material entities. Some of these issues look like they can be done away with by removing 'exposure event' and any assertions between it and other ECTO classes. From what I can tell, you would lose little, if anything at all, from doing this.
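
To make the circularity point concrete, here is a minimal sketch of how one might check a set of class definitions for cycles. It assumes the definitions have already been reduced to a plain mapping from each class to the classes its definition mentions; the class names and the `find_definition_cycle` helper are hypothetical placeholders, not the actual ExO classes or any existing tool.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy data: each class maps to the classes its definition mentions.
# The names are placeholders, not the real ExO definitions.
DEFINED_IN_TERMS_OF: Dict[str, Set[str]] = {
    "class_a": {"class_b"},
    "class_b": {"class_c"},
    "class_c": {"class_d"},
    "class_d": {"class_a"},
    "class_e": set(),
}


def find_definition_cycle(graph: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one definitional cycle as a list of class names, or None."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state: Dict[str, int] = {node: UNSEEN for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = IN_PROGRESS
        path.append(node)
        # Sort dependencies so the result is deterministic.
        for dep in sorted(graph.get(node, ())):
            dep_state = state.get(dep, UNSEEN)
            if dep_state == IN_PROGRESS:
                # dep is already on the current path, so we have closed a loop.
                return path[path.index(dep):] + [dep]
            if dep_state == UNSEEN:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in sorted(graph):
        if state[node] == UNSEEN:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    print(find_definition_cycle(DEFINED_IN_TERMS_OF))
```

On the toy data this reports the loop through the four placeholder classes; a set of definitions that bottoms out in primitives would return `None`.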

NCIT, from which 82 classes are imported, creates additional issues beyond adding more disjointedness to the hierarchy. For one, many of the imported classes are redundant (e.g., 'Behavior' (NCIT:C16326), 'Personal behavior' (NCIT:C19683), and 'behavior' (GO:0007610)). And because these classes are then reused to define various exposure process/event classes, the redundancy proliferates (e.g., 'exposure to personal behavior' and 'exposure to behavior'). There are even redundancies between the NCIT classes themselves (e.g., 'Smoking' and 'Smoking behavior'). The confusion this might create for human users of the ontology is rather apparent upon examining the 19 or so classes in ECTO that are related to smoking and tobacco use. The other issue is the abundance of classification errors that NCIT comes with, such as 'Unemployment' (NCIT:C75563) being a subclass of 'Employment' (NCIT:C25172), which would almost certainly hurt a reasoner's performance, if not lead to outright contradictions.
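
As a rough illustration of how the simplest of these redundancies could be surfaced, the sketch below groups classes whose labels are identical after case and whitespace normalization. It assumes the imported classes are available as plain (ID, label) pairs; the `duplicate_label_groups` helper is hypothetical, and only the IDs and labels quoted above are used as data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (ID, label) pairs taken from the examples quoted in this comment.
CLASSES: List[Tuple[str, str]] = [
    ("NCIT:C16326", "Behavior"),
    ("NCIT:C19683", "Personal behavior"),
    ("GO:0007610", "behavior"),
]


def normalize(label: str) -> str:
    """Lowercase the label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def duplicate_label_groups(
    classes: Iterable[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Map each normalized label shared by two or more classes to their IDs."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for class_id, label in classes:
        groups[normalize(label)].append(class_id)
    return {
        label: sorted(ids) for label, ids in groups.items() if len(ids) > 1
    }


if __name__ == "__main__":
    print(duplicate_label_groups(CLASSES))
```

This only catches exact label collisions such as 'Behavior' and 'behavior'. Near-synonyms like 'Personal behavior', or 'Smoking' versus 'Smoking behavior', would still need manual review or a fuzzier comparison.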

cmungall commented 3 years ago

these are all good points

we should use a top level class from COB https://github.com/OBOFoundry/COB/issues/43 in place of the exo root

behavior is more challenging, NBO is not appropriate here. I also recommended NCIT activities here: https://github.com/obo-behavior/behavior-ontology/issues/109

but you point out some issues we need to be aware of. Perhaps we can make tickets here https://github.com/NCI-Thesaurus/thesaurus-obo-edition/issues and hope for changes and then wrangle a suitable subset as imports?

I put in a cob ticket for behavior:

https://github.com/OBOFoundry/COB/issues/156

dillerm commented 3 years ago

It looks like @matentzn is a step ahead of us (https://github.com/obo-behavior/behavior-ontology/issues/109) on the overlapping classes between NBO and GO, which should address some of the problems in ECTO with respect to behavior classes.

diatomsRcool commented 3 years ago

I agree with using the COB root instead of the ExO root and I can make that change as soon as COB is available. I will also use a COB behavior term when that is available. NCIT is more of a challenge for me. I'm not sure what to do other than make issues for them as @cmungall suggests.

dillerm commented 3 years ago

@cmungall I went ahead and reached out to NCIT about the classification error between 'Unemployment' and 'Employment' and they have since corrected it (they were surprisingly fast about it). I'm not sure when it will appear in NCIT OBO, if it hasn't already. Any idea which subset of NCIT you're interested in keeping going forward?