EnvironmentOntology / envo

A community-driven ontology for the representation of environments
http://www.environmentontology.org
Creative Commons Zero v1.0 Universal

Mixture terms require new imports #1006

Open laurenechan opened 4 years ago

laurenechan commented 4 years ago

Hello, @diatomsRcool and I are interested in developing some mixture terms in ENVO to use for exposure modeling in ECTO that require further imports from ChEBI, and one term from FoodOn. Those imports include these terms:

CHEBI:29021 | hexane
CHEBI:43098 | heptane
CHEBI:28798 | rubber particle
CHEBI:17824 | isopropanol
CHEBI:15347 | acetone
CHEBI:17578 | toluene
CHEBI:29362 | ethylene
CHEBI:46916 | vinyl acetate
CHEBI:53448 | methylcellulose
CHEBI:13643 | glycol
CHEBI:33101 | nitrogen dioxide
CHEBI:155903 | copper phthalocyanine
CHEBI:60766 | polyacrylamide polymer
FOODON:03302776 | soybean oil
CHEBI:26130 | pigment
CHEBI:17824 | propan-2-ol
CHEBI:17790 | methanol
CHEBI:16997 | propane-1,2-glycol
CHEBI:32234 | titanium dioxide
CHEBI:15368 | Acrolein
CHEBI:39478 | 1,3-Butadiene
CHEBI:16842 | Formaldehyde
CHEBI:30751 | Formic acid
CHEBI:17790 | Methanol
CHEBI:17578 | Toluene
CHEBI:51084 | Inorganic nitrates
CHEBI:24840 | Inorganic sulfates
CHEBI:26836 | sulfuric acid

diatomsRcool commented 4 years ago

Not sure what the process is here. We need these imported so we can use them in axiomatic definitions, for example: Rubber cement = manufactured product and has part some X and has part some X, etc. The imports above would be the X.

kaiiam commented 4 years ago

Hey @laurenechan and @diatomsRcool, perhaps we could use a Dead Simple OWL Design Pattern (DOSDP). See the ENVO entity attribute csv and yaml files for an example.

Using this we could produce compositional terms like manufactured X. Can you be more specific about the axioms you want, e.g. are you just going to have one (or a set number of) has part some X axiom per term? If so a DOSDP would be ideal, if not (and there are multiple possible has part relationships) we could employ a Robot template.

Let me know what you might need these terms to look like and we can plan accordingly.

laurenechan commented 4 years ago

Hi @kaiiam , here are some of the things we were hoping to be able to encompass in the mixture terms:

The ultimate goal of our use cases is this: if organism A is exposed to one variety of mixtures, environments, etc., and organism B is exposed to a completely different variety, yet both show a similar disease/phenotype/outcome, we want to identify which individual components (parts of mixtures) the two exposures share and which may therefore be related to the outcome.

If we have inconsistent numbers of components in each mixture, can we use a similar DOSDP to what you have for the ENVO entity attributes you indicated?

kaiiam commented 4 years ago

Hey @laurenechan, I'm wondering if you can leverage OWL equivalence axioms to allow for entailment and inference from the reasoner. I'd suggest you take a look at Chris Mungall's blog post on Rector normalization (the rest of the blog is also a treasure trove which I'm slowly working through).

The long and short of it is that when you have equivalent classes that express conditions, e.g., class 1 has properties A, B and C, and class 2 has properties A and B, then the reasoner can figure out that class 1 should be a subclass of class 2. Similarly here, mixtures will have sets of parts. If you list each out as a sum of its parts within OWL equivalence class axioms (as per @diatomsRcool's example, e.g., Rubber cement = manufactured product and has part some X1 and has part some X2, etc.), you'll allow the reasoner to figure out the hierarchy of mixture classes. I did a little demo similar to this idea here; just look at that comment (don't worry about the rest of the thread).
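A minimal Python sketch of the inference described above. It is not an OWL reasoner: each class is reduced to the set of parts its equivalence axiom requires, and a class is treated as a subclass of any other class whose required parts are a subset of its own. The class names and part sets are the hypothetical "class 1" and "class 2" from the paragraph, and the sketch assumes the part classes are unrelated to one another.

```python
# Toy model of definitional subsumption; not a real OWL reasoner.
# Each class is defined as "has part some P" for every P in its set.
# Class names and parts are hypothetical examples.
definitions = {
    "class 1": frozenset({"A", "B", "C"}),
    "class 2": frozenset({"A", "B"}),
}


def inferred_subclasses(defs):
    """Return (sub, super) pairs where sub requires every part super requires."""
    pairs = []
    for sub, sub_parts in defs.items():
        for sup, sup_parts in defs.items():
            if sub != sup and sup_parts <= sub_parts:
                pairs.append((sub, sup))
    return sorted(pairs)


print(inferred_subclasses(definitions))
```

Under these assumptions the only inferred pair is class 1 under class 2, which mirrors what a reasoner would do with the corresponding equivalence axioms.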

My suggestion would be for you to just play around with this idea. You could use ODK to launch yourself a test application ontology, or even simpler, just clone a "sandbox" version of ENVO to experiment in (just don't make any weird pushes or pull requests from the sandbox version).

Let me know if these suggestions seem useful to you and if you have any questions, or want to try this out, and want my help doing so. It would be ideal if you (or we) can figure out some consistent owl equivalence for the axioms of these classes. If so we could later maybe implement them within design patterns such as those produced by Robot or DOSDP.

laurenechan commented 4 years ago

Hi, I got the chance to do a little sandbox building for these mixtures and developed this file for use with Robot to create the terms I need (first tab here). Two terms (6 and 10) within this spreadsheet are already-requested terms that I think were accepted on a PR but haven't been released yet to be searchable; this will add logical axioms for their mixture components.

If ENVO is open to this option, I would be interested in adding these terms into ENVO for us to reference in ECTO terms.

@kaiiam @diatomsRcool

cmungall commented 4 years ago

Hi @laurenechan

I think robot is a decent choice here, as you have a variable number of members of the mixture, and the only way to do this in dosdp is with one pattern per cardinality (cc @matentzn).
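To illustrate the variable-membership point, here is a rough Python sketch that writes a ROBOT-style template as TSV, with one cell holding a pipe-separated ingredient list of any length. The template strings in the second row (`SC %`, `SC 'has part' some %`, `SPLIT=|`) are assumed forms that should be checked against the ROBOT template documentation; the IDs, labels, column names, and parent class are placeholders and do not come from the actual spreadsheet.

```python
import csv
import io

# Hypothetical mixtures; IDs and labels are placeholders, not real ENVO terms.
mixtures = [
    {"id": "ENVO:9999001", "label": "example mixture one",
     "parts": ["hexane", "heptane"]},
    {"id": "ENVO:9999002", "label": "example mixture two",
     "parts": ["acetone", "toluene", "methanol"]},
]

# Row 1: human-readable headers. Row 2: ROBOT template strings
# (assumed forms; verify against the ROBOT template documentation).
header = ["ID", "Label", "Parent", "Parts"]
template = ["ID", "LABEL", "SC %", "SC 'has part' some % SPLIT=|"]


def build_template_tsv(rows):
    """Serialise mixtures as TSV, joining each ingredient list with '|'."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerow(template)
    for m in rows:
        writer.writerow([m["id"], m["label"], "environmental material",
                         "|".join(m["parts"])])
    return buf.getvalue()


print(build_template_tsv(mixtures))
```

The same two-row header serves a mixture with two ingredients and one with three, which is the property a per-cardinality dosdp pattern lacks.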

your template is a great start. unfortunately we haven't yet documented the schema for mixtures in envo, some comments here:

Now I look at these more closely, I am not so sure a template buys you much. I think you may be better adding these the traditional way in Protege. You may also want to consider aligning the textual definition with axioms more. You may want to flesh out a hierarchy the traditional way. For example, you have various types of ink - I would just go ahead and add ink and add subclasses.

ddooley commented 4 years ago

I'm guessing the soybean oil mentioned above is food-grade soybean oil? I think the semantic wiggle room with a food product is that its manufacture meets health and organoleptic standards; whether it is actually used as food or goes to waste, or ends up in playdough or some other intention does not preclude referencing it.

matentzn commented 4 years ago

This is my mess, sorry guys.

> I don't think all of these should be equivalence axioms. In fact I don't think any of them should be. You'd have to close the membership list (at least) for these to be valid equivalence axioms

Isn't this what you did @laurenechan, making sure that the list of compounds in the mixture is exhaustive? So let's say they are exhaustive (if not, this is just wrong).

So what is your suggestion for these three cases:

1) A mixture A can be exhaustively defined as a mix of a set of compounds {C1, C2, C3}.
2) A mixture A can be partially defined as containing a set of compounds (and more, just not clear what).
3) A mixture A contains (among others) one of two compounds, for example, "sugar or sweetener".

Is it {A sub: hasCompound some C1, A sub hasCompound some C2,...} for both 1 and 2? And A sub hasCompound some (C4 or C5) is considered too rococo?

> don't include potential members with has-part. even if you weaken your axioms from EC to SC, it will still be an all-some axiom. potential members would be a some-some axiom

Absolutely. I thought @laurenechan checked that; if not, revise this. Potential members can't be included like this. Is there a reason why your use case would require potential members, Lauren?

> not clear the foodon class is correct here, as it bakes in the assumption the product is created with the intention of eating

Agree with @ddooley take on this.. Seems harmless enough to include?

> Now I look at these more closely, I am not so sure a template buys you much. I think you may be better adding these the traditional way in Protege. You may also want to consider aligning the textual definition with axioms more. You may want to flesh out a hierarchy the traditional way. For example, you have various types of ink - I would just go ahead and add ink and add subclasses.

The robot template is only an intermediate artefact and will be merged into whatever ontology file takes the mixtures. It's just easier like that during collaboration and while sorting out the patterns. Lauren has already compiled it into OWL and checked it in Protege. I agree that once we have at least some rudimentary agreement about these terms, maybe it's best to just assert the parts of the hierarchy that can't be safely inferred. For that to happen, we first need to know:

  1. what is the pattern for mixtures (see above), and
  2. is there any objection to adding these terms into ENVO.

@laurenechan the rest of the feedback here, like regarding the SIO terms and the CHEBI NTR, do you feel comfortable with dealing with these? I will help you with the rest.

kaiiam commented 4 years ago

@laurenechan

> you could make a CHEBI ticket

Simpler yet, just use the CHEBI submission tool to add the new terms yourself; it's not too difficult. You'll get a CHEBI id right away, which will show up in their release version fairly soon after.

> 1. what is the pattern for mixtures (see above), and
> 2. is there any objection to adding these terms into ENVO.

Agreed this is what needs to be sorted out first, perhaps @cmungall, @pbuttigieg and @diatomsRcool can weigh in here. If not ENVO where should they go?

cmungall commented 4 years ago

> If not ENVO where should they go?

I propose:

See

cmungall commented 4 years ago

> > not clear the foodon class is correct here, as it bakes in the assumption the product is created with the intention of eating
>
> Agree with @ddooley take on this.. Seems harmless enough to include?

Adding dependencies between ontologies is never harmless. See #945 and https://docs.google.com/document/d/1i0-mj_gY42h9Ko8ij4SQ4LvCAXKCRwXXSAFyNsbQomU/edit

I prefer cleaner more modular separation

laurenechan commented 3 years ago

Hi, thanks for all of the thoughts! I've made the request for ethylene-vinyl acetate here, and have replaced the SIO term with the ChEBI term polysaccharide, as the two biopolymers used in gel ink (the term in question) are tragacanth and xanthan gum, which are polysaccharides.

As for the 'soy ink' term, if we are ok trusting the wikipedia page it sounds like this actually might be a food grade soybean oil? The page states "To make soy ink, soybean oil is slightly refined and then blended with pigment, resins, and waxes. Even though soybean oil is an edible vegetable oil, soy ink is not edible or 100% biodegradable etc." If we don't want the dependency, I'm happy to request a new term for potentially not food grade soybean oil as well.

As for the other notes, this was intended to be exhaustive and does not include any "potential" components. For the remaining mixtures on this list, it sounds as though (other than ethylene-vinyl acetate) we are going to add to ENVO?

@matentzn @cmungall

laurenechan commented 3 years ago

A) I have weakened the rubber cement definition to subclass and removed any mentions of ChEBI roles.
B) Ethylene-vinyl acetate has been accepted by ChEBI.
C) I will be requesting 'soybean oil' from ChEBI as it seems to have quite a few similar oils (essential oil, castor oil etc.); this reduces the dependency of the mixtures to just ENVO and ChEBI terms (EDIT: term requested 12/1/2020).
D) I have changed the genus of the definitions from ChEBI: mixture to ENVO: Environmental Material.
E) For the remaining mixtures, the set of compounds exhaustively defines the mixture. Just to be clear, there are no 'potential' components.
F) Once we are in agreement that we can add the terms, I will use the template to generate OWL and then merge them into ENVO and delete the template.

kaiiam commented 3 years ago

@laurenechan I'm not in a position to approve this, but once points A) - E) are settled by the senior editors @pbuttigieg and @cmungall, you could leverage the ENVO-Robot-template-and-merge-workflow workflow I have created. The steps for ontology engineers section and the robot template strings in the example google sheet are relevant to you (i.e. how to merge a template into ENVO without creating a big mess, see https://github.com/EnvironmentOntology/envo/pull/1043).

matentzn commented 3 years ago

After talking to @cmungall about the modelling in this PR, I now understand my modelling error. The problem is that modelling mixtures as an intersection (a sequence of AND statements) of existential restrictions (has part some mixture part) is only half of the equation - what is missing is "closure". Before I explain this @laurenechan sorry that this very general discussion here is being had on your ticket - this has nothing to do with your terms, and is a general discussion on a pattern.

Right now, when I say:

# Modeling solution A:
coffee = mixture that has part some coffee beans and has part some water
tiramisu = mixture that has part some coffee beans and has part some water and has part some sugar

we would get that Tiramisu is a kind of coffee - which is obviously bogus - it just contains coffee (please don't judge my recipe above - it needs some work). So the correct way to model this would be:

# Modeling solution B:
coffee = mixture that has part some coffee beans and has part some water and has part only (coffee beans or water)
tiramisu = mixture that has part some coffee beans and has part some water and has part some sugar and has part only (coffee beans or water or sugar)

Now we get a bit into rococo land (@cmungall's way of saying: complex OWL that is easy to get wrong and unusable by external tools).
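The difference between Solutions A and B can be sketched in a few lines of Python. This is a toy set comparison, not a reasoner: it assumes the ingredient classes are unrelated to one another and deliberately ignores the transitivity of has part. The function names are made up for illustration.

```python
# Toy comparison of Solutions A and B; not a real OWL reasoner.
# Assumes ingredient classes are unrelated and ignores transitivity of has part.
mixtures = {
    "coffee": frozenset({"coffee beans", "water"}),
    "tiramisu": frozenset({"coffee beans", "water", "sugar"}),
}


def is_subclass_open(sub, sup):
    """Solution A: only 'has part some X' conjuncts, so a superset of parts suffices."""
    return mixtures[sup] <= mixtures[sub]


def is_subclass_closed(sub, sup):
    """Solution B: the 'has part only (...)' closure also forbids extra parts."""
    return mixtures[sup] <= mixtures[sub] and mixtures[sub] <= mixtures[sup]


print(is_subclass_open("tiramisu", "coffee"))    # True: the unwanted inference
print(is_subclass_closed("tiramisu", "coffee"))  # False: closure blocks it
```

With closure, two mixtures are only comparable when their ingredient lists coincide, which is why the sugar in tiramisu is enough to keep it out from under coffee.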

For completeness, I'll add modeling solutions C and D as well, so we have them all in the same comment:

# Modeling solution C:
coffee subclassof mixture that has part some coffee beans and has part some water
tiramisu subclassof mixture that has part some coffee beans and has part some water and has part some sugar
# Modeling solution D:
coffee subclassof mixture 
coffee subclassof has part some coffee beans 
coffee subclassof has part some water
tiramisu subclassof mixture 
tiramisu subclassof has part some coffee beans 
tiramisu subclassof has part some water 
tiramisu subclassof has part some sugar

Discussion

It's obvious that only solutions B and D matter to this discussion. I can't tell you what to do @kaiiam, but here is my preferred solution (MIX_ALT1):

  1. Use Solution B, but add a separate module to ENVO that asserts the subsumption derived from the mixtures. Here is an example of how we do this as part of DPO. This allows you to use ELK in your day-to-day business and only need HermiT (or Konclude, which is what we do) once when you create a release.
  2. Make sure that you run ROBOT relax on the OWL product after compiling the pattern; this gives you some version of D as well.

The alternative MIX_ALT2 is this:

  1. Go with Solution D, but add a dosdp pattern that states that if you are a mixture, your ingredient list can be considered exhaustive.
  2. You need to manually assert all subsumptions between mixtures.
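As a rough illustration of the Solution D shape, the following Python sketch expands an ingredient list into one subclass axiom per ingredient, using the same informal notation as the solutions above. The function name is hypothetical, and the output is plain text rather than OWL.

```python
def solution_d_axioms(name, ingredients):
    """Return Solution D style axioms: one subclass axiom per ingredient."""
    axioms = [f"{name} subclassof mixture"]
    axioms += [f"{name} subclassof has part some {part}" for part in ingredients]
    return axioms


# Ingredient list copied from the tiramisu example in this comment.
for line in solution_d_axioms("tiramisu", ["coffee beans", "water", "sugar"]):
    print(line)
```

Because nothing here is an equivalence axiom, no hierarchy between mixtures falls out of it, which is why step 2 of MIX_ALT2 asks for subsumptions to be asserted by hand.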

Let me know what you think. To alleviate some of the stress of @diatomsRcool and @laurenechan you can also just add the terms so they can consider the matter closed, while we still figure out how to do the modelling.

cmungall commented 3 years ago

solution B is not correct because it mixes universals and transitive predicates. Tiramisu and coffee both have atoms and quarks.

(I know you know this and just forgot... :-)

(aside: this kind of trap is why I always say: keep it simple. avoid non-EL constructs unless you're really sure they are necessary for reasoning)

One way to get around this is to invent a non-transitive subpredicate; RO has has_component for this purpose. But this is problematic: inference will be incomplete, since sometimes you want the transitivity.

Another way is to do this.

mixture that has part some coffee beans and has part some water and has part only (coffee beans or water or (part-of some coffee beans) or (part-of some water))

but this is complex; you'll never get classification from ELK
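The transitivity problem can be made concrete with a small Python sketch. The part graph below is invented for illustration and blurs the class/instance distinction; it only shows that once has part is followed transitively, a cup of coffee has parts that fall outside the filler of "has part only (coffee beans or water)".

```python
# Toy part graph; the entities and their parts are illustrative only.
direct_parts = {
    "coffee": {"coffee beans", "water"},
    "coffee beans": {"atom"},
    "water": {"atom"},
    "atom": {"quark"},
}


def all_parts(whole):
    """Every direct and indirect part, treating has part as transitive."""
    seen = set()
    stack = list(direct_parts.get(whole, ()))
    while stack:
        part = stack.pop()
        if part not in seen:
            seen.add(part)
            stack.extend(direct_parts.get(part, ()))
    return seen


allowed = {"coffee beans", "water"}  # filler of 'has part only (...)'
violations = all_parts("coffee") - allowed
print(sorted(violations))
```

Any non-empty result means the closure axiom in Solution B cannot hold as written, which is the reason the longer expression above adds the part-of disjuncts.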

matentzn commented 3 years ago

Ahhhhh!!! How embarrassing!!!

That was a long ticket and one that illustrates nicely the problem of complicated modelling.

OK, I guess I suggest then to go for MIX_ALT2 solution.

If no one disagrees, I will finalise the template with Lauren, and @kaiiam can merge it in. Thanks for all the feedback!

pbuttigieg commented 2 years ago

Was the outcome of this documented?