Closed stuckyb closed 1 year ago
Thanks for that, @ramonawalls. I have four comments on this topic.
First, for this discussion, it is important to remember that the methodology we are using to go from data about a 'portion of a plant' to data about a 'whole plant' is outside the scope of implementable description logics supported by OWL and associated reasoners. So arguments about logical entailments in OWL are not necessarily the main consideration here.
Second, to the immediate question of whether we can use the existing 'part of' in RO, let me be sure I understand the argument here. The idea is that instead of having a property that explicitly states proper parthood, we get this implicitly by asserting that a 'portion of a plant' is a 'whole plant' minus something, and then, when presented with a statement asserting that some 'portion of a plant' is 'part of' some 'whole plant', we conclude we have a case of proper parthood because a 'portion of a plant' cannot be a 'whole plant'. If we are certain that the last bit always holds (a 'portion of a plant' can never be a 'whole plant'), then I think the logic works. However, here are two reasons to prefer the model with a property that explicitly expresses proper parthood.
'whole plant' (e.g., a single existential quantification axiom). With the alternative approach (i.e., reflexive 'part of'), all downstream axioms and rules require both an existential quantification axiom and an intersection axiom. This might sound like a trivial difference, but as you know, I spent a lot of effort optimizing reasoning times with the PPO (including custom axiom manipulations in OntoPilot) to make large-volume reasoning possible. The complexity of a logical model matters a lot, and in my experience, small differences in axiom complexity can make huge differences in computing time. I don't know how much of an impact the difference discussed here would make without testing it, but it should at least be a consideration.

'portion of a plant'. With an explicit "proper parthood" relation, we can give a concise logical definition of 'portion of a plant': 'plant structure' AND 'is or was part of' SOME 'whole plant'. We lose that with a reflexive 'part of'.

Third, we should keep in mind that, given our current data model, 'part of' doesn't give the temporal coverage we need. An herbarium specimen is no longer 'part of' a plant. Now, I think we could address this by making the data model more complicated. E.g., phenology data are about an 'herbarium specimen' (or whatever) that is derived from a 'portion of a plant' that is 'part of' some 'whole plant'. Similarly, for photographs, the data are about a 'photograph' that is an image of a 'portion of a plant' that is 'part of' some 'whole plant'. I think that is all logically sound and it eliminates the need for 'was' in the property, but at the cost of substantially increased model complexity, which usually has computational downsides (see above).
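To make the difference between the two parthood readings concrete, here is a minimal Python sketch. It is only a toy: the class and function names are hypothetical, not PPO or RO identifiers, and plain Python lookups are closed-world, so this does not reproduce OWL reasoning. It only illustrates why a reflexive 'part of' needs an additional condition before it can stand in for proper parthood.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    """Hypothetical stand-in for an individual in the data model."""
    name: str
    kind: str  # e.g. "whole plant" or "portion of a plant"


def part_of(part, whole, asserted):
    """Reflexive parthood: every structure counts as part of itself."""
    return part == whole or (part, whole) in asserted


def proper_part_of(part, whole, asserted):
    """Proper parthood: parthood plus an explicit non-identity condition."""
    return part != whole and part_of(part, whole, asserted)


# Illustrative individuals and a single asserted parthood statement.
tree = Structure("tree 1", "whole plant")
branch = Structure("branch 1", "portion of a plant")
asserted = {(branch, tree)}
```

Under the reflexive reading, `part_of(tree, tree, asserted)` holds, so a definition built on it alone would also admit the whole plant itself. The extra `part != whole` check plays roughly the role of the additional intersection axiom mentioned above.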
Fourth, I've been thinking about an extension to our logical model that might be another way to address some of these concerns. What if we added a way to record the proportion of a 'whole plant' represented by a 'portion of a plant'? E.g., this 'portion of a plant' is 50% of the 'whole plant', but this one is only 10%. For herbarium specimens, that might not be so relevant, but for photographs it definitely could be. E.g., a single photograph of a tree documents ~50% of the 'whole plant'. This information could ultimately be used to help clarify 'portion of plant' / 'whole plant' distinctions, and it could also be useful for attaching confidence scores to data generated for a 'whole plant' from an observation of a 'portion of a plant'. For instance, if an image of a tree does not show any flowers, we can be certain that the 'portion of a plant' has no flowers, but with our current inferential model, we can't say anything about whether the 'whole plant' has flowers. With proportion information, we could reasonably assert that we are 50% confident (or whatever) that the entire tree has no flowers. This last point might go into a separate issue if it is something we'd like to pursue.
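As a rough illustration of the proportion idea, the sketch below maps the observed proportion of a plant directly to a confidence that a trait absent from the observed portion is absent from the whole plant. The function name and the one-to-one mapping are assumptions made for illustration; the comment above only suggests "50% confident (or whatever)", so a real scoring rule could look quite different.

```python
def absence_confidence(proportion_observed: float) -> float:
    """Confidence that a trait absent from the observed portion is also
    absent from the whole plant.

    Assumption: confidence simply equals the proportion of the whole
    plant that was observed. This is the simplest possible rule, not an
    agreed-upon model.
    """
    if not 0.0 <= proportion_observed <= 1.0:
        raise ValueError("proportion_observed must be between 0 and 1")
    return proportion_observed
```

For the tree photograph example, `absence_confidence(0.5)` gives 0.5, while a portion covering 10% of the plant would give 0.1.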
Thanks for putting this here, @stuckyb. I will respond to your four points. Please bear in mind that I am in large part conveying the arguments of others, and until I have done some playing with this in Protege and the pipeline, I am not certain what the best solution is. Also remember that the people giving advice are less familiar with our project than we are, but have done a ton of similar work.
That was part of their point: we can't fully implement this in OWL, so they didn't see any value in creating a 'proper part of' relation.
Your assessment of the suggested logic is correct. Again, bear in mind it was just a suggestion - no guarantee it will work the way we want it or be easy to implement.
Regarding your concerns, the suggestion was not to use a reflexive 'part of', but to use the current RO 'part of', which is neither reflexive nor irreflexive (so when reasoning, it won't throw an error for either type of instance). While I fully appreciate the requirement to keep axiomatization as minimal as possible, the argument was that using a proper_part_of relation would not actually work. We aren't interested in all proper parts of a whole plant, but only in parts of a plant in which a significant portion of the plant is missing. I think for most of our traits, the minimal part that would need to be missing for us to not be able to infer absence on the whole plant is a shoot system (a shoot system includes branches, flowers, and buds). Maybe it would need to be missing only a leaf, for leaf traits. We might create a relation that is called proper_part_of and define it to mean what I just described, but I was pretty convinced by Chris that a true 'proper part of' relation is meaningless in many cases. That said, I can imagine that there might be a way to make it work.
Along these lines, I'm not sure why we need to have a single relation that covers both cases, since we would normally know if something is part of or was part of. I guess it makes the ingest pipeline easier to not have to deal with two separate types of data, but I'm not sure how much.
On the other hand, I think this corresponds in part to the definition discussed in point 2, which was that we define portion of plant based on what is missing from the plant, rather than just saying anything is missing.
Thanks, Ramona. Interesting points for sure. I am thoroughly convinced of at least one thing -- there are no obviously correct answers here.
One quick comment, though -- as I see it, the existing RO 'part of' is reflexive. The definition clearly says so. Assuming the definition means what it says, then the absence of a corresponding logical axiom is a bug, not a feature, and we probably ought to treat it as such.
A point of clarification here regarding using 'is or was part of' for connecting 'portion of a plant' and 'whole plant'.
The reasoner will not be able to infer traits of 'whole plant' based on traits of 'portion of a plant' when we are using, for example, 'lower count' on 'portion of a plant'. For example, if I assert an 'upper count' for a part of a plant, we would NOT be able to infer an 'upper count' for the whole plant. However, I would reasonably expect to say the 'lower count' of a 'portion of plant' should be at least the 'lower count' of the whole plant, but not necessarily the same number. Anyway, I just wanted to add this comment to this thread to clarify what the boundaries are here.
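One way to picture the asymmetry between the two bounds is the small Python sketch below. It assumes that every structure counted on a portion also belongs to the whole plant, so a lower bound observed on the portion still holds for the whole plant, while an upper bound does not carry over because the unobserved remainder may bear more structures. The function name and return shape are hypothetical, and this is one reading of the comment above, not what the reasoner actually does.

```python
from typing import Optional, Tuple


def whole_plant_bounds(
    portion_lower: int, portion_upper: Optional[int]
) -> Tuple[int, Optional[int]]:
    """Derive count bounds for a whole plant from bounds on one portion.

    Assumption: structures counted on the portion are also structures of
    the whole plant. The portion's upper bound is deliberately ignored,
    because the rest of the plant is unobserved.
    """
    inferred_lower = portion_lower  # still a valid lower bound for the whole
    inferred_upper = None           # unknown: no upper bound can be inferred
    return inferred_lower, inferred_upper
```

For instance, a portion scored with a 'lower count' of 3 and an 'upper count' of 5 yields `(3, None)` for the whole plant: at least 3, with no known maximum.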
We discussed this at our workshop, and we don't see any need to change it at this time.
See also the decision about scoring parts of plants in issue #68.
I'm creating this issue to document ongoing discussions about how we connect a 'portion of a plant' to a 'whole plant' and the logical inferences implied by that connection.
Here is the most recent email on this topic, from @ramonawalls: