semanticarts / gist

Semantic Arts gist upper enterprise ontology
Creative Commons Attribution 4.0 International

The hasX/hasDirectX pattern is problematic #115

Closed uscholdm closed 3 years ago

uscholdm commented 4 years ago

The hasX/hasDirectX pattern is problematic in that hasDirectX often reflects a modeling choice about granularity that has nothing to do with the world. A different modeling choice can make a prior assertion untrue. For example:

ORIGINAL GRANULARITY:

REVISED GRANULARITY:

Suddenly WashingtonState hasDirectPart Seattle is no longer true, but the world did not change, only the choice of granularity. We normally think of hasX being inferred from hasDirectX assertions. That may be backwards. Maybe we need to infer hasDirectX assertions from the fact that there are no intermediary assertions.

So, from the following

We can infer:

I’m not suggesting we do this, I'm just illustrating the problem.

At the moment I have not worked out a good proposal that addresses the above concern and still retains the intended benefits of the original pattern. Does anyone have experience that demonstrates the intended benefits have ever become real? If not, we might want to get rid of the hasDirectX / hasX pattern and just use a single hasX transitive property.
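
The granularity problem described above can be sketched in a few lines of Python. This is only an illustration, not part of gist: the function names are made up, the edges are plain `(whole, part)` tuples, and `KingCounty` is a hypothetical intermediate level chosen for the example. "Direct" is derived here as "no intermediate assertion exists", which is exactly the reading that breaks when the granularity changes.

```python
from itertools import product

def transitive_closure(edges):
    """Return every (whole, part) pair reachable through one or more hasPart links."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

def direct_parts(edges):
    """Keep only the pairs that have no intermediate node between whole and part."""
    closure = transitive_closure(edges)
    nodes = {n for edge in closure for n in edge}
    return {
        (a, b) for (a, b) in closure
        if not any((a, m) in closure and (m, b) in closure for m in nodes)
    }

original = {("WashingtonState", "Seattle")}
# KingCounty is a hypothetical intermediate level, added only for illustration.
revised = {("WashingtonState", "KingCounty"), ("KingCounty", "Seattle")}

print(("WashingtonState", "Seattle") in direct_parts(original))  # True
print(("WashingtonState", "Seattle") in direct_parts(revised))   # False
```

The world is the same in both cases; only the set of asserted levels differs, yet the derived "direct" relationship flips.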

DanCarey404 commented 4 years ago

I have been thinking along the same lines. The hasDirectX pattern is one of those things that makes sense semantically, but which is proving less useful when put up against day-to-day data. That is proving even more true when integrating data from multiple sources and/or data that has to deal with different granularities. I have client instances where they need/want to jump straight from city to multi-country region. Using the :partOf property is semantically not wrong in that case.

tedhills commented 4 years ago

I think our recent discussions at the Gist Council have addressed this. The DE-9IM topological model teaches us that geo-regions that overlap do not nest in the way that containers can nest, which is exclusively. (Something cannot be in two containers directly at the same time.)

Thus, I believe that the resolution of this issue is to refrain from using :hasPart and :hasDirectPart to represent things that cannot nest exclusively. I think :hasPart and :hasDirectPart retain their usefulness in other contexts.
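
One way to picture "nesting exclusively" is as a check that no part has more than one direct whole. The sketch below is an assumption-laden illustration, not a gist feature: the function name and the container names (`Crate`, `Box`, `Envelope`) are hypothetical, and the region edges show what a modeler would be forced to assert if Yellowstone, which overlaps Montana, Wyoming, and Idaho, were squeezed into a direct-part hierarchy.

```python
from collections import defaultdict

def exclusivity_violations(direct_edges):
    """Map each part to its direct wholes whenever it has more than one."""
    wholes = defaultdict(set)
    for whole, part in direct_edges:
        wholes[part].add(whole)
    return {part: ws for part, ws in wholes.items() if len(ws) > 1}

# Hypothetical containers: each item sits directly in exactly one other item.
containers = {("Crate", "Box"), ("Box", "Envelope")}

# Overlapping regions forced into a direct-part model (illustration only).
regions = {
    ("Montana", "Yellowstone"),
    ("Wyoming", "Yellowstone"),
    ("Idaho", "Yellowstone"),
}

print(exclusivity_violations(containers))  # {}
print(exclusivity_violations(regions))     # Yellowstone maps to all three states
```

Containers pass the check; overlapping regions do not, which is the sense in which they "cannot nest exclusively".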

tedhills commented 4 years ago

Also, we are going to have a heap big pow-wow on "part of" at the January Gist Council, and I trust we will see that having a single :hasPart relationship is quite inadequate for describing the world. We need a set of relationships that map to the English phrase "part of" but that are ontologically distinct.

rjyounes commented 4 years ago

I'm in favor of at least removing hasDirectPart. This is also discussed in issue #102. Ditto for geoDirectlyContains and its inverse.

It makes sense for taxonomies - see issue #107.

directlyPrecedes/precededBy can be useful for ranking items in a collection. If it doesn't apply, you don't have to use it.

johnwcowan commented 4 years ago

Here's an example of the problems with hasDirectPart that is not mixed up with georegion questions, and is taxonomic. What are the parts of a motorcycle? (I'm drawing here on Robert Pirsig's Zen and the Art of Motorcycle Maintenance.)

Well, the list is long, but some of the parts are an alternator, a rectifier, a battery, a high-voltage coil, and spark plugs. Are they direct parts? Well, Pirsig's analysis says that they are parts of the ignition system, which is a part of the motorcycle. You can't order an ignition system from a parts dealer, but if you want to understand the motorcycle's structure, Pirsig tells us, you need to see that the ignition system is part of the engine, which is part of the power assembly, which is part of the motorcycle.

But then again, you might decide that the assembly taxon (consisting of the power and running assemblies) is too abstract for your purposes, and skip directly from the engine to the motorcycle. And every time you make a change like this you have to rearrange the part and directPart relationships. The same is true in biology: it's not uncommon for new and optional taxons to be inserted between the standard taxons.
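
To make the rearrangement cost concrete, here is a small Python sketch of what skipping a level does to the direct-part assertions. The helper name `collapse_level` is hypothetical, and the hierarchy is a simplified reading of the motorcycle example above, stored as `(whole, part)` tuples.

```python
def collapse_level(direct_edges, removed):
    """Drop `removed` from a direct-part hierarchy, reattaching its parts to its wholes."""
    wholes = [w for (w, p) in direct_edges if p == removed]
    parts = [p for (w, p) in direct_edges if w == removed]
    kept = {(w, p) for (w, p) in direct_edges if removed not in (w, p)}
    return kept | {(w, p) for w in wholes for p in parts}

motorcycle = {
    ("Motorcycle", "PowerAssembly"),
    ("PowerAssembly", "Engine"),
    ("Engine", "IgnitionSystem"),
    ("IgnitionSystem", "SparkPlug"),
}

# Decide the assembly level is too abstract and skip it.
simplified = collapse_level(motorcycle, "PowerAssembly")

print(("Motorcycle", "Engine") in simplified)      # True
print(("PowerAssembly", "Engine") in simplified)   # False
```

Every such change retracts some direct-part assertions and introduces others, whereas a plain transitive part relation between the remaining nodes would be unaffected.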

"The first thing you learn in a lawin' family is that there ain't no definite answers to anything." –Calpurnia, To Kill A Mockingbird

tedhills commented 4 years ago

I disagree with the rationale expressed above for eliminating :hasDirectPart. The rationale seems to boil down to this: Since some :hasPart relationships are debatably :hasDirectPart and some are not, :hasDirectPart is meaningless or useless. I think this is untrue.

I think the following has been established:

  1. Geographic relationships--that is, relationships between regions of the surface of the earth--are topological and not compositional. In other words, a US state is not composed of counties, which are in turn composed of cities and towns. Rather, the territory of a city lies within a state and also lies within a county. The territory of Yellowstone National Park overlaps the territory of three states (Montana, Wyoming, Idaho). There is, in general, no compositional relationship between geographic regions.

  2. A :hasPart or :hasDirectPart relationship between any two given entities is not necessarily universally true. The relationship may need qualification by context.

For example, I think it's unarguable that a spark plug must be threaded directly into an engine, thereby helping to complete the engine as an assembly. On the other hand, although I must introduce fuel directly into the vehicle's fuel tank, I would never say that the vehicle :hasPart fuel (or :hasDirectPart fuel).

So are you saying that it is no longer allowed for me to say, engine :hasDirectPart sparkPlug, just because I can't say vehicle :hasDirectPart fuel?

In the physical world, there are many :hasDirectPart relationships:

I really think that what is needed is a richer set of :hasPart relationships, rather than disallowing one such (:hasDirectPart) because it is not universally applicable. It is not supposed to be universally applicable, but that does not make it meaningless or useless.

tedhills commented 4 years ago

Here are some additional part-of relationships that I think an upper ontology could benefit from:

Physical composition nests. For example, a spark plug is aggregated from a metal shell, a metal electrode, and a ceramic insulator. The ceramic is a blended composite. The spark plug may be assembled into an engine.

Non-physical composition:

Physical, but not compositional:

uscholdm commented 4 years ago

@tedhills makes a good case for keeping hasDirectPart. There are cases when it does reflect truth in the world, as opposed to a somewhat arbitrary modeling decision about granularity that might change. We can discourage its use in those situations, but frankly, anyone who finds it very handy will do it anyway.

uscholdm commented 4 years ago

@tedhills suggested to add:

is-composed-of: physical composition

gist:madeUpOf is for that.

rjyounes commented 4 years ago

I agree with @uscholdm and @tedhills about hasDirectPart. It is useful and valid in some, probably limited, circumstances, and we can make this clear in the annotations, but the ontology is not responsible for bad implementations of it.

marksem commented 4 years ago

What actions are needed to close this? It seems we agree hasDirectPart can remain, but its usage with places is discouraged. Perhaps we can close this issue by adding guidance in a skos:scopeNote that says "hasDirectPart and its inverse should NOT be used between things that are gist:Places"

johnwcowan commented 4 years ago

It shouldn't be used for any concreta at all, because there is always more than one way to divide a physical object into parts, hence mereology. See my post on Zen and the Art of Motorcycle Maintenance. It should be confined to abstract hierarchies where partOf is the transitive closure of directPartOf.

marksem commented 4 years ago

:D so is this still open because we have not come to an agreement? :D

uscholdm commented 4 years ago

Yes. However, I think that a consensus was emerging that although there is a lot of misuse of the pattern, there are cases where it is legit, so let's not throw the baby out with the bathwater.

johnwcowan commented 4 years ago

Right. So this is now a question of what English-language warnings we should add to the description: it doesn't affect the machine-intelligible parts of Gist at all.

uscholdm commented 4 years ago

@johnwcowan

So this is now a question of what English-language warnings we should add to the description:

Start with the original statement of the issue which describes the problem, then give examples of legit usage. And make a recommendation for those who wish to 'misuse' the property anyway.

tedhills commented 4 years ago

John, I couldn't find your "Zen and . . ." blog, though I tried. Perhaps you could send me a link to it. Lacking that, I will still respond that, although yes, in general, there are many ways to divide something into parts, I think it remains that some parts can be put together directly; that is, without anything mediating their connection. I see that as the meaning of :hasDirectPart.

Here is my suggestion for defining :hasDirectPart.

"The subject is composed in part of the object, where nothing mediates the composition.

Both the subject and the object exist independently of each other.

Here are some examples:
-- an automobile engine :hasDirectPart a spark plug
-- a keyboard :hasDirectPart a key

Relationships between topological regions--for example, regions of the Earth's surface--are not, in general, compositional. Such relationships are better described with "overlaps", "contains", "borders", etc. See the standard DE-9IM."

rjyounes commented 4 years ago

Since we are moving towards skos annotations, I would break this up into skos:definition, skos:example, and skos:scopeNote; moving forward we will not lump everything into the definition. Additional comments:

At this point our process defines the way forward: we will triage it in our review sessions and, if action is required, assign to a team member to make one or more concrete proposals. We do not expect to come to agreement in the comment thread. If we decide to act on this, we may want to break it up into separate issues for each predicate, as in #328, since we may not have a blanket decision affecting all such predicates.

Regarding subtasks, it's worth considering modeling them as an OrderedCollection - see #112. This might be true of other cases as well.

sa-bpelakh commented 4 years ago

Decision made by @sa-bpelakh @marksem @uscholdm @ungricht - retain direct/hasDirect (non-geo) in place, add guidelines for usage (asserted vs not, interaction with inference) to definition of each of these properties.

rjyounes commented 4 years ago

DECISION: Add annotations as noted above

rjyounes commented 3 years ago

Added this to February project since an implementation decision was made several months ago.

rjyounes commented 3 years ago

I would like to pass this on to one of @sa-bpelakh @marksem @uscholdm @ungricht since they were present at the time the decision was made. Please add yourself as assignee if you are willing. I believe I was assigned to this issue at an earlier stage and should not have been assigned the implementation.

uscholdm commented 3 years ago

DECISION: Add annotations as noted above

@rjyounes I'm having a hard time discovering what 'as noted above' means exactly, and I don't clearly recall the decision being made. Does anyone remember the decision?

rjyounes commented 3 years ago

@uscholdm The decision was made on July 23, 2020:

retain direct/hasDirect (non-geo) in place, add guidelines for usage (asserted vs not, interaction with inference) to definition of each of these properties.

I think if you cull through the suggestions above you will be able to piece together some appropriate annotations, though I see that there was a lot of discussion of how the annotations should read and no final conclusions indicated. If you are not certain, it might be best to write up your suggestions and bring them back to the group at the next meeting for approval.

uscholdm commented 3 years ago

PROPOSAL:

rjyounes commented 3 years ago

@uscholdm This proposal looks good to me.

tedhills commented 3 years ago

On Apr 15, 2021, at 9:00 PM, Michael Uschold wrote:

@ted said:

I think our recent discussions at the Gist Council have addressed this. The DE-9IM topological model teaches us that geo-regions that overlap do not nest in the way that containers can nest, which is exclusively. (Something cannot be in two containers directly at the same time.)

Thus, I believe that the resolution of this issue is to refrain from using :hasPart and :hasDirectPart to represent things that cannot nest exclusively. I think :hasPart and :hasDirectPart retain their usefulness in other contexts.

@tedhills can you translate this into every day language for a typical OWL modeler? I'm having a hard time working out what you mean. Not sure exactly how nesting relates to parthood, but they don't seem the same to me. Maybe just don't use the term 'nesting' and stick to parthood?

@uscholdm Here's my reply: Nesting is relevant because :hasPart allows nesting–that is, transitive parthood–while :hasDirectPart does not. For example:

:Engine :hasDirectPart :SparkPlug
:Automobile :hasDirectPart :Engine

but then

:Automobile :hasPart :SparkPlug  # not :hasDirectPart

In other words, :hasPart is transitive and accommodates nesting, while :hasDirectPart does not.
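
A rough Python sketch of that inference, for readers who want to see it run. It assumes, as the pattern is usually described, that :hasDirectPart is a subproperty of :hasPart and that :hasPart is transitive; the function name is hypothetical and triples are reduced to `(whole, part)` tuples.

```python
def infer_has_part(has_direct_part):
    """Materialize hasPart from hasDirectPart assertions.

    Assumes hasDirectPart is a subproperty of hasPart and hasPart is transitive.
    """
    has_part = set(has_direct_part)   # subproperty: every direct part is a part
    frontier = set(has_direct_part)
    while frontier:
        # Extend each newly found pair by one more hasPart step.
        new = {
            (a, d)
            for (a, b) in frontier
            for (c, d) in has_part
            if b == c
        } - has_part
        has_part |= new
        frontier = new
    return has_part

has_direct_part = {("Engine", "SparkPlug"), ("Automobile", "Engine")}
has_part = infer_has_part(has_direct_part)

print(("Automobile", "SparkPlug") in has_part)         # True
print(("Automobile", "SparkPlug") in has_direct_part)  # False
```

The inferred pair shows up only under hasPart; nothing is ever added to the hasDirectPart assertions.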

Parthood is compositional. By this I mean that if you remove the part you make that from which you removed it not whole. An automobile without an engine is not whole. An engine without a spark plug is not whole. An automobile without a spark plug is also not whole, because parthood is transitive.

Here's another example of a use of :hasDirectPart. Columbia University divides itself into many colleges. Each college has its own faculty. A professor can only be part of a college faculty. There's no notion of a professor being a part of the university faculty, except indirectly.

:University :hasDirectPart :College
:College :hasDirectPart :Faculty
:Faculty :hasMember :Professor
:University :hasPart :Faculty  # not :hasDirectPart

In contrast, :geoDirectlyContains/:geoDirectlyContainedIn are illegitimate and should be eliminated. That's because, as we learned from DE-9IM, a relationship between overlapping regions should never be seen as transitive or intransitive. There cannot be an idea of direct versus indirect containment with geographic regions. Yes, we can say that the region of a city is contained in the region of a county, which is contained in the region of a state, but that cannot lead to a conclusion that the region of the city is indirectly contained in the region of the state. It is still just contained in it.

Above I've given an example of the use of :hasDirectPart with mechanical assembly. I've also explained how geographical relationships are not parthood relationships. We commonly use words like "part of" to describe relationships between geographical regions, but I think these ontological relationships as defined are more precise than the natural language we use to express them. Language guides ontology but does not dictate it. The goal of the ontology is to represent semantic truths in formal logic, which will often not line up well with our natural language.

Here's another example of the use of :hasDirectPart, although this example also requires an updated definition of :Country.

:Country :hasDirectPart exactly 1 :GovernmentOrganization
:Country :hasDirectPart some :GeoRegion

This example dictates that a country has some direct parts and can't have them indirectly. For example, if there were some things (say, national parks or office buildings) that were part of a :Country, and they had :Regions, that would not be enough to define the :Country, because they would not be part of the :Country directly. The proper understanding of a :Country is that it consists of exactly one government that has control over some region(s). There might be other entities having regions that are contained within that country's region(s) but that is not enough to establish the country as a country. For example, the US is not just a collection of states. It is a single federal government together with the regions it controls, directly. The regions of the states (and some territories, too) are geo-contained within the regions of the US, but that is not a part-whole relationship.

I hope this helps.

uscholdm commented 3 years ago

Those are additional good examples. I proposed a solution for this issue, see if you agree.

rjyounes commented 3 years ago

We should acknowledge that whether or not hasDirectPart is valid is in some instances use-case dependent. For example, you could say a room is a direct part of a building. Or you could say it is a direct part of a floor which is a direct part of a building. Or you could say it is a direct part of a wing which is a direct part of a floor which is a direct part of a building. Or you could decide not to use hasDirectPart because it's too complex for your use cases. In a context where I am only modeling rooms and buildings, hasDirectPart would have a valid use. In other contexts it wouldn't. Our decision was to allow this pattern except for geo containment, and provide guidance to users about when it might or might not be appropriate, aside from the direct geo containment predicate, which we removed. (But is building/floor/wing/room geo containment or not?)

Similarly with the university example: one may also be considering departments, in which case faculty are direct members of departments rather than colleges. This gets really tricky and again we need to leave it up to users, because it's nearly impossible to decide what's a direct part of something else on a universal basis. If nothing else you might forget a layer, as you did with the university example. Users can write SHACL to control how hasDirectPart is used.

marksem commented 3 years ago

Many seem to want to distinguish hasDirectPart from hasPart for transitivity reasons. But it is interesting to note that 1) we rarely run reasoners on our instance data anyway (at least with my clients), and 2) the query languages now allow ease of querying transitively or not (SPARQL's hasPart+ vs. hasPart).

tedhills commented 3 years ago

I'm a bit confused. Since you can always run a property path SPARQL query (?s gist:hasPart+ ?o), the query can always figure out whether a :hasPart relationship is direct or indirect. But then, what is the point in OWL offering the transitive characteristic on an object property? If an object property is transitive, that is saying that ?s :property ?o and ?s :property+ ?o mean the same thing. A query can't deduce that meaning unless it queries for the transitive characteristic on the :property, which would require the property definition to be loaded in the triple store. So, I think the only reason for distinguishing between :hasPart and :hasDirectPart is to support inferencing, where something is inferred from :hasDirectPart that is not inferred from :hasPart. So I think the distinction is still important.

uscholdm commented 3 years ago

Since you can always run a property path SPARQL query (?s gist:hasPart+ ?o), the query can always figure out whether a :hasPart relationship is direct or indirect.

It can determine whether there happens to be or happens not to be an intermediate link. They are different. Ted's spark plug example highlights when a relationship is necessarily/semantically a direct one. Otherwise it's just happenstantial, or a granularity decision that can change whenever.

sa-bpelakh commented 3 years ago

I'm a bit confused. Since you can always run a property path SPARQL query (?s gist:hasPart+ ?o), the query can always figure out whether a :hasPart relationship is direct or indirect. But then, what is the point in OWL offering the transitive characteristic on an object property? If an object property is transitive, that is saying that ?s :property ?o and ?s :property+ ?o mean the same thing. A query can't deduce that meaning unless it queries for the transitive characteristic on the :property, which would require the property definition to be loaded in the triple store. So, I think the only reason for distinguishing between :hasPart and :hasDirectPart is to support inferencing, where something is inferred from :hasDirectPart that is not inferred from :hasPart. So I think the distinction is still important.

@uscholdm is correct: without a dedicated property to indicate a direct link, the only way to locate the direct links (e.g. in order to build a hierarchy) is a query like this:

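# Namespace IRI is an assumption; use the one that matches your gist release.
PREFIX gist: <https://ontologies.semanticarts.com/gist/>

# Direct links: hasPart pairs with no intermediate node between them.
SELECT ?child ?directParent WHERE {
  ?directParent gist:hasPart ?child .
  FILTER NOT EXISTS {
    ?directParent gist:hasPart ?intermediate .
    ?intermediate gist:hasPart ?child .
  }
}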

which is quite cumbersome.

rjyounes commented 3 years ago

There are two ways you might get a direct link between two objects: (1) using hasDirectX rather than hasX, and (2) a direct link using hasX where there happens to be no intermediate node. These are two different cases - and therefore, as @sa-bpelakh says, there is a semantic difference. Eliminating hasDirectX conflates the two.
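
The difference between the two cases can be sketched in Python. This is an illustration under stated assumptions, not gist code: `no_intermediate` is a hypothetical helper that mirrors the "no intermediate node" test, and the building/room/floor edges are made-up `(whole, part)` tuples.

```python
def no_intermediate(has_part, whole, part):
    """True when `whole hasPart part` is asserted and no asserted node sits between them."""
    nodes = {n for edge in has_part for n in edge}
    return (whole, part) in has_part and not any(
        (whole, m) in has_part and (m, part) in has_part for m in nodes
    )

# Case 1: directness asserted on purpose with a dedicated property.
has_direct_part = {("Engine", "SparkPlug")}
# Case 2: plain hasPart links that merely happen to have nothing in between.
has_part = {("Engine", "SparkPlug"), ("Building", "Room")}

print(no_intermediate(has_part, "Building", "Room"))  # True

# A later modeling change introduces floors.
has_part |= {("Building", "Floor"), ("Floor", "Room")}

print(no_intermediate(has_part, "Building", "Room"))  # False
print(("Engine", "SparkPlug") in has_direct_part)     # True
```

The "direct by absence" answer changes as soon as a new level is asserted, while the explicit hasDirectPart assertion is untouched. With only hasX in the data, the two cases cannot be told apart.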

I'm also not in favor of using SPARQL as a substitute for semantics in the ontology.

uscholdm commented 3 years ago

Implemented Fix:

Updated annotations for the following properties regarding the hasDirectX/hasX pattern.