Part role? - Githubissues

CommonCoreOntology / CommonCoreOntologies

The Common Core Ontology Repository holds the current released version of the Common Core Ontology suite.

BSD 3-Clause "New" or "Revised" License


Part role? #116

Closed alanruttenberg closed 2 weeks ago

alanruttenberg commented 3 years ago

In #114 @mark-jensen mentions the term part role. That sounds kind of fishy. I read the definition and I don't see why it would be considered a role. I had a look at the definition source and also didn't see an indication that this should be considered a role.

"A Role that inheres in an entity in virtue of it being part of some other entity without being subject to further subdivision or disassembly without destruction of its designated use"

"designated use" suggests there's a role in the vicinity but, as we're talking about hardware, it seems more like a function (something designed) that is important.

What seems to be involved:

A specific dependency of a function or role of the system on a function of the part in question.
A property of the part that any disassembly (even, for instance, losing one of several fastening screws) results in loss of function.

Actually, the specific dependence might be wrong if we are envisioning scenarios in which a part can be replaced. In that case the condition is more like the system has a part with a certain type of function, and the dependence is that there is some part that bears an instance of the function. This is a kind of generic dependence but not the same kind as GDC dependence.

I'm not clear on what the subdivision condition means. As a test, I'll take as an example something like an image sensor which has a tiny microcontroller integrated into the same silicon. Then a replacement with subdivision could mean something like replacing it with a discrete sensor and microcontroller (the two subdivisions). But in that case, since function is preserved it's hard to see why that would be a problem.

There's also the issue of it being a potentially confusing name. The BFO2 reference says:

Note that the first clause in the above [definition] ensures that parts of wholes (for example your heart, which is a part of you) do not s-depend on the wholes of which they are parts.

The requirement that the domain of s-depends is a specifically dependent continuant rules both of those out, but the fact that Barry thought to address this anyways suggests the potential for confusion.

I may be completely off base in which case you can relabel this issue as the term needing better documentation.

swartik commented 3 years ago

I'd like to hear the perspective of those involved in developing the PartRole class. @rorudn helped me use it to solve a problem I had expressing that, during cataract surgery, a lens I was born with ceased being part of me and the artificial lens I had implanted became part of me.

In my situation, I needed PartRole because OWL cannot directly express that a relationship applies to a specified temporal interval. Was this the motivation for introducing Part?

It seems to me that using PartRole like this is an attempt to force everything within the BFO paradigm, where everything must be either a continuant or an occurrent. That is justifiable in FOL. I'm not so sure it's justifiable in OWL, which as we all know isn't as expressive as FOL. I tried alternate representations and decided I was happier reifying statements, even if I had to use some classes outside BFO. Why compensate for OWL's shortcomings using an awkward representation?

This is just my opinion and, as I said, I'd like to have CCO experts weigh in. Are there reasons to use PartRole aside from introducing temporal qualifications? Did you consider other ways to express temporal part-of relationships, and if so, what were they? Do description logics allow interesting reasoning possibilities using PartRole individuals?

alanruttenberg commented 3 years ago

@swartik, in BFO-2020 there are temporalized relations of the form rel-at-some-time, rel-at-all-times. In your case you could use continuant-part-of-at-some-time to relate each of those parts to the whole.

Would you mind sharing your representation using PartRole? I don't have a picture of how using the role helps you solve this problem. I'm also curious if you have a way to temporally order the parts, communicating that the natural lens was part earlier than the artificial lens.

There is another way in BFO to add time to the equation, which is by using histories. A history is a process that is 1:1 to the material entity it is the history of. You can think of it as the life of a material entity. Using histories you can say something like: the history of the lens is (occurrent) part of the history of the person, and an analogous assertion for the artificial lens. You can then relate the temporal intervals the processes occupy with a precedes relationship. You might be able to use property chains to have the assertion of the history part lead to an inference involving part-of-at-some-time on the continuant side.
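As a rough illustration of the history idea, the Python sketch below treats each history as a dated interval and checks temporal containment and ordering. The `History` class, the helper names, and all dates other than the 2020-09-17 surgery date shown in the diagram later in this thread are hypothetical; this is not an OWL encoding and not part of BFO or CCO.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class History:
    """Hypothetical stand-in for the stretch of a history under discussion."""
    of: str
    start: date
    end: date


def temporally_within(part: History, whole: History) -> bool:
    # Temporal containment only; occurrent parthood in BFO requires more than this.
    return whole.start <= part.start and part.end <= whole.end


def precedes(a: History, b: History) -> bool:
    # a ends no later than b begins.
    return a.end <= b.start


# Illustrative values only: the portions of each lens's history spent as part of the person.
person = History("person", date(1960, 1, 1), date(2021, 1, 1))
natural_lens = History("natural lens", date(1960, 1, 1), date(2020, 9, 17))
artificial_lens = History("artificial lens", date(2020, 9, 17), date(2021, 1, 1))

both_within = all(temporally_within(h, person) for h in (natural_lens, artificial_lens))
natural_first = precedes(natural_lens, artificial_lens)
```

The ordering question (natural lens earlier than artificial lens) then reduces to a comparison of interval endpoints.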

If indeed PartRole is a mechanism built for adding time, then note that there are two different workarounds, three if you count the BFO-2020 temporalized relationships. See #112 and my assertion that it looks like a workaround for a time-dependent instance-of relationship.

Regarding replacement with an artificial lens, you may be interested in something like prosthetic role.

I don't see how the constraint that everything is an occurrent or a continuant differs in FOL vs OWL. Could you elaborate? There's a difference between what sort of assertions can be made (expressivity), and what kind of inferences are made in one vs the other. But the ontological choices are independent of the particular representation. Nothing in the language of implementation changes the basic fact that parts can come and go.

Reifying statements is a reasonable thing to do, as long as it's understood what is gained and what is lost. Reification makes it possible to add times and query based on them. But there are also problems such as making an effectively transitive relationship or asserting subproperties. What I'm always thinking about is what are the queries I'm going to need to do and what kind of inferences will be needed to support those. Reifying gives a different answer than the BFO-2020 temporalized relationships, for example. Unfortunately all possible solutions to this problem will involve some awkwardness, because, as you correctly point out, OWL is less expressive than FOL.

We do need a solution for better expressing time-dependent things and it might be worth considering implementing that using a hybrid reasoner - OWL for some inferences, a temporal reasoner for some. Such combinations lose some nice features of OWL, typically completeness, but are useful if you understand the limitations. A CCO project used hybrid reasoning because part of the reasoning involved doing graph matching. The base was a triple store. To do the graph matching they extracted what was needed by query, ran the graph matching, and then added the resultant assertions capturing the matches back to the triple store.

rorudn commented 3 years ago

System, Component and Part were introduced as roles in CCO as a means to describe the place of an artifact in a system architecture. A product breakdown structure of one system might describe a microcontroller as a component while a product breakdown structure of another system would describe a microcontroller as a part. At the level of instances, if the microcontroller was removed from a system of the first type and used as a replacement in a system of the second, then that microcontroller changes from component to part. The functions of the controller are the same, but the interfaces it has with other components or parts have changed.

The temporal aspect of the role is handled by using stases. Microcontroller_1 and component_role_1 participate in stasis_1 that occurs on temporal_interval_1. Microcontroller_1 and part_role_1 participate in stasis_2 that occurs on temporal_interval_2.
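A minimal sketch of that stasis pattern, written as plain Python records rather than RDF. The individual names come from the description above; the dictionary keys, the dates, and the `roles_at` helper are assumptions made up for illustration and are not CCO terms.

```python
from datetime import date

# Each record stands for one stasis: its participants (bearer and role)
# and the temporal interval it occurs on. Dates are invented for illustration.
STASES = [
    {
        "stasis": "stasis_1",
        "bearer": "Microcontroller_1",
        "role": "component_role_1",
        "interval": (date(2020, 1, 1), date(2020, 6, 30)),  # temporal_interval_1
    },
    {
        "stasis": "stasis_2",
        "bearer": "Microcontroller_1",
        "role": "part_role_1",
        "interval": (date(2020, 7, 1), date(2020, 12, 31)),  # temporal_interval_2
    },
]


def roles_at(bearer: str, when: date) -> list[str]:
    """Return the roles whose stasis interval contains the given day."""
    return [
        s["role"]
        for s in STASES
        if s["bearer"] == bearer and s["interval"][0] <= when <= s["interval"][1]
    ]


in_first_system = roles_at("Microcontroller_1", date(2020, 3, 1))
in_second_system = roles_at("Microcontroller_1", date(2020, 9, 1))
```

The same microcontroller comes back with a different role depending on the day queried, which is the component-to-part change described above.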

alanruttenberg commented 3 years ago

@rorudn OK. I understand and, as you describe them, the roles make sense. I might consider making their names more specific, "system part" vs "part" for instance, given the potential confusion I point out.

The roles, then, are separable from the temporal aspect and it's the temporal aspect that I'm also concerned about. I'll leave that for other discussions, such as #112, other than to say that rather than the role participating in the stasis process, it is realized in the process. That realization isn't necessarily the only kind of realization.

The question I have then is: what's the system? Given that the roles are applicable to artifacts, and that you mention architecture, a system (thing with a system role) also sounds like an artifact. That's the context of the definitions from the definition source. I see that the definitions are adapted from the originals to be more generally applicable. I'm not sure that's a good decision. Also worth considering is whether the system is an object aggregate and what are called the parts, the members.

The usage of these in @swartik's use case is questionable. A person is not an artifact. It isn't engineered. Arguably, the artificial lens could take one of those roles but the natural lens would not. In addition, the subdivision or disassembly conditions don't seem to apply. Rather, it is the function that matters and not the composition. For instance, the artificial lens could be engineered to be, instead, an almost transparent sensor / microcontroller and actuator that stimulates the muscles that normally cause the natural lens to focus and in some other way act to make the natural lens more flexible. Or, it could be a multipart lens replacement with sufficient mechanism to reconfigure the parts to dynamically focus, like some radio telescopes. In that case it would accomplish more of the original function than the rigid replacement lenses we have today.

In ontology terms, rather than talking in terms of subdivision and disassembly, it might be more clearly stated in terms of unity criteria. The designation of something as part is saying that the thing with the role is a whole, for some purpose - the thing that carries a function necessary for the system as a whole to function.

Outstanding: I asked about what subdivision meant. TBH, I think the description you give might form a better basis for an understandable definition than the current usage of disassembly/subdivision. There's at least one person who is confused by it - me. If I'm confused I'm guessing other users will be as well.

alanruttenberg commented 3 years ago

The description you give also suggests axioms. E.g. the part role only inheres in something that is part of something with a system role. Having that captures aspects of your description that the current definitions don't.

swartik commented 3 years ago

@alanruttenberg here is an image of the representation. I see that I misspoke. We only concerned ourselves with the artificial lens, not the natural lens.

[Image: LensRole diagram]

Modeling removal of the natural lens would complicate interpreting the model. The picture asserts that cataract-surgery is-cause-of stasis-of-parthood-role-1. In this case the meaning is that it's the cause of the Part Role beginning. If you used an analogous representation for the natural lens, the meaning would be that it's the cause of a Part Role ending.

I don't know that this is insurmountable. It does require establishing some conventions. And, of course, it's significantly more verbose than an FOL representation.

alanruttenberg commented 3 years ago

A few comments/questions for now. I'll say more in another comment.

I don't see a type for stasis of parthood role. There isn't a class stasis of parthood role in the version of CCO I'm using - cloned from github repo. If not that, what is the type?
I'd suggest having a class cataract surgery subclass of intentional act.
Causal relationships are very difficult to define properly and for the most part I avoid them. Rather than say cause, I think it's less problematic to have a relation that expresses that the intention of the act is, in part, to start a process, or to add an axiom to the effect that every (successful) cataract surgery occurs on an interval that meets the interval on which the stasis occurs.
By using has text value you lose the ability to do queries that understand that the value is a time. I see there's a 'has date value' and 'has datetime value' so using one of those would allow you to do queries that understand that there's an ordering of dates, such as a date range query. 'has date value' doesn't specify the range as xsd:date and should. I'll add a separate issue for that.
I'm glad to see you are using the temporalized relationships.
You mentioned reifying statements but I don't see any here.
It would be helpful to know what sort of queries you would want to do that involve this model.
There's a larger discussion that I plan to have with Ron about the suitability of IBE here. I think the has value relationship belongs on the ICE, not the IBE.
Is your comment that the meaning of cause is the Part Role beginning a comment on what's explicit in the model, or is it extra information? I'm thinking it's explicit via the interval started by relationship, but wanted to check.
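To make the point about text values versus date values concrete, here is a small Python sketch. It is only an analogy for what a typed literal buys you in a query: the event names, the follow-up date, and the `in_range` helper are invented for illustration, and the 2020-09-17 date is the one shown in the diagram.

```python
from datetime import date

# The same two events, once with typed dates and once with free-text dates.
events_typed = {"surgery": date(2020, 9, 17), "follow_up": date(2020, 10, 2)}
events_text = {"surgery": "9/17/2020", "follow_up": "10/2/2020"}


def in_range(events: dict[str, date], start: date, end: date) -> list[str]:
    """Names of events whose date falls in the closed range [start, end]."""
    return sorted(name for name, day in events.items() if start <= day <= end)


# A date range query works directly on typed values.
september = in_range(events_typed, date(2020, 9, 1), date(2020, 9, 30))

# Sorting by typed value gives chronological order.
typed_order = sorted(events_typed, key=events_typed.get)

# Sorting by text value compares characters, so "10/2/2020" lands before "9/17/2020".
text_order = sorted(events_text, key=events_text.get)
```

With text values, any ordering or range query first needs a parser that lives outside the model.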

See also my previous comment about whether Part Role is suitable to use in this case as a person isn't an engineered thing. A Part Role can be had by part of something with a System Role. That isn't explicitly represented in CCO, but Ron's comment and the definition source make it clear that this is the intention.

I'll comment on the issue of how to specify the concrete interval in a later message.

I like the diagram. Did you draw the diagram by hand, or do you have a tool for that?

swartik commented 3 years ago

@alanruttenberg, in this comment I'd like to focus on a particular bullet of yours:

There's a larger discussion that I plan to have with Ron about the suitability of IBE here. I think the has value relationship belongs on the ICE, not the IBE.

As I said in #118, representing provenance is important to me. If I have a fact, I want to be able to quickly look up the text, or the image, or the audio, that motivated me to express it. That would be an IBE, which has some physical representation, even if that representation is ultimately a collection of electromagnetic fields. My point is that I want an IBE to have a resolvable IRI, preferably a URL, although I'm content to use a URN if it allows me to find the appropriate physical object. By contrast, I don't care whether I can resolve an ICE. It can even be an anonymous individual. This is my personal convention: if I want to look up the source of some fact, I'll use its associated IBE.

That said, I have never been comfortable using most of the has_*_value properties. I can see using has_text_value and has_uri_value. The other properties are interpretations of a text string (or the interpretation of an audio file, image file, etc.). That kind of interpretation is what I've always considered appropriate to an ICE rather than an IBE. The triple:

 <individual> cco:has_datetime_value "2021-04-01T12:00:00Z" .

is an interpretation of any of the strings:

 April 1, 2021 at noon
 4/1/2021 at noon
 2021-04-01 at noon

If I could have one ICE that generically depends on three IBEs – that is, if <individual> was an ICE and the IBEs used has_text_value to record the format in which the datetime was expressed – my KBs would be simpler.
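The "one interpretation, three text forms" idea can be sketched in Python as follows. The `interpret` function and the format list are hypothetical, and reading "noon" as 12:00 UTC is an assumption taken from the `Z` suffix in the triple above; the strings themselves do not state a time zone.

```python
from datetime import datetime, timezone

# Date layouts matching the three example strings.
DATE_FORMATS = ["%B %d, %Y", "%m/%d/%Y", "%Y-%m-%d"]


def interpret(text: str) -> datetime:
    """Map a '<date> at noon' string to a single datetime value (assumed UTC)."""
    date_part, _, time_part = text.partition(" at ")
    if time_part != "noon":
        raise ValueError(f"unsupported time expression: {time_part!r}")
    for fmt in DATE_FORMATS:
        try:
            day = datetime.strptime(date_part, fmt)
        except ValueError:
            continue  # try the next layout
        return day.replace(hour=12, tzinfo=timezone.utc)
    raise ValueError(f"unrecognized date: {date_part!r}")


# The text values that three different IBEs might carry.
ibe_texts = ["April 1, 2021 at noon", "4/1/2021 at noon", "2021-04-01 at noon"]

# All three collapse to one interpreted value, the candidate for the ICE.
interpretations = {interpret(t) for t in ibe_texts}
```

Under these assumptions the set holds a single value, which is the sense in which one ICE could stand behind three differently formatted IBEs.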

alanruttenberg commented 3 years ago

I want to be able to quickly look up the text, or the image, or the audio, that motivated me to express it. That would be an IBE, which has some physical representation, even if that representation is ultimately a collection of electromagnetic fields.

The text, the image, the audio would be ICEs. Every ICE has some physical representation, by definition. An ICE is a BFO generically dependent continuant (GDC). A GDC can exist as long as there is some copy of it, each copy of which specifically depends on an IBE. However, over time it need not be the same IBE(s).

If you are looking up text, it is typically the ICE that you are looking for - you don't care about what the medium was because what matters is the text. A case where you are interested in the IBE would be talking about a library book that you took out of the library and that you are expected to give back. Or a passport: while the passport information (ICE) can be copied (creating new concretizations on different IBEs), there is a distinguished IBE - the physical passport issued to you by the govt. - that you have to hand to immigration when crossing a border.

An ICE is what is in common among IBEs that have copies. It's the content of the novel rather than any of the perhaps millions of IBE books that were printings of it. When you send an email, you are not sending an IBE - that's something you must do using physical mail. In the process of the email getting to its destination, many copies (concretizations) are made on many IBEs along the way. Most of the concretizations are ephemeral. An IBE (e.g., an SSD) endures but will concretize many different ICEs as it participates in the process of relaying mail. Unless we are running a server farm, we usually aren't interested in which IBEs.

The schema is: There is content - the ICE. Its existence depends on there being at least one copy, but there may be many. When we talk about a copy what we usually mean is that there's another IBE, with some pattern - a quality that is the concretization of the ICE, the pattern typically being the shape of the letters/words on paper, or the state of some digital memory that resides in charges on some substrate.

Language: My friend tells me about a book he read. I say I've read it. When I say I've read it, the "it" is the ICE. I don't usually read the same physical copy of the book, the book being the IBE.

There is one exception to this pattern. A process may also be a concretization. Suppose I read a poem to you and you write it down. At the end there are two IBEs on which the poem is written. But the speaking process is also a concretization, albeit ephemeral.

swartik commented 3 years ago

Part of me agrees with you that the text, image, etc. would be the ICE. But we're talking about the web. My expectation is that if I retrieve a resource using a URL, I'm going to receive text or an image.

I can think of a thousand counterarguments. For one thing, although that's what the web was supposed to be, it hasn't worked out that way. Ontologies and knowledge bases expressed in OWL and RDF are notorious for using URLs that don't resolve. Almost no one uses URNs. For another thing, the text you retrieve using a URL is HTML, not an unformatted string of the sort I've provided in examples.

I don't want to argue this point to death. I just hope CCO can have a workable solution that's agreeable to everyone and satisfies use cases such as mine. I will ask one question. You wrote:

Every ICE has some physical representation, by definition.

Once, in an exchange with @rorudn, he said every ICE has the potential to have a physical representation. Is that BFO's position too, or must at least one physical representation exist?

swartik commented 3 years ago

One thing standing in the way of resolving this problem is that few of the datatype properties have definitions, and the definitions aren't precise. The definition of has_text_value is:

A relationship between an Information Bearing Entity and a string representation.

The nature of the relationship being undefined, it's left to our imagination to fill in the gaps. My interpretation is something along the lines of "a person would interpret the IBE as containing the specified text." It works for my needs. But I'm not an authoritative source.

alanruttenberg commented 3 years ago

Is that BFO's position too, or must at least one physical representation exist?

It is not BFO's position. There must be at least one concretization. In the case where the concretization is a quality or realizable entity, that concretization, being a specifically dependent continuant, necessarily inheres in an independent continuant. All usage I've seen is narrower in that the dependence is on a material entity. An ICE is a GDC, and this is the way GDCs work in BFO.

In the case of a process concretization there must be participants.

To the extent that there is a potential, it is that every concretization of an ICE has the potential to be copied, yielding a new concretization.

My expectation is that if I retrieve a resource using a URL, I'm going to receive text or an image.

There are two cases. You describe the first case - where the resource is a digital thing. In the case that the URL denotes something that isn't information, like a car, the resource can't be retrieved over the web. http-range-14 explains the protocol in this case - a 303 see other is returned, and the "other" is some representation of the resource - a digital thing like text or an image.

Ontologies and knowledge bases expressed in OWL and RDF are notorious for using URLs that don't resolve.

Indeed. That's discourteous although, admittedly, it's hard to make IRIs continue to resolve for the long haul. We're aiming for better and our model is the OBO Foundry, where the IRIs are expected to be resolved and, barring error, are resolved. See, for example, ICE as defined in OBO. Viewing that resource in a web browser you see human readable information. Doing a wget or curl yields a 303 per http-range-14, and provides a link in the header which, when accessed, yields RDF. Well, actually, it returns a 301 moved permanently, and the header in that response retrieves RDF. But 301s are usually handled transparently. The point of #114 is to contribute to arranging things so that we can make sure CCO/Foundry IRIs can be served into the indefinite future.
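The redirect behaviour described here can be mimicked without touching the network. The sketch below uses canned responses in place of a real server; the `example.org` URLs, the `RESPONSES` table, and the `resolve` helper are all hypothetical, and the 303-then-301 sequence is one reading of the description above rather than a record of what any real server returns.

```python
# Canned (status, headers, body) triples standing in for HTTP responses.
RESPONSES = {
    "http://example.org/term/ICE": (303, {"Location": "http://example.org/doc/ICE"}, ""),
    "http://example.org/doc/ICE": (301, {"Location": "http://example.org/doc/ICE.rdf"}, ""),
    "http://example.org/doc/ICE.rdf": (200, {"Content-Type": "application/rdf+xml"}, "<rdf:RDF/>"),
}


def resolve(url: str, max_hops: int = 5):
    """Follow 301/303 redirects and return (trail, content type, body)."""
    trail = []
    for _ in range(max_hops):
        status, headers, body = RESPONSES[url]
        trail.append((status, url))
        if status in (301, 303):
            url = headers["Location"]  # the "other" resource to fetch next
            continue
        return trail, headers.get("Content-Type"), body
    raise RuntimeError("too many redirects")


trail, content_type, body = resolve("http://example.org/term/ICE")
statuses = [status for status, _ in trail]
```

The term IRI itself never returns content; only the document reached at the end of the redirect trail does, which is the distinction http-range-14 draws.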

For another thing, the text you retrieve using a URL is HTML, not an unformatted string of the sort I've provided in examples.

The data you receive is whichever mime type the server chooses to return. It is not restricted to HTML, and can certainly be text/plain if you want it to. In the case of OBO, detailed above, what is returned is RDF.

The unformatted strings in the example above are indeed problematic in that they are only usable for a person to read, or for some parser outside the specification to, in an agreed-upon way, turn them into something that can be properly computed with, which for dates is one of the appropriate xsd datatypes. Usually I would use strings like that as labels of ICEs.

In the diagram above you have a representation "morning of 2020-09-17". In the diagram on #118, an xsd:dateTime is used, which is what I would have expected, as the use case is to, for example, retrieve things based on that time, and doing so is a pain with the text representation.

In other cases, you have to understand the ontology to be able to understand the ICE, such as when measurement units are used as part of an ICE.

Almost no one uses URNs.

That's good. I'd have to look up the conversations/presentations but it was argued closer to the beginning of the semantic web that http(s) can and should be used as identifiers in ontologies. http-range-14 removed a roadblock to that, clarifying that http IRIs can denote not only digital resources, and specifying the suggested protocol in such cases.

Earlier on there was a proposal for a protocol specifically for biological entities, called LSID. It was widely considered a failure as it required users to install extra software for them to be accessed. That's because nothing that LSID tried to do couldn't be done by http, and http is just easier to work with, being a pre-installed component of virtually every operating system.

A more successful case of a different protocol is DOIs, which are based on handles. However, everyone uses the http version of the DOI rather than the handle resolver.

neilotte commented 2 weeks ago

@alanruttenberg Converting to discussion.