archesproject / ARM_Working_Group

Arches Resource Model Working Group. A repo for community reviewed Arches branches, models, and packages

P148 vs P106 #9

Closed azaroth42 closed 4 years ago

azaroth42 commented 6 years ago

For the majority of cases (that we consider in Linked.Art at least), we only need one way to partition text or other content, however there are two predicates in CRM for this:

P106 (Symbolic Object has_part Symbolic Object)

P148 (Propositional Object has_part Propositional Object)

The distinction is very clear to people who understand the CRM -- the part of the story where the hero lives happily ever after is part of many propositional objects (or plots), whereas the symbols that encode that plot point within a certain larger set of symbols (the book) are clearly not part of all those other stories.

The issue is that for most cases, both are simultaneously true, however one is not a subPropertyOf the other. The Information Object that represents the set of symbols that encodes the story within the book is both P106 and P148 to the Information Object of the book. The auction entry record in the auction catalog is both propositionally and symbolically part of the larger entity. And so on.

We don't (I strongly believe) want to assert both P106 and P148 all the time. It's messy and no one beyond semantics junkies cares about the difference. Yes, "tea" is symbolically part of "team" but not propositionally ... but just don't do that then. We don't have a use case for strict symbolic subsets at arbitrary levels of granularity.

For consistency, we should pick whichever covers the most cases, to avoid the gotcha of "Oh but for X class you need to use this other partitioning property...". And it's easy to pick, as Appellations are not Propositional, only Symbolic. Hence we should use P106 rather than P148. The only cases when P148 is required would be Propositional Object (rarely if ever used directly) and Right (already so semantically messed up as to be useless). All other classes that have PO as an ancestor also have Symbolic Object as an ancestor.
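The class-coverage argument above can be checked mechanically. The following is a minimal sketch, assuming a small hand-transcribed subset of the CRM class hierarchy; the dictionary, its underscore-style names, and the helper functions are illustrative, and the subclass links should be verified against the CRM version actually in use.

```python
# Hand-transcribed subset of the CRM class hierarchy (an assumption, not
# loaded from the ontology itself): class -> direct superclasses.
PO = "E89_Propositional_Object"
SO = "E90_Symbolic_Object"

SUPERCLASSES = {
    PO: [],
    SO: [],
    "E73_Information_Object": [PO, SO],
    "E30_Right": [PO],
    "E41_Appellation": [SO],
    "E29_Design_or_Procedure": ["E73_Information_Object"],
    "E31_Document": ["E73_Information_Object"],
    "E33_Linguistic_Object": ["E73_Information_Object"],
    "E36_Visual_Item": ["E73_Information_Object"],
}


def ancestors(cls):
    """Return every direct and indirect superclass of cls."""
    seen = set()
    stack = list(SUPERCLASSES[cls])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(SUPERCLASSES[parent])
    return seen


def is_a(cls, target):
    """True if cls is target or a descendant of target."""
    return cls == target or target in ancestors(cls)


def needs_p148():
    """Classes that can use P148 but cannot use P106."""
    return sorted(c for c in SUPERCLASSES if is_a(c, PO) and not is_a(c, SO))


def needs_p106():
    """Classes that can use P106 but cannot use P148."""
    return sorted(c for c in SUPERCLASSES if is_a(c, SO) and not is_a(c, PO))


print(needs_p148())
print(needs_p106())
```

Under these assumptions, only Propositional Object itself and Right fall outside P106, while Appellation (and Symbolic Object itself) fall outside P148.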

Habennin commented 6 years ago

I don't think that the distinction between P106 and P148 is merely for semantics junkies. :) It's actually just two properties that different types of scholars, or the same scholar following different lines of inquiry, would use. A really pleasant read about this comes from Umberto Eco in a short work on translation, "Mouse or Rat", where he amusingly, as always, explores how translation cannot be a matching of symbols but is a recreation of a propositional content. The argument for why they are good properties is that they pick out really existing, distinct phenomena in the world (symbolic objects vs propositional content) that people really do want to investigate separately. Sometimes a scholar needs to research texts that would hold or be made up of some symbolic objects (maybe they are looking for potential completions to a partially legible text) and sometimes they want to find stories that have a certain propositional content (let's say Robin Hood regardless of spelling, character set, language). I was working with some Mayanists whose entire research question was around determining what groups of symbols could be associated with what propositions. So there are definitely real research groups (ones that would be affiliated to projects that would exhibit at museums like the Getty) that need and use such predicates.

The problem with the solution of "let's just pick one then" is that in the open world, people using the same ontology would want to find the same things, and if you use it idiosyncratically then that will be counterproductive.

So, yes, a practical solution might be to say both (I'm not sure that this would cause any space issues in the world. It's just a few more bits.). That too might fail the test I suggested above though of 'just pick something' because maybe some of this information specifically IS NOT about symbols or propositions and then we get into trouble.

A potential ontological solution, though I can imagine headwinds, would be if there were a more general has part property in the concepts hierarchy... the problem with that would be to have some useful criteria for being able to pick out what is an instance. So if we just had a concept has part concept relation that played super class to these properties it would solve a practical problem of not knowing the kind of parthood. That being said, when can we say that something is and is not a part then? For the symbolic and propositional relations we have a criterion. Where would be the substance of the parthood of pure ideas?

azaroth42 commented 6 years ago

I agree that scholars might care, but I would say that the scholar that cares about this level of detail counts as a semantics junkie :) Most scholars, in my experience, are interested in the general content, not the modeling. Whether the auction catalog symbolically contains the description of the lot, or propositionally contains it doesn't matter to the art historian, they care about the document and the activities it describes.

The test for me is not whether a scholar can distinguish the cases, but whether a developer that needs to use the data to build something for the scholar to use can productively make use of the distinction. The sort of scholars that interact directly with LOD are both feet in the semantics junkie camp.

I agree that having some systems use P106 and some use P148 is undesirable from an interoperability point of view, both from a query perspective and from a usability perspective of the data directly. The fewer terms (classes, predicates, ontology terms) in use in the entire ecosystem, the more likely interoperable software is to be implemented. The trade off is how precise the semantics of the terms can be before it gets too much. We've heard that current CRM is "far too much" ... so where do we pare it back causing the least overall damage?

I would be very very happy with a super-property of all of the has_part like relationships, but I also foresee strong resistance from folks like Stephen and Martin that it's semantic "pollution" that might allow places to be said to be part of man made objects or other invalid assertions. (Ignoring the fact that in RDF you can always say that, it just confers the domain/range classes on to the subject/object of the triple)
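The parenthetical about RDF can be illustrated with a toy version of RDFS domain/range inference. This is a minimal sketch, assuming the domains and ranges quoted at the top of the thread; the `ex:` identifiers and the `infer_types` helper are hypothetical, and a real system would use an RDFS reasoner rather than this loop.

```python
# Domain and range of the two partitioning predicates, as quoted at the top
# of this thread. The underscore-style names are an illustrative convention.
DOMAIN_RANGE = {
    "P106_is_composed_of": ("E90_Symbolic_Object", "E90_Symbolic_Object"),
    "P148_has_component": ("E89_Propositional_Object", "E89_Propositional_Object"),
}


def infer_types(triples):
    """Apply domain/range inference: using a predicate does not reject an
    'invalid' subject or object, it simply types them with the predicate's
    domain and range classes."""
    inferred = set()
    for subject, predicate, obj in triples:
        if predicate in DOMAIN_RANGE:
            domain, range_ = DOMAIN_RANGE[predicate]
            inferred.add((subject, "rdf:type", domain))
            inferred.add((obj, "rdf:type", range_))
    return inferred


# A nonsensical assertion: a place "is composed of" a text (hypothetical ids).
data = [("ex:some_place", "P106_is_composed_of", "ex:some_text")]
for triple in sorted(infer_types(data)):
    print(triple)
```

Nothing stops the assertion from being made; the place is simply inferred to be a Symbolic Object as well.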

I think the options are, therefore:

workergnome commented 6 years ago

From an implementation point of view, a potential issue with "doing both" is that it makes any hierarchical representation of the structure more confusing to represent.

{
  type: "InformationObject",
  id: "example:full_concept",
  // Symbolic parts, embedded in full
  P106_is_composed_of: [
    {
      id: "example:concept_part",
      type: "InformationObject",
      value: "..."
    },
    {
      id: "example:concept_other_part",
      type: "InformationObject",
      value: "..."
    }
  ],
  // The same parts again, this time as propositional components
  P148_has_component: ["example:concept_part", "example:concept_other_part"]
}

Not a killer, but still awkward.
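To make the awkwardness concrete, here is a minimal sketch of what a consumer would need to do to recover one list of children from such a record. It assumes the second key is meant to be `P148_has_component`; the key names and the `child_ids` helper are illustrative, not part of any agreed serialization.

```python
def child_ids(record):
    """Return the de-duplicated part ids of a record, in first-seen order.

    Parts may appear as embedded objects (dicts with an "id") under one
    predicate and as bare id strings under the other.
    """
    ids = []
    for key in ("P106_is_composed_of", "P148_has_component"):
        for part in record.get(key, []):
            part_id = part["id"] if isinstance(part, dict) else part
            if part_id not in ids:
                ids.append(part_id)
    return ids


record = {
    "type": "InformationObject",
    "id": "example:full_concept",
    "P106_is_composed_of": [
        {"id": "example:concept_part", "type": "InformationObject", "value": "..."},
        {"id": "example:concept_other_part", "type": "InformationObject", "value": "..."},
    ],
    "P148_has_component": ["example:concept_part", "example:concept_other_part"],
}

print(child_ids(record))
```

Every client that renders the hierarchy has to carry this kind of merge step, which is the cost of asserting both predicates.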

Conal-Tuohy commented 6 years ago

Can we discuss this in the context of at least a few real and concrete examples (of information objects of various types being divided into parts, for various purposes)?

I have one from the NMA: "narratives" (so-called) which are information objects consisting of a small text (a few paras), plus a list of parts which are references to museum objects, or else are narratives themselves (recursively).

It seems to me off the top of my head that there's little or no value in dividing such an object into purely symbolic fragments, but rather that the P148 has component relation is the key structural relation (i.e. the hierarchical relationship of blocks of propositional content).

workergnome commented 6 years ago

Are you describing a text that is broken into smaller blocks of text (which I think is the P106 symbolic fragments) or a single block of text that asserts a proposition about the world, which is subdivided into more nuanced propositions (the P148, as I understand it?), or are you describing a scenario where there is a text that makes a proposition, and it can be decomposed into smaller fragments of text that contain their own propositions, each of which, when combined, make up the larger proposition (which is the P106 + P148 structure that kicked this off)?

Conal-Tuohy commented 6 years ago

Here's an example of our XML source data, which makes it a bit clearer. This first <record> has an identifier 1758 and contains (by reference) more records of the same type, of which I show just one.

<record>
 <irn>1758</irn>
 <NarTitle>Western Arnhem Land</NarTitle>
 <AdmDateModified>08/07/2015</AdmDateModified>
 <AssMasterNarrativeRef>1757</AssMasterNarrativeRef>
 <NarNarrative>&lt;p&gt;This region is dominated by the Arnhem Escarpment, a vast range of rocky hills home to thousands of rock art galleries, some more than 30,000 years old. The majority of rock paintings here are figurative and these have influenced the style of bark painting in the region.&lt;/p&gt;</NarNarrative>
 <SubNarratives>
  <SubNarrative>
   <SubNarrative.irn>1762</SubNarrative.irn>
   <SubNarrative.title>Figures in the landscape</SubNarrative.title>
  </SubNarrative>
  <SubNarrative>
   <SubNarrative.irn>1763</SubNarrative.irn>
   <SubNarrative.title>Dynamic figures</SubNarrative.title>
  </SubNarrative>
  <SubNarrative>
   <SubNarrative.irn>1765</SubNarrative.irn>
   <SubNarrative.title>Murals</SubNarrative.title>
  </SubNarrative>
  <SubNarrative>
   <SubNarrative.irn>1764</SubNarrative.irn>
   <SubNarrative.title>The Nganjmirra Family</SubNarrative.title>
  </SubNarrative>
  <SubNarrative>
   <SubNarrative.irn>1761</SubNarrative.irn>
   <SubNarrative.title>The school of Yirawala</SubNarrative.title>
  </SubNarrative>
  <SubNarrative>
   <SubNarrative.irn>1760</SubNarrative.irn>
   <SubNarrative.title>Yirawala</SubNarrative.title>
  </SubNarrative>
 </SubNarratives>
 <MulMultiMediaRef_tab/>
</record>

One of the subnarratives referred to above:

<record>
 <irn>1761</irn>
 <NarTitle>The school of Yirawala</NarTitle>
 <AdmDateModified>18/01/2016</AdmDateModified>
 <AssMasterNarrativeRef>1758</AssMasterNarrativeRef>
 <DesType_tab>
  <DesType>Lake Disappointment</DesType>
 </DesType_tab>
 <NarNarrative>&lt;p&gt;The school of Yirawala developed as a natural outcome of artists sharing camps and outstations with members of their extended families. Such places contain artists’ studios that are out in the open and surrounded by the daily activities of the family - rather than being locked away in secrecy. The open-air studios provide opportunities for artists to work together and to influence each other. They are also where younger family members learn about art.&lt;/p&gt;&#13;
&lt;p&gt;Marrkolidjban outstation, in the Liverpool River region, played an important role in the history of the development of the artists tutored by Yirawala. It was here that &lt;a href=&quot;http://www.nma.gov.au/exhibitions/old_masters/artists/yirawala&quot;&gt;Yirawala&lt;/a&gt; and &lt;a href=&quot;http://www.nma.gov.au/exhibitions/old_masters/artists/curly_bardkadubbu&quot;&gt;Curly Bardkadubbu&lt;/a&gt;, both members of the Born clan, lived and worked in the early 1970s. Bardkadubbu learnt from Yirawala to paint on bark and, around the same time, Yirawala taught his own sister's son, &lt;a href=&quot;http://www.nma.gov.au/exhibitions/old_masters/artists/peter_marralwanga&quot;&gt;Peter Marralwanga&lt;/a&gt;, the intricacies of painting patterns of &lt;em&gt;rarrk&lt;/em&gt;. Marralwanga in turn taught his nephew &lt;a href=&quot;http://www.nma.gov.au/exhibitions/old_masters/artists/john_mawurndjul&quot;&gt;John Mawurndjul&lt;/a&gt; who, as a teenager, had been inspired by Yirawala.&lt;/p&gt;&#13;
&lt;p&gt;Note the similarities in these artists' renditions of Ngalyod the Rainbow Serpent and Namanjwarre the Estuarine Crocodile. In each case the sheer power of the creatures is expressed in the drawing of the figures as coiled springs, ready to snap and unleash their herculean powers.&lt;/p&gt;</NarNarrative>
 <ObjObjectsRef_tab>
  <ObjObjectsRef>
   <irn>145425</irn>
   <AdmPublishWebNoPassword>Yes</AdmPublishWebNoPassword>
  </ObjObjectsRef>
  <ObjObjectsRef>
   <irn>155978</irn>
   <AdmPublishWebNoPassword>Yes</AdmPublishWebNoPassword>
  </ObjObjectsRef>
  <ObjObjectsRef>
   <irn>145423</irn>
   <AdmPublishWebNoPassword>Yes</AdmPublishWebNoPassword>
  </ObjObjectsRef>
  <ObjObjectsRef>
   <irn>146851</irn>
   <AdmPublishWebNoPassword>Yes</AdmPublishWebNoPassword>
  </ObjObjectsRef>
 </ObjObjectsRef_tab>
</record>

So there are blocks of "propositions" in the top level "narrative" itself, as well as in the lower level narratives (and incidentally the ObjObjectsRef elements in the narrative about the Yirawala School are examples of how a narrative P67_refers_to specific artworks in the collection).

We definitely do have a requirement to be able to treat the interior texts both as small texts in their own right, as well as merely subsections of the larger text.

So this seems to me to correspond to the notion of propositional objects. We don't have a use case in which they are broken down in any other way, as far as I know (i.e. in terms of arbitrary chunks of "symbols" such as unicode characters). I'm not sure what such a use case would look like actually.

Anyway, my feeling is that this example requires only E89 Propositional Objects and its mereological predicate P148 has component.
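As a rough illustration of that mapping, the sketch below reads a trimmed copy of record 1758 and emits one P148 statement per sub-narrative. The `narrative:` prefix, the underscore-style predicate name, and the `component_triples` helper are assumptions made for the example, not an existing NMA or Arches convention.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the first record above (two of its six sub-narratives).
RECORD_XML = """<record>
 <irn>1758</irn>
 <NarTitle>Western Arnhem Land</NarTitle>
 <SubNarratives>
  <SubNarrative>
   <SubNarrative.irn>1762</SubNarrative.irn>
   <SubNarrative.title>Figures in the landscape</SubNarrative.title>
  </SubNarrative>
  <SubNarrative>
   <SubNarrative.irn>1763</SubNarrative.irn>
   <SubNarrative.title>Dynamic figures</SubNarrative.title>
  </SubNarrative>
 </SubNarratives>
</record>"""


def component_triples(record_xml):
    """Map each sub-narrative reference to a P148 statement.

    The "narrative:" prefix is a hypothetical identifier scheme.
    """
    root = ET.fromstring(record_xml)
    parent = "narrative:" + root.findtext("irn")
    # iter() matches on the literal tag name, so the dot needs no escaping.
    return [
        (parent, "P148_has_component", "narrative:" + child.text)
        for child in root.iter("SubNarrative.irn")
    ]


for triple in component_triples(RECORD_XML):
    print(triple)
```

The same loop over `ObjObjectsRef` elements could produce the P67_refers_to statements mentioned above.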

annabelleee commented 4 years ago

P106 is the default for ARM WG because it covers most of the cases. P148 is for exceptional cases when clearly referring to propositions.