rdfs:domain in cidoc.xml causing (undesired?) multiple inheritance in the resulting graph

beaudet commented 2 years ago

The linked art ontology file cidoc.xml contains the following property definition for "language".

<rdf:Property rdf:about="P72_has_language">
    <rdfs:label xml:lang="fr">est en langue</rdfs:label>
    <rdfs:label xml:lang="ru">&#1080;&#1084;&#1077;&#1077;&#1090; &#1103;&#1079;&#1099;&#1082;</rdfs:label>
    <rdfs:label xml:lang="en">has language</rdfs:label>
    <rdfs:label xml:lang="de">hat Sprache</rdfs:label>
    <rdfs:label xml:lang="el">&#941;&#967;&#949;&#953; &#947;&#955;&#974;&#963;&#963;&#945;</rdfs:label>
    <rdfs:label xml:lang="pt">&#233; da l&#237;ngua </rdfs:label>
    <rdfs:label xml:lang="zh">&#20351;&#29992;&#35821;&#35328;</rdfs:label>

    <rdfs:comment>This property describes the E56 Language of an E33 Linguistic Object.
Linguistic Objects are composed in one or more human Languages. This property allows these languages to be documented.
</rdfs:comment>
    <rdfs:domain rdf:resource="E33_Linguistic_Object"/>
    <rdfs:range rdf:resource="E56_Language"/>
    <owl:inverseOf rdf:resource="P72i_is_language_of"/>
</rdf:Property>

When the JSON-LD data below is loaded into a graph, the rdfs:domain statement above seems to cause the node typed "Place" that carries the "language" predicate to be typed as E33_Linguistic_Object as well.

This makes it impossible to write a SHACL validator that checks the class of the subjects of the P72_has_language predicate, since every such subject is automatically assigned E33_Linguistic_Object whenever that predicate is present. Removing the rdfs:domain statement prevents this behavior. Are there any downsides to removing rdfs:domain in such cases? Should we do that in the interest of more robust data validation, or would it have worse unintended consequences?

{
  "@context": "https://linked.art/ns/v1/linked-art.json",
  "id": "https://linked.art/example/object/0",
  "type": "HumanMadeObject",
  "_label": "Mona Lisa",
  "referred_to_by": [
    {
      "type": "Place",
      "language": [
        {
          "id": "http://vocab.getty.edu/aat/300388277",
          "type": "Language",
          "_label": "English"
        }
      ],
      "classified_as": [
        {
          "id": "http://vocab.getty.edu/aat/300435416",
          "type": "Type",
          "_label": "Description",
          "classified_as": [
            {
              "id": "http://vocab.getty.edu/aat/300418049",
              "type": "Type",
              "_label": "Brief Text"
            }
          ]
        }
      ],
      "content": "This portrait was doubtless started in Florence around 1503. It is thought to be of Lisa Gherardini, wife of a Florentine cloth merchant ..."
    }
  ]
}
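
For anyone wanting to reproduce the effect without a triple store: the behavior described above matches RDFS entailment rule rdfs2 (if `p rdfs:domain C` and `s p o`, then `s rdf:type C`). Below is a minimal stdlib-only sketch of that rule. The prefixed names, the `_:statement` blank node label, and the `apply_domain_rule` helper are illustrative assumptions, not part of cidoc.xml or of any particular reasoner.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def apply_domain_rule(triples):
    """Return the input triples plus every type triple entailed by rdfs2."""
    # Collect the declared domain classes for each property.
    domains = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)

    # Any subject that uses a property gains that property's domain classes.
    inferred = set(triples)
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred


# The domain statement from the ontology (hypothetical prefixed form).
ontology = {("crm:P72_has_language", RDFS_DOMAIN, "crm:E33_Linguistic_Object")}

# The mistyped node from the JSON-LD example: a Place with a language.
data = {
    ("_:statement", RDF_TYPE, "crm:E53_Place"),
    ("_:statement", "crm:P72_has_language", "aat:300388277"),
}

closure = apply_domain_rule(ontology | data)
types = sorted(o for s, p, o in closure if s == "_:statement" and p == RDF_TYPE)
print(types)  # ['crm:E33_Linguistic_Object', 'crm:E53_Place']
```

After inference the node carries both classes, so a shape that only asks "is the subject an E33_Linguistic_Object?" passes even though the data was asserted as a Place.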

beaudet commented 2 years ago

Potential solution: turn off inferencing for validation, since we don't rely on it in the model anyway. Alternatively, remove the rdfs:domain statement from the model after loading it. Either way, validation should fail when a language is assigned to a place.
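
A rough sketch of what "validation without inferencing" amounts to: check only the asserted types, so a Place with a language is reported. This is plain Python standing in for a SHACL shape, not an actual SHACL engine. The `language_violations` helper, the blank node labels, and the prefixed names are assumptions made for illustration, and subclasses of E33_Linguistic_Object are deliberately ignored.

```python
RDF_TYPE = "rdf:type"
P72 = "crm:P72_has_language"
E33 = "crm:E33_Linguistic_Object"


def language_violations(asserted):
    """Subjects that use P72_has_language without an asserted E33 type."""
    typed = {s for s, p, o in asserted if p == RDF_TYPE and o == E33}
    users = {s for s, p, o in asserted if p == P72}
    return sorted(users - typed)


# Asserted triples only: no rdfs:domain inference has been applied.
asserted = {
    ("_:place", RDF_TYPE, "crm:E53_Place"),
    ("_:place", P72, "aat:300388277"),
    ("_:text", RDF_TYPE, E33),
    ("_:text", P72, "aat:300388277"),
}

violations = language_violations(asserted)
print(violations)  # ['_:place']
```

The same check run on an inferred graph would report nothing, because the domain rule would already have added the E33 type to `_:place`.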

beaudet commented 2 years ago

We would need the triple: primary term is a member of the primary term classification set and likewise primary name classification set. Related to #460 and should be resolved by the approach to types - the same term can be in multiple sets.

azaroth42 commented 7 months ago

Yeah, I would turn off inferencing for this sort of thing ... but that might also break other domain/range uses? Agree that the validation should fail, of course :)
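
One possible middle ground for the concern about other domain/range uses, sketched under assumptions: drop the rdfs:domain axioms only for the properties being validated, and leave every other domain and range statement in the loaded model. The `drop_domain_axioms` helper and the prefixed names are hypothetical, and whether this is acceptable for the model as a whole is still the open question here.

```python
RDFS_DOMAIN = "rdfs:domain"


def drop_domain_axioms(model, properties):
    """Return the model without the rdfs:domain triples of the given properties."""
    return {
        (s, p, o)
        for s, p, o in model
        if not (p == RDFS_DOMAIN and s in properties)
    }


# A tiny stand-in for the loaded ontology (illustrative subset only).
model = {
    ("crm:P72_has_language", RDFS_DOMAIN, "crm:E33_Linguistic_Object"),
    ("crm:P72_has_language", "rdfs:range", "crm:E56_Language"),
    ("crm:P2_has_type", RDFS_DOMAIN, "crm:E1_CRM_Entity"),
}

trimmed = drop_domain_axioms(model, {"crm:P72_has_language"})
print(len(trimmed))  # 2
```

Only the P72_has_language domain axiom is removed; its range statement and the domain of P2_has_type survive, so inference that depends on them is unaffected.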

linked-art / linked.art

rdfs:domain in cidoc.xml causing (undesired?) multiple inheritance in the resulting graph #462