dcmi / dctap

DC Tabular Application Profile
https://dcmi.github.io/dctap/

same property in shape #76

Open kcoyle opened 2 years ago

kcoyle commented 2 years ago

An early example that we were given was the situation in which a profile specifies that a property can be either a literal or an IRI. This could be done by setting the valueNodeType to "IRI LITERAL", which allows either. However, this does not allow one to place different constraints on IRIs and LITERALs in that row. I suggested something like:

| shapeID | shapeLabel | propertyID | propertyLabel | mandatory | repeatable | valueNodeType | valueDataType | valueConstraint | valueConstraintType | valueShape |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bookShape | Book | dct:subject | Subject | TRUE | TRUE | | | | | subjectShape |
| subjectShape | Subjects | dct:subject | Subject | FALSE | TRUE | literal | xsd:string | | | |
| subjectShape | Subjects | dct:subject | Subject | FALSE | TRUE | IRI | | http://id.loc.gov/ | IRIstem | |

This was immediately noted as problematic as it has the property dct:subject having as its object a shape that itself has dct:subject as a property. I can imagine similar situations with dct:creator -- although in that case we have created examples using foaf:name.
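To make the pattern in question concrete, here is a minimal sketch that flags rows whose valueShape points to a shape that uses the same propertyID. It assumes the profile above is written out as CSV with blank cells left empty; the function name `find_repeated_properties` and the constant `PROFILE_CSV` are hypothetical and not part of any DCTAP tooling.

```python
import csv
import io

# The example profile as CSV. The placement of empty cells is an assumption
# based on the column headers in the example.
PROFILE_CSV = """\
shapeID,shapeLabel,propertyID,propertyLabel,mandatory,repeatable,valueNodeType,valueDataType,valueConstraint,valueConstraintType,valueShape
bookShape,Book,dct:subject,Subject,TRUE,TRUE,,,,,subjectShape
subjectShape,Subjects,dct:subject,Subject,FALSE,TRUE,literal,xsd:string,,,
subjectShape,Subjects,dct:subject,Subject,FALSE,TRUE,IRI,,http://id.loc.gov/,IRIstem,
"""


def find_repeated_properties(csv_text):
    """Return (shapeID, propertyID, valueShape) triples where the referenced
    shape itself describes the same propertyID."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    # Map each shape to the set of properties it describes.
    properties_by_shape = {}
    for row in rows:
        properties_by_shape.setdefault(row["shapeID"], set()).add(row["propertyID"])

    hits = []
    for row in rows:
        target = row["valueShape"]
        # Only rows that reference another shape are of interest.
        if target and row["propertyID"] in properties_by_shape.get(target, set()):
            hits.append((row["shapeID"], row["propertyID"], target))
    return hits


print(find_repeated_properties(PROFILE_CSV))
```

For the profile above, the only row flagged is the bookShape row, whose dct:subject value is subjectShape. The sketch only detects the pattern; whether the pattern is actually a problem is the question asked here.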

Is the use of properties in this kind of arc chain a problem? If so, 1) what is the problem? and 2) what is a better solution?

philbarker commented 2 years ago

I think that the use of a shape (subjectShape) is the problem. So far all the shapes we have discussed have been what SHACL calls node shapes [1], i.e. they describe the arcs that you would expect to see around a node in the graph. subjectShape doesn't do that; it describes the node you would expect to see at the end of an arc in the graph. In SHACL this would be a property shape [2]. In the TAP our equivalent to a SHACL property shape is (roughly speaking) a row.

I don't think you need a subjectShape to do what you are trying to do; I think this works:

| shapeID | propertyID | propertyLabel | mandatory | repeatable | valueNodeType | valueDataType | valueConstraint | valueConstraintType |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bookShape | dct:subject | Subject | TRUE | TRUE | literal IRI | | | |
| bookShape | dct:subject | Subject | FALSE | TRUE | literal | xsd:string | | |
| bookShape | dct:subject | Subject | FALSE | TRUE | IRI | | http://id.loc.gov/ | IRIstem |

Read as:

  1. whatever is described by a book shape must have the dct:subject property with a value that is a Literal or an IRI

  2. whatever is described by a book shape may have a dct:subject property with a value that is a Literal so long as the Literal is formatted as an xsd:string

  3. whatever is described by a book shape may have a dct:subject property with a value that is an IRI so long as the IRI stem begins with http://id.loc.gov/

Even though it doesn't introduce any new "rules" for interpreting the TAP beyond what we cover in the primer (i.e. all of a row must be matched; multiple entries in a cell are alternatives; rows are independent of each other), I think this is better placed in the cookbook than in the primer.
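The three readings and the interpretation rules above can be sketched in a few lines of Python. This is an illustration only, not how any DCTAP validator works: the value representation (a dict with `nodeType`, `text` and `datatype` keys), the names `ROWS`, `row_matches` and `matching_rows`, and the example IRIs are all assumptions made for the sketch.

```python
# One dict per row of the bookShape table; blank cells are simply omitted.
ROWS = [
    {"valueNodeType": "literal IRI"},
    {"valueNodeType": "literal", "valueDataType": "xsd:string"},
    {"valueNodeType": "IRI", "valueConstraint": "http://id.loc.gov/",
     "valueConstraintType": "IRIstem"},
]


def row_matches(row, value):
    """Check one value against one row: every filled cell must be satisfied."""
    # Multiple entries in a cell are alternatives.
    allowed_types = row["valueNodeType"].lower().split()
    if value["nodeType"].lower() not in allowed_types:
        return False
    datatype = row.get("valueDataType")
    if datatype and value.get("datatype") != datatype:
        return False
    if row.get("valueConstraintType") == "IRIstem":
        if not value["text"].startswith(row["valueConstraint"]):
            return False
    return True


def matching_rows(value):
    """Rows are independent of each other, so each is tested on its own.
    Returns the 1-based numbers of the rows the value matches."""
    return [i for i, row in enumerate(ROWS, start=1) if row_matches(row, value)]


# Hypothetical dct:subject values.
literal = {"nodeType": "literal", "text": "Cats", "datatype": "xsd:string"}
loc_iri = {"nodeType": "IRI", "text": "http://id.loc.gov/example"}
other_iri = {"nodeType": "IRI", "text": "http://example.org/subject"}

print(matching_rows(literal))    # [1, 2]
print(matching_rows(loc_iri))    # [1, 3]
print(matching_rows(other_iri))  # [1]
```

The sketch only reports which rows a value matches. How a validator should combine the rows, for example whether an IRI that matches row 1 but not row 3 is acceptable, is not settled by the rules listed here.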

Phil

  1. https://www.w3.org/TR/shacl/#node-shapes

  2. https://www.w3.org/TR/shacl/#property-shapes

kcoyle commented 2 years ago

@philbarker Thank you. A very interesting solution.

This is a very important point and one that we have not actually discussed:

> describe the arcs that you would expect to see around a node in the graph

In our examples we have used shapes that are what I would call "logic shapes" - and this is an example of that. Another one is the use of ((foaf:name) OR (foaf:givenName AND foaf:familyName)). You have shown a different (and perhaps better) way to express this logic, although it may be harder to explain. The questions that I see are:

  1. Does TAP define the structure of the metadata? OR
  2. Does TAP define the concepts in the metadata?

For the first one, I wonder if we can reasonably expect the profile developers to know the underlying structure. For the second, do we expect that the concepts and the structure will be compatible? Let's discuss this in a meeting.
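For reference, the name logic mentioned above, ((foaf:name) OR (foaf:givenName AND foaf:familyName)), amounts to the following check. This is a minimal sketch with a hypothetical function name; it treats a description as nothing more than the set of property IDs it uses.

```python
def has_valid_name(properties):
    """True if foaf:name is present, or if both foaf:givenName and
    foaf:familyName are present."""
    present = set(properties)
    return (
        "foaf:name" in present
        or {"foaf:givenName", "foaf:familyName"} <= present
    )


print(has_valid_name(["foaf:name"]))                         # True
print(has_valid_name(["foaf:givenName", "foaf:familyName"]))  # True
print(has_valid_name(["foaf:givenName"]))                     # False
```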

kcoyle commented 2 years ago

Closing, as per the discussion of Aug 4, 2022 (see the minutes).