SEMICeu / DCAT-AP

This is the issue tracker for the maintenance of DCAT-AP
https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe

shacl - minor feedback on dcat-ap_2.0.0_shacl_shapes.ttl #126

Closed aidig closed 4 years ago

aidig commented 4 years ago

https://github.com/SEMICeu/DCAT-AP/blob/master/releases/2.0.0/dcat-ap_2.0.0_shacl_shapes.ttl

1) Line 264 - Small typo (copy/paste error) in the shape name: 'Category' should be 'Checksum':

:Checksum_Shape
    a sh:NodeShape ;
    sh:name "Category"@en ;

2) Lines 413-415 - Mismatch between the cardinality in the DCAT-AP specification and the SHACL constraints. According to the specification the cardinality is 0..1, but this is not reflected in the constraints. (I would argue that the SHACL implementation is correct and the specification would have to be amended?)

    ], [
        sh:class skos:Concept ;
        sh:path dct:type ;
        sh:severity sh:Violation
    ], [

bertvannuffelen commented 4 years ago

@aidig

2) Lines 413-415 - Mismatch between the cardinality in the DCAT-AP specification and the SHACL constraints. According to the specification the cardinality is 0..1, but this is not reflected in the constraints. (I would argue that the SHACL implementation is correct and the specification would have to be amended?)

No, it is the inverse. The SHACL is not the master; the human-readable document is. Personally, as long as the human-readable document cannot be generated from the SHACL, the human-readable document will be the master. It always carries the broader context, which cannot be added (or is best not added) to a machine-readable representation.

I checked the one above, and indeed the max-cardinality constraint is missing here.
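
For illustration, a minimal sketch of what the constraint could look like once the upper bound is added. It assumes the constraint belongs to the dataset shape; the `ex:` namespace and the shape name `ex:DatasetTypeShape` are hypothetical and not taken from the released file, where the constraint sits inside an existing list of property constraints.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/shapes#> .   # hypothetical namespace

# Hypothetical standalone shape, used here only to make the fragment complete.
ex:DatasetTypeShape
    a sh:NodeShape ;
    sh:targetClass dcat:Dataset ;
    sh:property [
        sh:class skos:Concept ;
        sh:path dct:type ;
        sh:maxCount 1 ;            # upper bound of 0..1; no sh:minCount, so the property stays optional
        sh:severity sh:Violation
    ] .
```

With `sh:maxCount 1` in place, a dataset carrying two or more `dct:type` values would be reported as a violation, which matches the 0..1 cardinality in the specification.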

aidig commented 4 years ago

@bertvannuffelen Sure, the human-readable specification document is normative and the SHACL implementation is not. In that case, my mismatch comment should be read as a change suggestion for the specification document. Surely it is not advisable to restrict the use of dct:type, as datasets can be categorised along many different aspects?

bertvannuffelen commented 4 years ago

@aidig I think the specification has dropped the link to the intended concept scheme:

https://op.europa.eu/en/web/eu-vocabularies/at-dataset/-/resource/dataset/dataset-type

I remember that the property was introduced to give more detail about the nature of the data in the dataset, but looking at the list today I am not sure the intent and the usage still match. The cardinality can then indeed be a topic of discussion. I propose creating a new issue for the content discussion. I will resolve the mismatch with the document as part of the bugfix; the outcome of the discussion on dct:type usage can then go into the next release.

aidig commented 4 years ago

Indeed, regarding the property "dct:type", the normative specification reads:

"This property refers to the type of the Dataset. A controlled vocabulary for the values has not been established."

Falling back to the original vocabulary, it is specified in DCAT 2.0 that

"It is also possible for multiple classifications to be present in a single description".

Hence the cardinality should not be 0..1 for this property.

Example 8

:dataset-001
  rdf:type  dcat:Dataset ;
  dct:type  <http://purl.org/dc/dcmitype/Text> ;
  dct:type  <http://id.loc.gov/vocabulary/marcgt/man> ;
  dct:type  <http://registry.it.csiro.au/def/datacite/resourceType/Text> ;
  dct:type  <http://registry.it.csiro.au/def/re3data/contentType/doc> ;
.

<http://registry.it.csiro.au/def/datacite/resourceType/Text>
  rdfs:label "Text"@en ;
  dct:source "DataCite resource types"@en ;
  .

<http://registry.it.csiro.au/def/re3data/contentType/doc>
  rdfs:label "Standard office documents"@en ;
  dct:source "Re3data content types"@en ;
  .

aidig commented 4 years ago

@bertvannuffelen wrote:

Personally, as long as the human-readable document cannot be generated from the SHACL, the human-readable document will be the master.

Actually, generating the human-readable document from the SHACL implementation is exactly what we are trying out for an updated Danish Application Profile of DCAT-AP. Although we are still in a development phase, it does present benefits such as minimizing synchronisation efforts :-) Basically, the content of Chapter 4 in this specification document is a conversion of a SHACL file into markdown. (Note that the specification is written primarily in Danish and is still under development.)

init-dcat-ap-de commented 4 years ago

Very interesting! It reminds me of model-driven architecture. In Germany this is used for designing XML standards for the administration (xoev.de). But it puts you in the same position: you can't use the shapes to validate most data, since most data doesn't state that a http://xmlns.com/foaf/0.1/homepage is a http://xmlns.com/foaf/0.1/Document
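
A small sketch of that situation, with hypothetical `ex:` names that are not taken from the DCAT-AP shapes: `sh:class` only passes when the data graph itself types the value node (directly or through `rdfs:subClassOf`), so a homepage that is merely linked to would be reported as a violation.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <http://example.org/> .          # hypothetical namespace

# Shapes graph: every foaf:homepage value must be an instance of foaf:Document.
ex:AgentHomepageShape
    a sh:NodeShape ;
    sh:targetClass foaf:Agent ;
    sh:property [
        sh:path foaf:homepage ;
        sh:class foaf:Document
    ] .

# Data graph: typical published data, with no type given for the homepage.
ex:publisher-1
    a foaf:Agent ;
    foaf:homepage <http://example.org/home> .

# Only with this extra triple in the data graph would the sh:class check pass:
# <http://example.org/home> a foaf:Document .
```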