ISAITB / validator-resources-dcat-ap

Validator resources for the DCAT-AP RDF validator.

foaf:Organisation should be accepted for foaf:Agent #1

Open sabinem opened 3 years ago

sabinem commented 3 years ago

I used the https://www.itb.ec.europa.eu/shacl/dcat-ap/upload to validate the following input against DCAT-AP version 2.0.0:

@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://swisstopo/catalog-endpoint.rdf>
  a dcat:Catalog ;

  dcat:dataset <https://swisstopo/data/123>, <https://swisstopo/data/345> ;

  dct:title "Open Data City of Zurich"@en ,
            "Offene Daten der Stadt Zurich"@de ;
  dct:description "Datenkatalog der Stadt Zurich"@de ;

  dct:issued "2021-01-07T00:00:00"^^xsd:dateTime ;
  foaf:homepage <https://swisstopo/index.html> ;
  dct:publisher <https://swisstopo> ;
  dct:modified "2021-01-07T00:00:00"^^xsd:dateTime ;
  dct:language: "en" .

<https://swisstopo/data/123>
  a dcat:Dataset ;
  dct:title "some title"@en ;
  dct:description "some dataset"@en .

<https://swisstopo/data/345>
  a dcat:Dataset ;
  dct:title "some title"@en ;
  dct:description "some dataset"@en .

<https://swisstopo/index.html>
  a foaf:Document .

<https://swisstopo>
  a foaf:Organization   ;
  foaf:name "Landesamt für Topography"@en .

I got the following error, which I struggle to understand:

As far as I understand it, foaf:Organization is a subclass of foaf:Agent (see http://xmlns.com/foaf/spec/#term_Organization). So can you please help me understand why my catalog does not validate? Is that a bug in the validation tool, or how else should I set up my foaf:Agent triples?

costas80 commented 3 years ago

Hi @sabinem, this is not a bug in the validator but rather an issue with the configuration of the SHACL shapes used for the validation. The DCAT-AP configuration also needs to include FOAF (internally) so that the shapes know that foaf:Organization is a subclass of foaf:Agent.
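
As an illustration only (this is not the actual DCAT-AP validator configuration, which this thread does not show), the knowledge the shapes are missing amounts to one subclass axiom from the FOAF vocabulary. A minimal Turtle sketch, assuming the validator makes such vocabulary triples available alongside the data when it evaluates the shapes:

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Subclass axiom as declared by the FOAF vocabulary.
# Once this triple is visible to the validator, a node typed
# foaf:Organization also counts as an instance of foaf:Agent.
foaf:Organization rdfs:subClassOf foaf:Agent .
```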

This is a simple issue to correct but I am not responsible for the DCAT-AP configuration. I will signal this to the developers responsible for the configuration.

sabinem commented 3 years ago

@costas80 Thank you very much for your reply. It is good to know that my foaf:Agent wasn't malformed. Thanks for passing this on to the developers. It would be very helpful if the validator became more reliable in the errors it points out.

costas80 commented 3 years ago

@sabinem, by the way, this is something that we come across often when using SHACL for validation. In the end it's best that validators define, as part of their internal shape definitions, any vocabularies that are pertinent to the shapes' correct functioning. FOAF and the validation of subclasses of Agent have actually come up a few times already.

bertvannuffelen commented 3 years ago

@sabinem you have hit on a complicated part of the SHACL story.

An explanation with similar examples can be found at https://github.com/SEMICeu/DCAT-AP/blob/2.1.0-draft/releases/2.1.0/examples/examples.md

In short, SHACL as part of its specification will do subclass inference if it has knowledge about the subclass hierarchy. Now, by reusing terms from external vocabularies to create a specification, one faces the import challenge. Does reusing a term from a vocabulary also mean importing all terms defined in the same namespace? And does it mean importing all constraints defined in that namespace? You might be tempted to answer yes to these questions based on the FOAF experience. However, if you apply the same approach to the dct namespace (Dublin Core Terms), then your dct:language value must be a URI that is an instance of the class dct:LinguisticSystem. Maybe you are not so happy with adding that statement each time. In the DCAT-AP working group, this discussion on what should and should not be validated has resulted in the creation of multiple variants of the validation rules.
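
To make the dct:language point concrete, here is a sketch of the extra statement that would be needed if the Dublin Core range for dct:language were enforced. The EU Publications Office language URI is an assumption chosen for illustration; the original input only used the literal "en".

```turtle
@prefix dct: <http://purl.org/dc/terms/> .

# The language is given as a URI instead of the literal "en".
# (Assumed URI from the EU language authority table, for illustration.)
<https://swisstopo/catalog-endpoint.rdf>
  dct:language <http://publications.europa.eu/resource/authority/language/ENG> .

# The additional typing statement that a strict range check would require
# each time a language is referenced.
<http://publications.europa.eu/resource/authority/language/ENG>
  a dct:LinguisticSystem .
```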

I uploaded your example to https://gist.github.com/bertvannuffelen/505faddaaaf15a9ecb45c605f8183563/raw/3e13e46b79d80ae08a0dc154fe38d384de504aeb/itb1.ttl and fixed the small typo for language (the colon).

Then you can validate it using the following variants:

The difference between the "ranges zero" and "ranges base" configurations is the import of the external vocabularies. (Imports are a discussion in their own right.)

This approach leaves you a lot of freedom to decide whether or not you feel comfortable with the validation outcome. This outcome has to be placed in the context of a data exchange flow and of what the consumers of the data are expected to do. For instance, in the FOAF case one can expect that consumers infer that this is an agent, and that this is done mechanically. Is this inference to be executed by the publisher of the catalog, or by the consumer of the catalog? These are business questions beyond simple validation, and since there are many different opinions on this, the approach for the validation is to highlight it with the different variants. Using these variants you can express what kind of reasoning you expect consumers to do on your data to come close to their consumer view on DCAT-AP.
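
If the publisher takes on that inference, one way to do it is to state the inferred type explicitly in the catalog, so that a class check against foaf:Agent no longer depends on the validator knowing the FOAF hierarchy. A sketch based on the publisher resource from the original input (only the extra foaf:Agent type is new):

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Publisher-side materialization: the inferred type foaf:Agent is written out
# next to the more specific foaf:Organization.
<https://swisstopo>
  a foaf:Organization, foaf:Agent ;
  foaf:name "Landesamt für Topography"@en .
```

The trade-off is the one described above: the publisher carries the extra statements so that consumers and validators do not have to do the reasoning.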