GovDataOfficial / DCAT-AP.de-SHACL-Validation

SHACL shapes for DCAT-AP.de
https://www.itb.ec.europa.eu/shacl/dcat-ap.de/upload
GNU Affero General Public License v3.0

German addition dcat-ap-spec-german-additions.ttl does not comply with SHACL #19

Closed volkerjaenisch closed 1 year ago

volkerjaenisch commented 1 year ago

Dear GovData!

First of all, many thanks to GovData and ]init[ for providing us with a set of SHACL rules to make DCAT things better.

We at BBG are currently integrating these SHACL rules to replace our homebrew DCAT validator.

Our open data portal, the "Datenadler", uses Python. Python is quite common in open data, since CKAN is also Python-based.

There is just one SHACL library in Python: pySHACL, which all the CKAN people will probably use if CKAN adopts SHACL.

Our code supplied with (All files from this GH repository):

validates our DCAT data quite well. The validation points out violations and such in the given DCAT-RDF data. Normal behavior.

But when we additionally include the German shape files dcat-ap-spec-german-additions.ttl and dcat-ap-spec-german-messages.ttl, pySHACL raises an exception:

"Shacl File does not validate against the Shacl Shapes Shacl file" with a stacktrace a kilometer long.

This exception arises because pySHACL checks the SHACL shapes before using them.

At this point it was not clear to me where the cause of the problem lay:

Having already filed one bug report against pySHACL today, my first suspicion fell on pySHACL.

But at the end of the day I was none the wiser. In desperation I just put the shape files into the European shape validator:

https://www.itb.ec.europa.eu/shacl/shacl/upload

and it reports problems with the shape files that are comparable in number and detail to those pySHACL delivers.

The message file is OK, but dcat-ap-spec-german-additions.ttl produces 21 errors.

Now I think that the German shapes are not in shape.

The only code that I have found concerning DCAT and SHACL is Java-based. @]init[ You are using Java, I bet? Have you ever tried another programming language with your shapes? I can try to run the shapes on our RDF4J DB to get more platforms into play.

Let us work together to supply Germany (BRD) with a SHACL validation anyone can use.

Cheers Volker

volkerjaenisch commented 1 year ago

I tracked part of the problem down, with some help from the pySHACL community: https://github.com/RDFLib/pySHACL/issues/168#issuecomment-1323034114

:Dataset_dcat_theme_v_List
    sh:description "Todo: Wenn zutreffend, so umbauen, dass lediglich wenigstens einmal das Vokabular genutzt wurde. Dann wird auch :Dataset_dcat_theme_v_IRI zusätzlich benötigt." ;
    sh:path dcat:theme ;
    sh:node [
        sh:path skos:inScheme ;
        sh:hasValue <http://publications.europa.eu/resource/authority/data-theme> ;
    ] ;
    sh:severity sh:Violation ;
    sh:message "dcat:Dataset: dcat:theme MUSS eine IRI aus diesem Vokabular verwenden: https://www.dcat-ap.de/def/dcatde/2.0/spec/#kv-data-theme"@de ;
.

The problem is the node shape:

    sh:node [
        sh:path skos:inScheme ;
        sh:hasValue <http://publications.europa.eu/resource/authority/data-theme> ;
    ] ;

According to W3C https://www.w3.org/TR/shacl/#NodeConstraintComponent

sh:node | The node shape that all value nodes need to conform to.

The values of sh:node in a shape must be well-formed node shapes.

And for node shapes it is stated that:

SHACL instances of sh:NodeShape cannot have a value for the property sh:path.

So sh:path skos:inScheme ; in a sh:node is a violation of the SHACL syntax.
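One way the same check could be written without violating that rule is to keep the value of sh:node a pure node shape and move the sh:path into a nested property shape via sh:property. The following is only a sketch under assumptions: the default `:` prefix is a placeholder namespace, the explicit `a sh:PropertyShape` / `a sh:NodeShape` typing is added for readability, and this is not necessarily the change that will be made in the repository.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
# Placeholder namespace (assumption); the real shapes file declares its own.
@prefix :     <http://example.org/shapes#> .

:Dataset_dcat_theme_v_List
    a sh:PropertyShape ;
    sh:path dcat:theme ;
    # The value of sh:node is a node shape, so it carries no sh:path itself.
    sh:node [
        a sh:NodeShape ;
        # The path lives in a property shape nested inside the node shape.
        sh:property [
            sh:path skos:inScheme ;
            sh:hasValue <http://publications.europa.eu/resource/authority/data-theme> ;
        ] ;
    ] ;
    sh:severity sh:Violation ;
    sh:message "dcat:Dataset: dcat:theme MUSS eine IRI aus diesem Vokabular verwenden: https://www.dcat-ap.de/def/dcatde/2.0/spec/#kv-data-theme"@de ;
.
```

With this structure every dcat:theme value must conform to the anonymous node shape, which in turn requires a skos:inScheme value equal to the data-theme authority IRI. Note that this only validates if the data graph actually contains the skos:inScheme triples for the theme IRIs.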

Cheers Volker

init-dcat-ap-de commented 1 year ago

Hello,

thank you for this analysis! I've come to a similar conclusion, even though I hadn't found the reason for the problem: [screenshot]

There might be a second problem with or (...) shapes, but I have to look into this.

Unfortunately I can't give a timeframe for when the shapes will be fixed, but it has a high priority.

Cheers Ludger

volkerjaenisch commented 1 year ago

@init-dcat-ap-de Thanks for digging into this. I would like to support you. If you update the repository frequently I will check for further problems. Cheers, Volker

init-dcat-ap-de commented 1 year ago

I pushed updates to improve the conformity. Unfortunately I couldn't find a solution for the sh:or problem. I created an issue here: https://github.com/w3c/data-shapes/issues/147

volkerjaenisch commented 1 year ago

Any news on the original problem of this issue?

init-dcat-ap-de commented 1 year ago

Should be fixed.

volkerjaenisch commented 1 year ago

Then I advise opening a new issue for the "OR problem". If you elaborate a bit on the OR problem there, I can probably support you.