dbpedia / databus

A digital factory platform for managing files online with stable IDs, high-quality metadata, powerful API and tools for building on data: find, access, make interoperable, re-use
Apache License 2.0

Make version SHACL shapes less restrictive #167

Open holycrab13 opened 4 months ago

holycrab13 commented 4 months ago

The SHACL shapes are currently very restrictive: they reject inputs that omit fields which could be auto-completed. Because SHACL validation runs after auto-completion, this is not a problem by itself right now.

However, SHACL validation should be the first step to run, since it can also reliably detect malformed RDF. It would also be more user-friendly, since users could validate their non-auto-completed inputs against the SHACL resources at /res/shacl.

The task is to make every auto-completable property optional in the SHACL shapes.

JJ-Author commented 4 months ago

Yeah, we have to be a bit careful here. I think it's best to have a SHACL library with all the tests, then one master shape with some optional values, as you describe, applied before auto-completion, and one master shape applied after auto-completion where the values are required (to verify that auto-completion does not mess something up).
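
A rough sketch of how that split could look in Turtle. All shape names here (`:PartByteSizeTypeShape`, `:PartByteSizeRequiredShape`, `:PreAutocompletionPartShape`, `:PostAutocompletionPartShape`) and the `databus:` and default namespace IRIs are placeholders for illustration, not names taken from the existing shapes at /res/shacl. Only dcat:byteSize is shown; other auto-completed properties would follow the same pattern.

```turtle
@prefix sh:      <http://www.w3.org/ns/shacl#> .
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
# Placeholder namespaces (assumptions): substitute the real Databus ones.
@prefix databus: <http://example.org/databus#> .
@prefix :        <http://example.org/shapes#> .

# Library: small reusable tests without targets of their own.

# Type and cardinality test that holds both before and after auto-completion.
:PartByteSizeTypeShape
    a sh:NodeShape ;
    sh:property [
        sh:path dcat:byteSize ;
        sh:maxCount 1 ;
        sh:datatype xsd:integer ;
    ] .

# Presence test that only has to hold after auto-completion.
:PartByteSizeRequiredShape
    a sh:NodeShape ;
    sh:property [
        sh:path dcat:byteSize ;
        sh:minCount 1 ;
    ] .

# Master shape applied BEFORE auto-completion: byteSize may be absent.
:PreAutocompletionPartShape
    a sh:NodeShape ;
    sh:targetClass databus:Part ;
    sh:node :PartByteSizeTypeShape .

# Master shape applied AFTER auto-completion: byteSize must be present.
:PostAutocompletionPartShape
    a sh:NodeShape ;
    sh:targetClass databus:Part ;
    sh:node :PartByteSizeTypeShape , :PartByteSizeRequiredShape .
```

Since both master shapes declare a target, they would have to live in separate files (or be loaded separately) so that only one of them is active in each validation pass.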

JJ-Author commented 4 months ago

Another (probably easier) idea of mine, discussed in today's meeting, was the following:

This way we can solve two issues at once: documentation becomes easier, and we can easily show in the documentation, for every property, how it will be auto-completed.

JJ-Author commented 4 months ago

An example for the proposal from my first message. It is more effort, but it makes it possible to define more fine-grained and helpful error messages, as shown in the toy example below. Note that the error messages describe the cause of the violation, not the test logic. We should consider that for all test messages. The general idea: for properties that are auto-completed, we remove the sh:minCount from the existing shape but add another shape in a dedicated file that contains all the sh:minCount constraints for auto-completed values, which we use in the backend.

# The prefixes sh:, dcat:, xsd:, databus: and the default ":" are assumed to be declared elsewhere.
:OptionalPartByteSizeShape
    a sh:PropertyShape ;
    sh:targetClass databus:Part ;
    sh:path dcat:byteSize ;
    sh:maxCount 1 ;
    sh:datatype xsd:integer ;
    sh:message "A databus:Part dcat:byteSize is not of type xsd:integer or has MORE than one value"@en .

:AfterAutocompletionPartShape
    a sh:NodeShape ;
    sh:targetClass databus:Part ;
    sh:property [                         # alternatively, with sh:and, we can enumerate all auto-completed properties for databus:Part here
        sh:path dcat:byteSize ;
        sh:minCount 1 ;  # Making it mandatory
        sh:message "A databus:Part is missing the dcat:byteSize value"@en ;
    ] .

Note: sh:maxCount is not checked correctly in the SHACL Playground, but it works at https://rdfplayground.dcc.uchile.cl/.
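
To try the two toy shapes, some test data along these lines might help. This is only a sketch: the `ex:` resources are made up, and the `databus:` namespace IRI is a placeholder that has to match whatever IRI the shapes use. The outcomes in the comments follow from the SHACL semantics of sh:minCount, sh:maxCount and sh:datatype and assume a conformant validator.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
# Placeholder namespaces (assumptions): substitute the real ones.
@prefix databus: <http://example.org/databus#> .
@prefix ex:      <http://example.org/data#> .

# No byteSize at all.
# :OptionalPartByteSizeShape    -> conforms (nothing to check)
# :AfterAutocompletionPartShape -> violation (sh:minCount 1)
ex:partWithoutSize a databus:Part .

# Exactly one xsd:integer value (a bare number is xsd:integer in Turtle).
# Both shapes -> conforms
ex:partWithSize a databus:Part ;
    dcat:byteSize 1024 .

# One value, but a plain string literal instead of xsd:integer.
# :OptionalPartByteSizeShape    -> violation (sh:datatype)
# :AfterAutocompletionPartShape -> conforms (it only checks presence)
ex:partWithStringSize a databus:Part ;
    dcat:byteSize "1024" .

# Two integer values.
# :OptionalPartByteSizeShape    -> violation (sh:maxCount 1)
# :AfterAutocompletionPartShape -> conforms (it only checks presence)
ex:partWithTwoSizes a databus:Part ;
    dcat:byteSize 1024 , 2048 .
```

The first resource is the case this issue is about: it is a valid user input before auto-completion and only becomes an error if the value is still missing afterwards.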