w3c / data-shapes

RDF Data Shapes WG repo

use case: local parameterised messages #158

Open bertvannuffelen opened 2 months ago

bertvannuffelen commented 2 months ago

As suggested in https://github.com/SEMICeu/DCAT-AP/issues/355, here is the contribution of a use case.

SHACL specifies that shacl:message overwrites the message generated by the engine.

Thus for the constraint

<https://semiceu.github.io//DCAT-AP/releases/3.0.0#AgentShape/236f0210baaf149903750c43bbe7012c21debb2a>
  rdfs:seeAlso "https://semiceu.github.io//DCAT-AP/releases/3.0.0#Agent.type";
  shacl:description "A type of the agent that makes the Catalogue or Dataset available."@en;
  shacl:maxCount 1;
  shacl:name "type"@en;
  shacl:path dc:type.

and the data

<https://test.com/id/agent/1221> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Agent> .
<https://test.com/id/agent/1221> <http://xmlns.com/foaf/0.1/name> "agent 1"@en .
<https://test.com/id/agent/1221> <http://purl.org/dc/terms/type> "type ORG A" .
<https://test.com/id/agent/1221> <http://purl.org/dc/terms/type> "type ORG B" .

The SHACL engine will return a precise message indicating that there are 2 values instead of 1: "Property may only have 1 value, but found 2".

If you replace it by declaring a shacl:message on the shape, the message will be: "Maximally 1 values allowed for type".

The overwrite happens even if the shacl:message you add is in another language, one for which the engine provides no message of its own.
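To make the difference concrete, here is a minimal Python sketch of the behaviour described above. It is not the code of any real SHACL engine: the function `check_max_count` and its `custom_message` argument are hypothetical, and the generated wording is simply copied from the engine message quoted earlier.

```python
DC_TYPE = "http://purl.org/dc/terms/type"
AGENT = "https://test.com/id/agent/1221"

# The example data as (subject, predicate, object) tuples.
triples = [
    (AGENT, "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://xmlns.com/foaf/0.1/Agent"),
    (AGENT, "http://xmlns.com/foaf/0.1/name", "agent 1"),
    (AGENT, DC_TYPE, "type ORG A"),
    (AGENT, DC_TYPE, "type ORG B"),
]


def check_max_count(triples, focus, path, max_count, custom_message=None):
    """Hypothetical maxCount check: return a message, or None if valid."""
    values = {o for s, p, o in triples if s == focus and p == path}
    if len(values) <= max_count:
        return None
    if custom_message is not None:
        # A declared message replaces the generated one entirely,
        # so the number of values actually found is lost.
        return custom_message
    return f"Property may only have {max_count} value, but found {len(values)}"


engine_msg = check_max_count(triples, AGENT, DC_TYPE, 1)
custom_msg = check_max_count(
    triples, AGENT, DC_TYPE, 1,
    custom_message="Maximally 1 values allowed for type",
)
print(engine_msg)
print(custom_msg)
```

The second call shows the loss of detail: once a custom message is supplied, the count of 2 no longer appears anywhere in the result.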

The engine message is more detailed and pinpoints the error better, but we also want to support multilingual messages, so we added an extra property.

You can try it with the ITB testbed, which uses the reference SHACL library as its backbone.

Testbed Instance: https://www.itb.ec.europa.eu/shacl/any/upload.

HolgerKnublauch commented 2 months ago

There is nothing in the SHACL spec that prohibits or discourages using multiple languages for sh:message. It is up to the implementation to pick (or ignore) such values. In our servlets, we use the accepted language from the HTTP request to pick the most suitable language, but I believe that if, for example, the ontology only declares messages with "nl" as language and "nl" is not among the accepted languages, it may fall back to the default message. The assumption is that the default message in English is better than a custom message in Dutch unless the receiver is actually able to understand Dutch.
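As a rough illustration of that selection logic, the Python sketch below picks a custom message by accepted language and otherwise falls back to the default. It is an assumption about the general idea, not the actual servlet code: `pick_message` is a hypothetical helper, the Dutch text is made up for the example, and real language negotiation (quality values, subtags such as "nl-BE") is left out.

```python
def pick_message(custom_messages, accepted_languages, default_message):
    """Return the first custom message whose language the client accepts.

    custom_messages maps a language tag to a message text;
    accepted_languages is ordered from most to least preferred.
    """
    for lang in accepted_languages:
        if lang in custom_messages:
            return custom_messages[lang]
    # No custom message in an accepted language: use the default.
    return default_message


# Illustrative values only.
custom = {"nl": "Maximaal 1 waarde toegestaan voor type"}
default = "Property may only have 1 value, but found 2"

print(pick_message(custom, ["nl", "en"], default))  # Dutch custom message
print(pick_message(custom, ["en", "fr"], default))  # default message
```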

(In your examples, note that the preferred namespace prefix for SHACL is "sh" and not "shacl".)

bertvannuffelen commented 2 months ago

The issue is that the spec does not provide a way to get the parameters back from the engine in order to build correct and readable statements in any language, as illustrated.

In an EU multilingual context, providing error messages in the language of the data owner is important, hence this workaround.
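What a parameterised message could look like is sketched below in Python. Everything in it is hypothetical: the extra property itself is not shown in this thread, so the placeholder names (`name`, `max_count`, `count`), the template texts and the `render` helper are assumptions chosen only to illustrate the idea of combining a per-language template with values known to the engine.

```python
# Hypothetical per-language templates; placeholders are filled with
# values that only the engine knows at validation time.
TEMPLATES = {
    "en": "Property {name} may only have {max_count} value(s), but found {count}",
    "nl": "Eigenschap {name} mag hoogstens {max_count} waarde(n) hebben, "
          "maar er zijn er {count} gevonden",
}


def render(templates, lang, params, fallback_lang="en"):
    """Fill the template for lang, or the fallback language if missing."""
    template = templates.get(lang, templates[fallback_lang])
    return template.format(**params)


# Values for the example: shape name, declared limit, values found.
params = {"name": "type", "max_count": 1, "count": 2}

msg_en = render(TEMPLATES, "en", params)
msg_nl = render(TEMPLATES, "nl", params)
msg_fr = render(TEMPLATES, "fr", params)  # no French template: falls back
print(msg_en)
print(msg_nl)
```

With such templates the message stays as precise as the engine's own while still being available in the data owner's language.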

In addition, this highlights that the error message returned by the engine is often not meaningful to a data engineer. It needs the context of where the rule came from. In practice, error messages that connect the two are needed.

I see the following options:

HolgerKnublauch commented 2 months ago

Our process to record suggestions for SHACL 1.2 is to open issues at https://github.com/w3c/shacl/issues

This repo here is only for editorial changes to SHACL 1.0 and even those probably won't get done as the WG is closed.