Open kcoyle opened 2 years ago
Will this section of the Cookbook provide guidance on sh:message for providing more context on sh:severity violations and warnings? Or is there somewhere else where this information is available? I've seen open related issues/comments, so maybe this hasn't been decided.
@sfolsom we might cover how to define severity and a message in extensions to DCTAP (e.g. as extra columns in the table) so that they can be encoded in SHACL; but they will be suggestions only, and nothing about converting to SHACL will be normative.
@sfolsom We are always looking for examples that folks can relate to. "Real" examples tend to be too complex, but if you can send a reduced example from your work that still "looks real", that would be appreciated. It doesn't have to be in code - a use case would be great. Thanks.
Thanks for the response. The idea of keeping SHACL-specific work non-normative makes sense.
To provide context for the question: I'm part of a PCC group thinking about interoperability of BIBFRAME data. We're starting to define shapes and thinking about how to produce meaningful validation reports that include more than pass/fail violations. There are a number of properties that we have designated as "required if applicable" where a sh:Violation or sh:Warning with a sh:resultMessage would be useful.
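For the "required if applicable" case, a minimal SHACL sketch might look like the following. The ex: namespace, the shape name, the choice of bf:provisionActivity, and the message wording are illustrative assumptions, not part of any agreed profile. In a validation report, the shape's sh:message is reported as sh:resultMessage and its sh:severity as sh:resultSeverity.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
@prefix ex: <http://example.org/shapes/> .   # hypothetical namespace

ex:InstanceShape
    a sh:NodeShape ;
    sh:targetClass bf:Instance ;
    sh:property [
        sh:path bf:provisionActivity ;       # illustrative property choice
        sh:minCount 1 ;
        sh:severity sh:Warning ;             # a warning instead of the default sh:Violation
        sh:message "Provision activity is required if applicable."@en ;
    ] .
```

In a DCTAP extension, the severity and the message text could each come from an extra column on the row for this property, as suggested earlier in the thread.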
I have another question about extending DCTAP to support SHACL validation. We have a use case where we want to define the target of a shape as entities that use a specific property where the object of that property is of a specific type. For example, we need two shapes: 1) a shape for bf:Electronic instances that are bf:instanceOf works typed as bf:Monograph, and 2) a shape for bf:Electronic instances that are bf:instanceOf works typed as bf:Serial.
I noticed https://github.com/philbarker/TAP2SHACL/blob/main/examples/SHACLPerson/shapes.csv as an implementation of an extension for SHACL, and I was wondering if the cookbook for SHACL might eventually include a type vocabulary for things like class, objectsOf, subjectsOf, SPARQLTarget. I snuck SPARQLTarget in there :) because I can't think of another way to define the two shapes for bf:Electronic without using sh:SPARQLTarget.
If there is an easier way to define these types of targets using a simpler DCTAP implementation/extension, I'd be grateful for the guidance.
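For reference, here is a sketch of how the three non-SPARQL target types named above look in SHACL Core. The ex: namespace and shape names are hypothetical. Each target selects focus nodes by a single criterion, which is why none of them alone can express "a bf:Electronic whose work is a bf:Serial".

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
@prefix ex: <http://example.org/shapes/> .   # hypothetical namespace

# class: every instance of bf:Electronic
ex:ElectronicShape a sh:NodeShape ;
    sh:targetClass bf:Electronic .

# subjectsOf: every node that has a bf:instanceOf property
ex:HasWorkShape a sh:NodeShape ;
    sh:targetSubjectsOf bf:instanceOf .

# objectsOf: every node that is the value of a bf:instanceOf property
ex:WorkShape a sh:NodeShape ;
    sh:targetObjectsOf bf:instanceOf .
```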
@sfolsom we always felt that there would be things that can be done in languages like SHACL and ShEx that would go beyond what could be covered in a simple tabular format. I don't think anyone wants DC TAP as a competitor to those standards, so we deliberately have avoided creating it as such.
TAP2SHACL already goes beyond what can be done with a TAP alone, for example with the node shape targets. Currently there are no plans to extend it to cover the SHACL Advanced Features.
Sorry, I wasn't clear. By "extending", I wasn't suggesting DCTAP would expand formally to include more complicated use cases. I was wondering if the cookbook for DCTAP to SHACL might make suggestions, or point to suggestions or implementations by others who have extended DCTAP beyond its scope. I'm struggling to find other initiatives attempting to implement DCTAP together with SHACL for validation of RDF.
@sfolsom Is there any chance you can share all or part of your DCTAP? At least a part where you run into this problem. I'm hoping that having more context will help me think about this ;-).
@kcoyle, here's a copy of the spreadsheet for serial electronic bf:Instances, where we have a target of bf:Electronic: https://docs.google.com/spreadsheets/d/13CU7B-RoLTIVgnZWt68PrEjcqVYfuNdvwBkw6xsb-zk/edit?gid=66557658#gid=66557658.
Here's one for monograph electronic bf:Instances that also uses bf:Electronic for targets: https://docs.google.com/spreadsheets/d/19R-ZbA0as-EPWKnGvP3dkncN7yX5n6_Q18-meLGS_tQ/edit?gid=66557658#gid=66557658.
Using classes as targets works when we want the same validation tests every time we would come across a given class, but we expect some different properties when the bf:Electronic is for a monograph vs. serial. I realize this is beyond what DCTAP is set up to handle, but I thought I'd raise it here since you all have considered the boundaries and potential mappings between DCTAP and SHACL.
I may be wrong, but the only way I could find in SHACL to define conditional targets like this was to use sh:SPARQLTarget and have queries like...
Serial Electronic Target

    big:Serial:Instance:Electronic
        a sh:NodeShape ;
        sh:target [
            a sh:SPARQLTarget ;
            sh:prefixes bf: ;
            # SHACL-AF expects the SELECT to return the focus nodes as ?this
            sh:select """
                SELECT ?this
                WHERE {
                    ?this a bf:Electronic .
                    ?this bf:instanceOf ?Serial .
                    ?Serial a bf:Serial .
                }
                """ ;
        ] ;

Monograph Electronic Target

    big:Monograph:Instance:Electronic
        a sh:NodeShape ;
        sh:target [
            a sh:SPARQLTarget ;
            sh:prefixes bf: ;
            # SHACL-AF expects the SELECT to return the focus nodes as ?this
            sh:select """
                SELECT ?this
                WHERE {
                    ?this a bf:Electronic .
                    ?this bf:instanceOf ?Monograph .
                    ?Monograph a bf:Monograph .
                }
                """ ;
        ] ;

It would be odd to store SPARQL Queries in the DCTAP, but maybe that's what we have to resort to.
@sfolsom I thought I answered this but it got lost somehow.
We did talk about storing code in DCTAP cells, but there is a very good chance it would get mangled in the CSV format. If you find a way to make that work I see no reason why not. We also discussed that one could create a file of the needed code and store it with an IRI, then place that IRI in the DCTAP, adding a column for it. This gets around the CSV problems, but adds a level of complexity for processing. (Note: reusing columns for new functions probably is a road to confusion. Creating columns for specific types of data is a better idea.)
I assume that you are "limited" to actual BIBFRAME in your instance data, but if the bf:Electronic validation rule is different for Monographs and Serials, doesn't that imply that subclasses of bf:Electronic for MonographElectronic and SerialElectronic are needed? Would they resolve this problem?
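To make the subclass idea concrete, here is a sketch under the assumption that such classes could be minted. The ex: namespace and both subclasses are hypothetical and do not exist in BIBFRAME.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix bf:   <http://id.loc.gov/ontologies/bibframe/> .
@prefix ex:   <http://example.org/ns/> .   # hypothetical namespace

# Hypothetical subclasses; neither exists in BIBFRAME.
ex:SerialElectronic    rdfs:subClassOf bf:Electronic .
ex:MonographElectronic rdfs:subClassOf bf:Electronic .

# With distinct classes, plain class targets are enough.
ex:SerialElectronicShape a sh:NodeShape ;
    sh:targetClass ex:SerialElectronic .

ex:MonographElectronicShape a sh:NodeShape ;
    sh:targetClass ex:MonographElectronic .
```

This only works if the instance data is actually typed with the new classes, which is the constraint raised in the question.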
@kcoyle, thanks for this history and insight. I had a similar train of thought about the classes... new classes or if electronic-ness and print-ness should/could be traits of classes like bf:SerialInstance and bf:MonographInstance. As you anticipated though, we're limited to existing BF classes.
We're taking inspiration from Phil's "target" column extension, and might have to have a column that corresponds to sh:SPARQLTarget, and as you said, figure out how to get from the DCTAP to where we're storing the actual SPARQL.
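One possible way to keep the SPARQL out of the table, offered only as a sketch: SHACL Advanced Features also defines sh:SPARQLTargetType, which lets a query be written once with a parameter. The ex: names below are hypothetical, and this assumes a validator that supports SHACL-AF targets.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix bf:   <http://id.loc.gov/ontologies/bibframe/> .
@prefix ex:   <http://example.org/shapes/> .   # hypothetical namespace

# Prefix declaration that "sh:prefixes bf:" relies on
bf: sh:declare [
    sh:prefix "bf" ;
    sh:namespace "http://id.loc.gov/ontologies/bibframe/"^^xsd:anyURI ;
] .

# Hypothetical reusable target type: bf:Electronic instances of a given work type
ex:ElectronicByWorkTypeTarget
    a sh:SPARQLTargetType ;
    rdfs:subClassOf sh:Target ;
    sh:parameter [
        sh:path ex:workType ;
        sh:nodeKind sh:IRI ;
    ] ;
    sh:prefixes bf: ;
    sh:select """
        SELECT ?this
        WHERE {
            ?this a bf:Electronic .
            ?this bf:instanceOf ?work .
            ?work a $workType .
        }
        """ .

# A shape then names only the target type and the work class
ex:SerialElectronicShape
    a sh:NodeShape ;
    sh:target [
        a ex:ElectronicByWorkTypeTarget ;
        ex:workType bf:Serial ;
    ] .
```

Under this approach, the DCTAP column would hold just a target type IRI and a class such as bf:Serial or bf:Monograph, while the query itself lives in a separate shapes file.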
The Cookbook needs a section filled in on converting a DCTAP to SHACL.