Open tpluscode opened 1 year ago
Shapes derived from the mapping don't necessarily describe the output graph of the pipeline; often there are post-processing steps after the mapping.
Nevertheless, there are likely cases for which shapes derived from the mapping are useful (maybe also for troubleshooting pipelines or the mapping itself by validating intermediate results).
Some things to consider, if shapes are derived from the mapping (in general, not related to the proposal in PR https://github.com/zazuko/rdf-mapping-dsl/pull/126 ... more of a "notes-to-self"):
types
would result in a shape targeting multiple classes
sh:closed
individually)
(Unrelated to this feature request, but related to the last point of the above list) Decoupling the mapping from the schema by pointing from the mapping to shape elements, rather than schema elements, could be an option to facilitate handling schema changes (shape-first, shape-as-contract).
My plan is to make xrm more hackable, in order to unlock possibilities for toolchain improvements outside of the xrm editor itself, like #127 and #128.
For one-time scaffolding, introspecting the shapes from the output graph of the pipeline might be an alternative.
Here's a query to illustrate this, based on the CONSTRUCT query that SPEX runs in "introspection" mode. I used this in a customer project.
Note: The query depends on spif: functions, which GraphDB has built in. They need to be replaced to run the query on other stores.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX mobi: <https://schema.mobicorp.ch/>
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX schema: <http://schema.org/>
PREFIX spif: <http://spinrdf.org/spif#>
CONSTRUCT {
  ?nodeShape a sh:NodeShape .
  ?nodeShape sh:targetClass ?cls .
  ?nodeShape sh:property ?propertyShape .
  ?propertyShape a sh:PropertyShape .
  ?propertyShape sh:path ?property .
  ?propertyShape sh:class ?linktype .
  ?propertyShape sh:datatype ?datatype .
} WHERE {
  VALUES ?cls {
    # mobi:Table
    # mobi:Column
    mobi:Mitarbeiter
    mobi:Organisationseinheit
  }
  ?subject a ?cls .
  ?subject ?property ?object .
  OPTIONAL {
    ?object a ?linktype .
  }
  MINUS {
    # --- blacklist ---
    VALUES ?cls {
      rdf:Property
      owl:TransitiveProperty
      owl:SymmetricProperty
      rdf:List
      rdfs:Class
      rdfs:Datatype
      rdfs:ContainerMembershipProperty
      # -------------
      mobi:ArchitektursichtElement
      mobi:OrganisationsElement
      mobi:ProzessElement
      mobi:FunktionsElement
      mobi:IntegrationsElement
      mobi:InformationsElement
      # -------------
      mobi:Informationsobjekt
      mobi:Informationsobjektbeziehung
      mobi:Informationsattribut
      mobi:Rollenbesetzung
      # -------------
      mobi:edc\/UiView
      mobi:edc\/Link
      sh:PropertyShape
      skos:ConceptScheme
      skos:Concept
    }
    ?subject a ?cls .
  }
  BIND(DATATYPE(?object) AS ?datatype)
  BIND(spif:buildURI("<urn:NodeShape:{?1}>", spif:encodeURL(str(?cls))) AS ?nodeShape)
  BIND(spif:buildURI("<urn:PropertyShape:{?1}/{?2}>", spif:encodeURL(str(?cls)), spif:encodeURL(str(?property))) AS ?propertyShape)
}
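For stores without the spif: functions, the shape IRIs could probably be minted with standard SPARQL 1.1 functions instead. The following is only a sketch of that substitution, not a tested drop-in replacement: it assumes that IRI, CONCAT and ENCODE_FOR_URI are an acceptable stand-in for spif:buildURI and spif:encodeURL (the exact percent-encoding may differ), and it leaves out the MINUS blacklist for brevity.

```sparql
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX mobi: <https://schema.mobicorp.ch/>

CONSTRUCT {
  ?nodeShape a sh:NodeShape ;
    sh:targetClass ?cls ;
    sh:property ?propertyShape .
  ?propertyShape a sh:PropertyShape ;
    sh:path ?property ;
    sh:class ?linktype ;
    sh:datatype ?datatype .
} WHERE {
  VALUES ?cls { mobi:Mitarbeiter mobi:Organisationseinheit }
  ?subject a ?cls ;
    ?property ?object .
  OPTIONAL { ?object a ?linktype . }
  # DATATYPE() raises an error for IRIs, which leaves ?datatype unbound,
  # so no sh:datatype triple is emitted for object properties.
  BIND(DATATYPE(?object) AS ?datatype)
  # Assumption: standard functions in place of spif:buildURI / spif:encodeURL.
  BIND(IRI(CONCAT("urn:NodeShape:", ENCODE_FOR_URI(STR(?cls)))) AS ?nodeShape)
  BIND(IRI(CONCAT("urn:PropertyShape:", ENCODE_FOR_URI(STR(?cls)), "/",
                  ENCODE_FOR_URI(STR(?property)))) AS ?propertyShape)
}
```

Because the shape IRIs are derived only from the class and property IRIs, repeated runs should yield the same shape nodes, which keeps the scaffolded output stable.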
> Shapes derived from the mapping don't necessarily describe the output graph of the pipeline; often there are post-processing steps after the mapping.

Yes, I realised that too while thinking about my proposal. In museumplus it is just like that. The XRM is only a temporary representation and has nothing in common with the final representation.
Maybe I did not mention that precisely, but my idea was that shapes defined in XRM could also be unrelated to the mapping itself.
-node-shape PersonNodeShape from PersonMapping {
+node-shape PersonNodeShape {
}
That way one could take advantage of a simpler syntax, although that would be slightly incomplete without nice support for vocabularies (re #14).

> My plan is to make xrm more hackable

I cannot really comment on that, but I'm intrigued about how hackability helps. Let's discuss that.
See also https://github.com/RMLio/RML2SHACL
Paper: RML2SHACL: RDF Generation Is Shaping Up https://lirias.kuleuven.be/retrieve/641696
CC @BenjaminHofstetter
I would like to propose a new feature where minimal SHACL shapes are generated from the mappings. The purpose is to generate a starting point for defining more specific constraints over the output data. For example, given the mapping shown in the language reference, one would be able to produce a shape with minimal constraints.
It's important that property shapes are named nodes, so that they can be extended by adding properties in a separate document and merging them. Multiple mappings for the same predicate might require sh:or or a different node kind such as sh:IRIOrLiteral.
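To picture why named property shapes matter, here is a rough sketch of extending generated shapes from a separate source, written as a SPARQL update against the graph that holds them. The ex: namespace and the three shape names are hypothetical; the actual IRIs would depend on how the generator names its property shapes.

```sparql
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ex: <http://example.org/shapes/>   # hypothetical namespace for generated shapes

INSERT DATA {
  # Hand-written constraints attach to a generated shape through its IRI.
  # With blank-node property shapes there would be nothing to refer to.
  ex:PersonNameShape sh:minCount 1 ;
    sh:maxCount 1 .

  # A predicate fed by several mappings (IRIs in one, literals in another)
  # could be covered by a combined node kind ...
  ex:PersonEmployerShape sh:nodeKind sh:IRIOrLiteral .

  # ... or by sh:or alternatives where more precision is wanted.
  ex:PersonBirthShape sh:or (
    [ sh:datatype xsd:date ]
    [ sh:datatype xsd:gYear ]
  ) .
}
```

The same triples could equally live in a separate Turtle file that is merged with the generated shapes before validation; the update form is used here only to stay in the query language already used in this thread.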
To implement this feature, I would propose to slightly adapt (and also simplify) the feature proposed in #115. I will create a draft PR to illustrate.