Closed (jsheunis closed this issue 3 weeks ago)
A blocker for this is that LinkML's shaclgen has several shortcomings in terms of propagating ranges and annotations to SHACL shapes. See e.g. https://github.com/psychoinformatics-de/shacl-vue/issues/9 and https://github.com/linkml/linkml/issues/1618. I'm first focusing on patching shaclgen...
Current status:
- I was able to amend the code to let nodeKind and datatype information flow through to SHACL for slots with a custom type (or any type) as range.

The current challenge is how to interpret the types of the annotation tag and value. For example, if we have a custom type (this could also be a slot) with an annotation:
types:
  NameString:
    typeof: string
    uri: myschema:NameString
    pattern: "^[^\\n]$"
    description: ...
    annotations:
      dash:singleLine: true
Here the annotation tag is a CURIE and the value is xsd:boolean. But how would the shaclgen code know this? Annotations could be anything.
Other TODOs:
- Decide how to handle multiple pattern or annotations values for a single slot that originate both from the slot definition as well as the definition of the range type.

"I was able to amend the code to let nodeKind and datatype information flow through to SHACL for slots with a custom type (or any type) as range"

This has been turned into a PR: https://github.com/linkml/linkml/pull/2102
"Here the annotation tag is a CURIE and the value is xsd:boolean. But how would the shaclgen code know this? Annotations could be anything."
I added code to shaclgen to:
- add a command-line flag, --include-annotations, so that the user can specify whether they want annotations to be part of the generated SHACL shapes;
- check the type of each annotation value and handle, in addition to Python bool, several types used in LinkML:
from linkml_runtime.utils.yamlutils import (
    extended_float,
    extended_int,
    extended_str,
)
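The type-based handling described above can be sketched roughly as follows. This is a minimal illustration and not the actual shaclgen patch: the function name `annotation_value_to_term`, the tuple return shape, and the `ExtendedStr` stand-in are hypothetical, and it assumes that the linkml_runtime wrapper types subclass Python's built-in str, int and float. It uses only the standard library, whereas the real generator builds rdflib terms.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


class ExtendedStr(str):
    """Hypothetical stand-in for linkml_runtime's extended_str (assumed to subclass str)."""


def annotation_value_to_term(value, prefixes):
    """Map a parsed annotation tag or value to (kind, text, datatype).

    kind is "iri" for values that look like a CURIE with a known prefix,
    otherwise "literal" with an XSD datatype chosen from the Python type.
    """
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return ("literal", "true" if value else "false", XSD + "boolean")
    if isinstance(value, int):
        return ("literal", str(int(value)), XSD + "integer")
    if isinstance(value, float):
        # Assumption: floats are emitted as xsd:double.
        return ("literal", repr(float(value)), XSD + "double")
    if isinstance(value, str):
        prefix, sep, local = value.partition(":")
        # Only treat the string as a CURIE if its prefix is declared in the schema.
        if sep and prefix in prefixes:
            return ("iri", prefixes[prefix] + local, None)
        return ("literal", str(value), XSD + "string")
    raise TypeError(f"unsupported annotation value type: {type(value).__name__}")


PREFIXES = {
    "dash": "http://datashapes.org/dash#",
    "sh": "http://www.w3.org/ns/shacl#",
}

tag = annotation_value_to_term("dash:singleLine", PREFIXES)
flag = annotation_value_to_term(True, PREFIXES)
editor = annotation_value_to_term(ExtendedStr("dash:TextFieldEditor"), PREFIXES)
```

The design choice here is to trust the types that YAML parsing already produced, so a value such as true needs no extra declaration in the schema, while strings are only promoted to IRIs when their prefix is known.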
This works quite nicely. Example:
Schema:
id: https://example.org/test-schema
name: myschema
prefixes:
  dash: http://datashapes.org/dash#
  dlco: https://concepts.datalad.org/
  myschema: https://example.org/test-schema/
  sh: http://www.w3.org/ns/shacl#
default_prefix: myschema
imports: https://w3id.org/linkml/types
types:
  NameString:
    typeof: string
    uri: myschema:NameString
    pattern: "/^[a-z ,.'-]+$/i"
    annotations:
      dash:singleLine: true
      dash:editor: dash:TextFieldEditor
slots:
  my_attr:
    range: NameString
    annotations:
      dash:singleLine: false
      sh:group: dlco:NamePropertyGroup
      sh:order: 0
  mydate:
    range: date
    annotations:
      sh:group: dlco:NamePropertyGroup
      dash:editor: dash:DatePickerEditor
      sh:order: 1
  mydatetime:
    range: datetime
    annotations:
      dash:editor: dash:DateTimePickerEditor
      sh:group: dlco:NamePropertyGroup
      sh:order: 2
classes:
  MyClass:
    slots:
      - my_attr
      - mydate
      - mydatetime
Code to run:
gen-shacl --include-annotations myschema.yaml
Output:
@prefix dash: <http://datashapes.org/dash#> .
@prefix dlco: <https://concepts.datalad.org/> .
@prefix myschema: <https://example.org/test-schema/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

myschema:MyClass a sh:NodeShape ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:property [ dash:editor dash:TextFieldEditor ;
            dash:singleLine false,
                true ;
            sh:datatype myschema:NameString ;
            sh:group dlco:NamePropertyGroup ;
            sh:maxCount 1 ;
            sh:nodeKind sh:Literal ;
            sh:order 0 ;
            sh:path myschema:my_attr ;
            sh:pattern "/^[a-z ,.'-]+$/i" ],
        [ dash:editor dash:DateTimePickerEditor ;
            sh:datatype <xsd:dateTime> ;
            sh:group dlco:NamePropertyGroup ;
            sh:maxCount 1 ;
            sh:nodeKind sh:Literal ;
            sh:order 2 ;
            sh:path myschema:mydatetime ],
        [ dash:editor dash:DatePickerEditor ;
            sh:datatype <xsd:date> ;
            sh:group dlco:NamePropertyGroup ;
            sh:maxCount 1 ;
            sh:nodeKind sh:Literal ;
            sh:order 1 ;
            sh:path myschema:mydate ] ;
    sh:targetClass myschema:MyClass .
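The output above contains `dash:singleLine false, true` because my_attr carries the annotation itself (false) and also inherits one from its range type NameString (true). The following sketch shows three possible ways such a collision could be resolved. It is only an illustration of the options: `merge_annotations` and the policy names keep_both, slot_wins and type_wins are hypothetical and not part of shaclgen.

```python
def merge_annotations(type_annotations, slot_annotations, policy="keep_both"):
    """Combine annotations from a range type and a slot into tag -> list of values.

    policy controls what happens when both define the same tag:
      keep_both: emit both values (matches the output shown above)
      slot_wins: the slot's value replaces the type's value
      type_wins: the type's value replaces the slot's value
    """
    if policy not in {"keep_both", "slot_wins", "type_wins"}:
        raise ValueError(f"unknown policy: {policy}")

    merged = {}
    # Dict unpacking keeps insertion order: type tags first, then new slot tags.
    for tag in {**type_annotations, **slot_annotations}:
        in_type = tag in type_annotations
        in_slot = tag in slot_annotations
        values = []
        if in_type and not (policy == "slot_wins" and in_slot):
            values.append(type_annotations[tag])
        if in_slot and not (policy == "type_wins" and in_type):
            values.append(slot_annotations[tag])
        merged[tag] = values
    return merged


# Annotations taken from the example schema above.
name_string = {"dash:singleLine": True, "dash:editor": "dash:TextFieldEditor"}
my_attr = {"dash:singleLine": False, "sh:group": "dlco:NamePropertyGroup", "sh:order": 0}

both = merge_annotations(name_string, my_attr)
slot_first = merge_annotations(name_string, my_attr, policy="slot_wins")
```

Making the policy an explicit argument would map naturally onto a user-specified option, while keep_both reproduces the current behaviour.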
TODO:
- Decide how to handle duplicate annotations on a single slot (see dash:singleLine false, true ; in the output above). Should this be accepted? Or one always prioritised? Or user-specified?
- Figure out how to get sh:PropertyGroup nodes into the SHACL output, if they indeed would first somehow be specified as part of the LinkML schema. The thing that still confuses me here is that these property groups are actually all data items, and not separate schemas per se, since they have the same structure, e.g.:

  dlco:BasicPropertyGroup a sh:PropertyGroup ;
      rdfs:label "Basic" ;
      sh:order "0"^^xsd:decimal ;
      rdfs:comment "" .

  dlco:DataPropertyGroup a sh:PropertyGroup ;
      rdfs:label "Data" ;
      sh:order "1"^^xsd:decimal ;
      rdfs:comment "" .

- Noting the definitions of mappings for the case when we need those also as part of generated shapes: https://linkml.io/linkml-model/latest/docs/mappings/
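Since the property groups all share one structure, one way to think about them is as plain data records that get serialized next to the generated shapes. The sketch below illustrates that reading only. The `PropertyGroup` dataclass and `property_group_to_turtle` function are hypothetical, prefix declarations are omitted, and string escaping is delegated to `json.dumps` as a simplification rather than a full Turtle serializer.

```python
import json
from dataclasses import dataclass


@dataclass
class PropertyGroup:
    """One sh:PropertyGroup data item (hypothetical record layout)."""
    curie: str
    label: str
    order: int
    comment: str = ""


def property_group_to_turtle(group):
    """Render a single group in the same layout as the examples above."""
    label = json.dumps(group.label, ensure_ascii=False)
    comment = json.dumps(group.comment, ensure_ascii=False)
    return (
        f"{group.curie} a sh:PropertyGroup ;\n"
        f"    rdfs:label {label} ;\n"
        f'    sh:order "{group.order}"^^xsd:decimal ;\n'
        f"    rdfs:comment {comment} .\n"
    )


GROUPS = [
    PropertyGroup("dlco:BasicPropertyGroup", "Basic", 0),
    PropertyGroup("dlco:DataPropertyGroup", "Data", 1),
]

# Blank line between groups, as in the hand-written Turtle.
turtle = "\n".join(property_group_to_turtle(g) for g in GROUPS)
```

Whether such records should live in the LinkML schema itself or in a separate data file is exactly the open question raised above; the sketch does not settle it.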
Update:
[...] required property yet), but the important parts are there: sh:group and sh:order.

I think this pretty much provides all the pieces necessary to address this issue.
For reference:
"We could start with a schema that we have authored in LinkML, from datalad-concepts, and then remove unnecessary complexities (at first) and add all the constraint annotations and dash annotations that we feel would be necessary for form or viewer generation."