The issue https://github.com/psychoinformatics-de/shacl-vue/issues/32 in shacl-vue brought to light that data converted from YAML to TTL format using the current state of the sdd schema (which inherits from distribution, thing, and more in dlco) does not contain the expected type designations.
Demonstrative example
With linkml 1.8.1:
The thing schema shows for meta_type and type:
https://github.com/psychoinformatics-de/datalad-concepts/blob/b2fcae84bd2b6062701fac9b95946f60c8bd4365/src/thing/unreleased.yaml#L192-L209
and the input data shows the following for one of the authors (note that there is no type specified):
https://github.com/psychoinformatics-de/datalad-concepts/blob/b2fcae84bd2b6062701fac9b95946f60c8bd4365/src/sdd/unreleased/examples/Distribution-penguins.yaml#L156-L168
We can then convert the YAML to TTL using:
The output for the same author after running linkml-convert is:
Note that the output does not contain the expected
<https://example.org/ns/dataset/#ahorst> a "dldist:Person"^^xsd:anyURI ;
which is the problem.
Discussion
The dlthing:meta_type slot was implemented to allow validation of a data object against a specialized schema (indicated by its meta_type) when the range of the property accepting that object is a superclass of the specialized class. (I couldn't find a more intuitive way of stating this....)
For example, let's say a Distribution has a was_attributed_to field (aka property) with range/type dlco:Agent, while dlco:Agent has multiple subclasses such as dlco:Person or dlco:Organization. This means the data object passed to was_attributed_to can be a dlco:Person or dlco:Organization and it should pass LinkML validation, as long as the class is specified in the meta_type field of the data object and is actually a subclass of the accepting slot's range class.
However, the dlthing:meta_type specification does not really have meaning outside of the process of LinkML-based data validation. E.g. when data is exported to TTL and then used by shacl-vue, such an application is interested in the nodes and their types, such as the currently missing:
<https://example.org/ns/dataset/#ahorst> a "dldist:Person"^^xsd:anyURI ;
It is only when data generated/updated by an application such as shacl-vue needs to be validated in LinkML against the dlco-based schemas that the meta_type becomes important again.
After discussions with @mih, several points were raised:
try setting rdf:type as the slot_uri of dlthing:meta_type (instead of the current dlthing:meta_type)
from the context of importing data to LinkML (for validation) and exporting from LinkML (for use in the world) rdf:type and dlthing:meta_type are essentially two-way aliases:
on export, rdf:type should carry the value of dlthing:meta_type, since rdf:type is the meaningful type designator in the real world
on import, dlthing:meta_type should carry the value of rdf:type, since the specific type (not a superclass) is what validation should be based on
could we ditch dlthing:meta_type, and only use type (with slot_uri: rdf:type and designates_type: true)? Probably not, because LinkML slots with designates_type: true apparently can only accept LinkML classes/types as the range (i.e. a class/type defined in the validation schema)? @mih please correct me here if I'm misrepresenting.
Investigating rdf:type as slot_uri of dlthing:meta_type
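I tried this by updating the line https://github.com/psychoinformatics-de/datalad-concepts/blob/b2fcae84bd2b6062701fac9b95946f60c8bd4365/src/thing/unreleased.yaml#L193 as follows:
diff --git a/src/thing/unreleased.yaml b/src/thing/unreleased.yaml
index 486fbb2..d010834 100644
--- a/src/thing/unreleased.yaml
+++ b/src/thing/unreleased.yaml
@@ -190,7 +190,7 @@ slots:
range: string
meta_type:
- slot_uri: dlthing:meta_type
+ slot_uri: rdf:type
designates_type: true
description: >-
Type designator of a metadata object for validation and schema structure
while changing nothing in the data (i.e. the data object still specifies the meta_type field, and not the type field), and then running the conversion code again.
This was the output: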
The difference compared to the initial output:
rdf:type (i.e. a) is now included!
dlthing:meta_type "dldist:Person"^^xsd:anyURI ; is not in the output anymore (this is in fact removed completely from all output data)
I also ran checks and validations locally after the change, with no unexpected errors.
Is this what we want?
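To help weigh that question, here is a minimal sketch in plain Python of the two-way aliasing described in the discussion: on export the meta_type value surfaces as rdf:type, and on import rdf:type is read back into meta_type and checked against the range of the accepting slot. This is only an illustration under stated assumptions, not how LinkML implements it. The SUBCLASS_OF table and the functions is_subclass, export_record and import_record are hypothetical names, and records are plain dicts rather than LinkML objects.

```python
# Toy class hierarchy (hypothetical), mirroring the dlco:Agent example above.
SUBCLASS_OF = {
    "dlco:Person": "dlco:Agent",
    "dlco:Organization": "dlco:Agent",
    "dlco:Agent": None,
}


def is_subclass(cls, ancestor):
    """Return True if cls equals ancestor or inherits from it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def export_record(record):
    """On export, surface the meta_type value under the rdf:type key."""
    out = {k: v for k, v in record.items() if k != "meta_type"}
    if "meta_type" in record:
        out["rdf:type"] = record["meta_type"]
    return out


def import_record(record, slot_range):
    """On import, move rdf:type back to meta_type after a range check."""
    declared = record.get("rdf:type")
    if declared is None:
        raise ValueError("record has no rdf:type to validate against")
    if not is_subclass(declared, slot_range):
        raise ValueError(f"{declared} is not a subclass of {slot_range}")
    out = {k: v for k, v in record.items() if k != "rdf:type"}
    out["meta_type"] = declared
    return out


# Round trip for an author object accepted by a slot with range dlco:Agent.
author = {
    "id": "https://example.org/ns/dataset/#ahorst",
    "meta_type": "dlco:Person",
}
exported = export_record(author)
roundtrip = import_record(exported, slot_range="dlco:Agent")
```

In this sketch the exported record no longer has a meta_type key, which corresponds to the observation above that dlthing:meta_type disappears from the TTL output once its slot_uri is rdf:type.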