NoelDeMartin / umai

Offline-first Recipes Manager
https://umai.noeldemartin.com
GNU General Public License v3.0

Shape tree for Umai #22

Open josephguillaume opened 2 months ago

josephguillaume commented 2 months ago

I'm opening this issue to discuss the definition of a shape tree for Umai.

This is related to https://github.com/NoelDeMartin/umai/issues/19 and the use cases defined there, but at this stage the purpose of this issue is just to use the Umai data model as a test case for Solid interoperability.

As a point of reference for discussion, the Soukai data model is at https://github.com/NoelDeMartin/umai/tree/main/src/models and the crdt ontology used by Umai is at https://github.com/NoelDeMartin/vocab/blob/main/resources/ontologies/crdt.ttl

josephguillaume commented 2 months ago

As a starting point, here's a first attempt.

@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <https://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix crdt: <https://vocab.noeldemartin.com/crdt/>.
@prefix st: <http://www.w3.org/ns/shapetrees#>.
@prefix purl: <http://purl.org/dc/terms/>.
@prefix : <#>.

:UmaiCookbookShape a st:ShapeTree;
st:expectsType st:Container ;
st:contains :UmaiResourceShape.

:UmaiResourceShape a st:ShapeTree;
st:expectsType st:Resource;
#FIXME what we need to express is that the resource uses these four shapes
st:shape :RecipeShape, :HowToStepShape, :CrdtShapeMetadata, :CrdtOperationShape.

:RecipeShape a sh:NodeShape;
sh:targetClass schema:Recipe;
rdfs:label "Shape for all schema:Recipe";
sh:property [
  rdfs:label "name is mandatory and is a single string";
  sh:path schema:name;
  sh:datatype xsd:string;
  sh:minCount 1;
  sh:maxCount 1
],
[
  rdfs:label "any description is a unique string";
  sh:path schema:description;
  sh:datatype xsd:string;
  sh:maxCount 1
],
[
  rdfs:label "images are IRIs";
  sh:path schema:image;
  sh:nodeKind sh:IRI
],
[
  rdfs:label "any recipeYield is a unique string";
  sh:path schema:recipeYield;
  sh:datatype xsd:string;
  sh:maxCount 1
],
[
  rdfs:label "any prepTime is a unique string";
  sh:path schema:prepTime;
  sh:datatype xsd:string;
  sh:maxCount 1
],
[
  rdfs:label "any cookTime should be a unique string";
  sh:path schema:cookTime;
  sh:datatype xsd:string;
  sh:maxCount 1
],
[
  rdfs:label "recipeIngredients are strings and not ordered";
  sh:path schema:recipeIngredient;
  sh:datatype xsd:string
],
[
  rdfs:label "any recipe instructions are a HowToStep";
  sh:path schema:recipeInstructions;
  sh:nodeKind sh:IRI;
  sh:class schema:HowToStep;
  sh:node :HowToStepShape
],
[
  rdfs:label "any lists that the recipe isReferencedBy are RecipesLists ";
  sh:path purl:isReferencedBy;
  sh:node :RecipesListShape
],
[
  rdfs:label "sameAs external links are IRIs";
  sh:path schema:sameAs;
  sh:nodeKind sh:IRI
].

:HowToStepShape a sh:NodeShape;
sh:targetClass schema:HowToStep;
rdfs:label "Shape for all schema:HowToStep";
sh:property
[
  rdfs:label "text is mandatory and a unique string";
  sh:path schema:text;
  sh:datatype xsd:string;
  sh:minCount 1;
  sh:maxCount 1
],
[
  rdfs:label "position is mandatory and a unique integer";
  sh:path schema:position;
  sh:datatype xsd:integer;
  sh:minCount 1;
  sh:maxCount 1
].

:RecipesListShape a sh:NodeShape;
rdfs:label "Shape for a RecipesList";
sh:class schema:ItemList;
sh:property [
  rdfs:label "any name is a unique string";
  sh:path schema:name;
  sh:datatype xsd:string;
  sh:maxCount 1
 ],
[
  rdfs:label "any description is a unique string";
  sh:path schema:description;
  sh:datatype xsd:string;
  sh:maxCount 1
],
[
  rdfs:label "any creator is a IRI";
  sh:path purl:creator;
  sh:nodeKind sh:IRI
],
[
  rdfs:label "any itemListElements are recipe list items";
  sh:path schema:itemListElement;
  sh:node :RecipesListItemShape
].

:RecipesListItemShape a sh:NodeShape;
rdfs:label "Shape for a Recipes List Item";
sh:class schema:ListItem;
sh:property [
  rdfs:label "item is mandatory and is a single Recipe";
  sh:path schema:item;
  sh:node :RecipeShape;
  sh:minCount 1;
  sh:maxCount 1
].

# Metadata shape also applies to Tombstone
crdt:Tombstone rdfs:subClassOf crdt:Metadata.

:CrdtShapeMetadata a sh:NodeShape;
sh:targetClass crdt:Metadata;
rdfs:label "Shape for all crdt:Metadata";
sh:property 
[
  rdfs:label "resource is mandatory and is a single IRI";
  sh:path crdt:resource;
  # TODO should actually be an IRI defined in the same data graph
  sh:nodeKind sh:IRI;
  sh:minCount 1;
  sh:maxCount 1
],
[
  rdfs:label "any createdAt is a unique dateTime";
  sh:path crdt:createdAt;
  sh:datatype xsd:dateTime;
  sh:maxCount 1
],
[
  rdfs:label "any updatedAt is a unique dateTime";
  sh:path crdt:updatedAt;
  sh:datatype xsd:dateTime;
  sh:maxCount 1
],
[
  rdfs:label "any deletedAt is a unique dateTime";
  sh:path crdt:deletedAt;
  sh:datatype xsd:dateTime;
  sh:maxCount 1
].

# TODO: crdt:property and crdt:value could be constrained based on the properties of  :RecipeShape and :HowToStepShape (while recognising that those shapes are not closed)
# The shapes below will be used for all these subclasses
crdt:DeleteOperation rdfs:subClassOf crdt:Operation.
crdt:AddPropertyOperation rdfs:subClassOf crdt:PropertyOperationWithValue.
crdt:RemovePropertyOperation rdfs:subClassOf crdt:PropertyOperationWithValue.
crdt:SetPropertyOperation rdfs:subClassOf crdt:PropertyOperationWithValue.
crdt:UnsetPropertyOperation rdfs:subClassOf crdt:PropertyOperation.

:CrdtOperationShape a sh:NodeShape;
rdfs:label "Shape for all crdt:Operation";
sh:targetClass crdt:Operation;
sh:property [
  rdfs:label "date is mandatory and is a unique dateTime";
  sh:path crdt:date;
  sh:datatype xsd:dateTime;
  sh:minCount 1;
  sh:maxCount 1
],
[
  rdfs:label "resource is mandatory and is a unique IRI";
  sh:path crdt:resource;
  # TODO should actually be an IRI defined in the same data graph
  sh:nodeKind sh:IRI;
  sh:minCount 1;
  sh:maxCount 1
].

:CrdtPropertyOperationShape a sh:NodeShape;
rdfs:label "Shape for all crdt:PropertyOperation";
sh:targetClass crdt:PropertyOperation;
sh:node :CrdtOperationShape;
sh:property [
  rdfs:label "property is mandatory and a single IRI";
  sh:path crdt:property;
  sh:nodeKind sh:IRI;
  sh:minCount 1;
  sh:maxCount 1
].

:CrdtPropertyOperationWithValueShape a sh:NodeShape;
rdfs:label "Shape for all crdt:PropertyOperation with value";
sh:targetClass crdt:PropertyOperationWithValue;
sh:node :CrdtPropertyOperationShape;
sh:property [
  rdfs:label "value is mandatory and anything except a blank node";
  sh:path crdt:value;
  # Blank nodes not currently supported by Soukai
  # Not yet clear whether they would be needed here
  sh:nodeKind sh:IRIOrLiteral;
  sh:minCount 1
].
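
For illustration, a document along the following lines is the kind of data the shapes above are meant to describe. This is a hand-written sketch, not output from Umai: the fragment identifiers (#it, #it-metadata, and so on), the dates, and the values are all assumptions.

```turtle
@prefix schema: <https://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix crdt: <https://vocab.noeldemartin.com/crdt/> .
@prefix : <#> .

# Hypothetical recipe; targeted by :RecipeShape via schema:Recipe.
:it a schema:Recipe ;
  schema:name "Ramen" ;
  schema:recipeYield "2" ;
  schema:recipeIngredient "Noodles", "Broth" ;
  schema:recipeInstructions :step-1 .

# Instruction step in the same document; targeted by :HowToStepShape.
:step-1 a schema:HowToStep ;
  schema:text "Boil the noodles." ;
  schema:position 1 .

# CRDT metadata pointing back at the recipe.
:it-metadata a crdt:Metadata ;
  crdt:resource :it ;
  crdt:createdAt "2023-01-01T10:00:00Z"^^xsd:dateTime ;
  crdt:updatedAt "2023-01-02T09:30:00Z"^^xsd:dateTime .

# One recorded change; the operation points at the recipe, not the other way round.
:it-operation-1 a crdt:SetPropertyOperation ;
  crdt:date "2023-01-02T09:30:00Z"^^xsd:dateTime ;
  crdt:resource :it ;
  crdt:property schema:name ;
  crdt:value "Ramen" .
```

Whether the operation is picked up by the class-based targets depends on the validator seeing the rdfs:subClassOf triples that link crdt:SetPropertyOperation up to crdt:Operation, so that part may need checking with a concrete SHACL engine.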


josephguillaume commented 2 months ago

In my opinion, the shapes/shape tree should also be versioned to support schema migration over time. It might be sufficient to have the shape tree defined in a document with a version number in its name?

NoelDeMartin commented 2 months ago

Hey, thanks for working on this, I'm very excited to see how we keep improving the interoperability story in Solid :).

Overall it looks fine, although I've never worked with shapes so I may be missing something important. Here's some initial feedback on what we have so far.

I'm pretty sure I could generate this file from the Soukai model definitions, it doesn't seem too complicated. That could also be used to hook into the build script, and publish these files alongside the application assets. That way, they would be versioned automatically, because each release would already include the updated shapes. I wonder if the file itself should contain some versioning information, something like :RecipeShape sh:version "0.5.1". The only problem is that I don't usually keep old versions of the app online... Maybe I could host them on something like 0.5.1.umai.noeldemartin.com? I played with the idea in the past, but I'm not sure I like it because it complicates the deployment a bit (I'm using GitHub Pages at the moment, and it doesn't support that kind of thing).
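
As far as I can tell, the SHACL vocabulary does not define a sh:version term, so the triple above would need a property from somewhere else. One possible sketch, assuming the shapes document is allowed to describe itself as an owl:Ontology, is to attach owl:versionInfo to the document; the version string is just the example number from this comment.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix schema: <https://schema.org/> .
@prefix : <#> .

# Assumption: the version is attached to the shapes document as a whole,
# not to each individual shape.
<> a owl:Ontology ;
  owl:versionInfo "0.5.1" .

# Shapes are declared as before; only the header above is new.
:RecipeShape a sh:NodeShape ;
  sh:targetClass schema:Recipe .
```

This keeps the version inside the file, which works whether or not old releases stay online; it does not by itself solve the question of where older versions would be hosted.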

Looking at the shapes, I found a couple of things to change (mentioned below), but the only thing I'm not sure about is the sh:closed true; for operations and metadata. Is there any advantage in doing that? Why not let anyone augment those things as well? I have many ideas about improvements in the future, such as adding metadata of the device where the change happened, or the author in case of collaborative apps, etc. Does that mean that in data created with sh:closed it wouldn't be possible to add more information in the future?
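
To make the effect of sh:closed concrete, here is a small sketch. The shape name :ClosedCrdtOperationShape and the property ex:device are hypothetical; ex:device stands in for the kind of future addition mentioned above.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix crdt: <https://vocab.noeldemartin.com/crdt/> .
@prefix ex: <https://example.org/hypothetical#> .
@prefix : <#> .

# Shapes graph: a closed variant of the operation shape.
:ClosedCrdtOperationShape a sh:NodeShape ;
  sh:targetClass crdt:Operation ;
  sh:closed true ;
  # rdf:type has no property shape here, so it has to be ignored explicitly.
  sh:ignoredProperties ( rdf:type ) ;
  sh:property
    [ sh:path crdt:date ; sh:datatype xsd:dateTime ; sh:minCount 1 ; sh:maxCount 1 ] ,
    [ sh:path crdt:resource ; sh:nodeKind sh:IRI ; sh:minCount 1 ; sh:maxCount 1 ] .

# Data graph: ex:device is a made-up property for a future extension.
:it-operation-1 a crdt:Operation ;
  crdt:date "2023-01-02T09:30:00Z"^^xsd:dateTime ;
  crdt:resource :it ;
  ex:device "phone" .
```

With sh:closed true, the ex:device triple is reported as a violation because no property shape or ignored property covers it. Without sh:closed, the same node conforms. The restriction applies to validation against this shape, so existing data is not locked; it would simply stop conforming once extra properties are added.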

These are some things that aren't right in the shape:

josephguillaume commented 2 months ago

All sounds good.

I'd included sh:closed true for operations and metadata with the idea that it might be useful to maintain integrity of the data, given that history is in principle immutable. Your potential future additions make a lot of sense. I agree sh:closed might be counterproductive.

The bit I'm least sure about is

:UmaiResourceShape a st:ShapeTree;
st:expectsType st:Resource;
st:shape :RecipeShape, :HowToStepShape, :CrdtShapeMetadata, :CrdtOperationShape.

The intent is to express that Umai expects a single document that should validate against all four shapes, but I'm not sure this is how it would be interpreted by the shape tree spec and others that depend on it. I'm hoping that having a concrete example of Umai's data structure supports further discussion of this.

In principle the user might also want the crdt data to be stored in a separate document and I'm also not sure how this could be expressed. Umai (and other apps) are of course free to mandate that the data should be in the same document.

NoelDeMartin commented 2 months ago

> The intent is to express that Umai expects a single document that should validate against all four shapes

With this do you mean that it has to contain all shapes? Or that it can?

I'm asking because the initial version of a recipe doesn't have any operations; they're only added after the first edit (although that may be a pointless optimization, so I don't care about it too much). But more importantly, recipe instructions (HowToStep) are optional, so there can be valid recipes without them.
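
As a sketch of the "can, not must" reading: the document below has no operations and no instruction steps. The identifiers and values are made up. Under plain SHACL validation with the sh:targetClass declarations from the draft, a shape whose target class has no instances in the data simply has nothing to check, so the absence of HowToStep and Operation nodes produces no violations.

```turtle
@prefix schema: <https://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix crdt: <https://vocab.noeldemartin.com/crdt/> .
@prefix : <#> .

# Freshly created recipe: never edited, no instructions yet.
:it a schema:Recipe ;
  schema:name "Toast" ;
  schema:recipeIngredient "Bread" .

# Metadata exists from the start; operations would only appear after the first edit.
:it-metadata a crdt:Metadata ;
  crdt:resource :it ;
  crdt:createdAt "2023-01-01T10:00:00Z"^^xsd:dateTime .
```

How st:shape with four shapes would treat such a document is still the open question; this only shows what the SHACL side does on its own.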

> I'm hoping that having a concrete example of Umai's data structure supports further discussion of this.

Do you need more examples? Let me know what's missing and I can provide some more.

> In principle the user might also want the crdt data to be stored in a separate document and I'm also not sure how this could be expressed. Umai (and other apps) are of course free to mandate that the data should be in the same document.

Yeah, that's a complicated topic. Soukai (the library that powers Umai) already supports loading data from different documents, but in the case of operations that's very complicated because it's the operations that point to the resource. So it wouldn't know which documents to fetch. In the case where related data is referenced in the recipe (for example, the recipe instructions), it should be possible to load them as well. But I don't think Umai is taking advantage of that feature at the moment.

Edit: I just realized we could just create a new property as the inverse for crdt:resource, but I'm not too inclined to look into that because it complicates things a lot and I don't see any advantage in practice.
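
On the shape side only, SHACL can already follow crdt:resource backwards with sh:inversePath, so validation would not need a new property. The sketch below is an assumption-laden example: the shape name :RecipeHistoryShape is hypothetical, and so is the rule that exactly one crdt:Metadata node points at each recipe.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix schema: <https://schema.org/> .
@prefix crdt: <https://vocab.noeldemartin.com/crdt/> .
@prefix : <#> .

:RecipeHistoryShape a sh:NodeShape ;
  sh:targetClass schema:Recipe ;
  sh:property [
    # Value nodes are the subjects of "?x crdt:resource <recipe>".
    sh:path [ sh:inversePath crdt:resource ] ;
    # Among those nodes, exactly one must be a crdt:Metadata (assumed rule).
    sh:qualifiedValueShape [ sh:class crdt:Metadata ] ;
    sh:qualifiedMinCount 1 ;
    sh:qualifiedMaxCount 1
  ] .
```

This only helps when the metadata and operations are in the graph being validated. It does nothing for the discovery problem described above, where the library would not know which other documents to fetch.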

josephguillaume commented 3 weeks ago

One thing I'm not sure about - does usingSameDocument only apply to new documents, or does Soukai-Solid require that the related model be found in the same document? i.e. does this need to be captured by the SHACL shape?

NoelDeMartin commented 3 weeks ago

> One thing I'm not sure about - does usingSameDocument only apply to new documents, or does Soukai-Solid require that the related model be found in the same document?

It only applies to new documents, but depending on how the app is implemented it may not work properly.

Long story short, Soukai models have relationships, and related models stored in the same document will be loaded automatically. But if there is any related model in a different document, it won't be loaded unless the relationship is loaded explicitly.

In Umai's case, a Recipe has a relationship with Recipe Instructions (a recipe belongs to many recipe instructions). I haven't tested this extensively, but if there is some case where an instruction in a different document is not loaded properly, I would consider that a bug.

> i.e. does this need to be captured by the SHACL shape?

Keeping in mind what I explained, I don't think it should.
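
One caveat that may be worth checking against the draft: the recipeInstructions property shape uses sh:class and sh:node, and SHACL evaluates both against the data graph being validated. The sketch below uses made-up pod URLs. If the recipe document is validated on its own, the step's triples are not in the graph, so the step fails sh:class schema:HowToStep and the sh:minCount constraints reached through sh:node :HowToStepShape. In that setup the draft effectively does require the same document.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix schema: <https://schema.org/> .
@prefix : <#> .

# Data: hypothetical recipe document whose step lives in another document.
<https://pod.example/cookbook/ramen#it> a schema:Recipe ;
  schema:name "Ramen" ;
  schema:recipeInstructions <https://pod.example/cookbook/ramen-steps#step-1> .

# Shape: a relaxed property shape that only asks for an IRI,
# leaving the location of the step's triples open.
:RelaxedRecipeShape a sh:NodeShape ;
  sh:targetClass schema:Recipe ;
  sh:property [
    sh:path schema:recipeInstructions ;
    sh:nodeKind sh:IRI
  ] .
```

An alternative that keeps the stricter constraints is to validate the union of the recipe document and the documents it references, which is a choice for whoever runs the validator and not something the shape itself expresses.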