marco-bolo / dataset-catalogue

The index for MBO datasets
Creative Commons Zero v1.0 Universal

Create GitHub Action to validate JSON-LD files #3

Open pbuttigieg opened 10 months ago

pbuttigieg commented 10 months ago

@pieterprovoost @marc-portier would you know of any off-the-shelf validators we can deploy in a GitHub Action to make sure the JSON-LD/schema.org files MBO participants create are in good shape?

Something that essentially runs https://validator.schema.org/ on the collection as it emerges?

pieterprovoost commented 10 months ago

It looks like https://validator.schema.org/ is not available as a package or API, but I came across these candidates:

Can we assume that all JSON-LD documents will be added to this repository, or do we need to validate documents embedded in web pages etc as well?

marc-portier commented 10 months ago

my reflex on this would be to use standard python stuff for the lower levels like

and then don't be shy to manage our own shacl AP description for optimal control of:

with that in place we can simply slam in RDFlib/pyshacl to do the validation

in fact, it also keeps things standard enough that non-python implementations of this workflow could be considered at any time (less lock-in?)

with respect to getting hold of the rdf - per question of @pieterprovoost:

pieterprovoost commented 10 months ago

I have created a minimal proof of concept for validating JSON-LD documents hosted in this repository.

marc-portier commented 10 months ago

cool work

would be great to add explicit sh:message and sh:severity -- getting those from the shacl validation-result into the generated page could help us guide people to what they actually need to change ... --> but maybe something to address in scope of #4

pieterprovoost commented 10 months ago

Results are presented in tabular format now, and presentation can be further improved once we get some more extensive / realistic validation results.
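As a rough illustration of a tabular presentation that also carries severity and message, here is a stdlib-only sketch. It is not the code behind the generated page: the result dictionaries, their keys (`file`, `severity`, `path`, `message`) and the function name `results_to_markdown` are all assumptions made for this example.

```python
# Hypothetical column order and severity ranking for the report table.
COLUMNS = ("file", "severity", "path", "message")
SEVERITY_ORDER = {"Violation": 0, "Warning": 1, "Info": 2}


def _cell(value):
    """Make a value safe for a single Markdown table cell."""
    return str(value).replace("|", "\\|").replace("\n", " ")


def results_to_markdown(results):
    """Render validation results as a Markdown table.

    Rows are sorted by severity (most severe first), then by file name.
    Unknown severities sort last.
    """
    rows = sorted(
        results,
        key=lambda r: (
            SEVERITY_ORDER.get(r.get("severity", ""), 99),
            r.get("file", ""),
        ),
    )
    lines = [
        "| File | Severity | Path | Message |",
        "| --- | --- | --- | --- |",
    ]
    for row in rows:
        cells = " | ".join(_cell(row.get(key, "")) for key in COLUMNS)
        lines.append("| " + cells + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    example = [
        {"file": "b.jsonld", "severity": "Warning",
         "path": "schema:license", "message": "Add a license"},
        {"file": "a.jsonld", "severity": "Violation",
         "path": "schema:name", "message": "A Dataset needs a schema:name."},
    ]
    print(results_to_markdown(example))
```

Sorting violations to the top is just one possible choice; grouping by file might read better once there are more realistic results.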