Open pbuttigieg opened 1 year ago
It looks like https://validator.schema.org/ is not available as a package or API, but I came across these candidates:
Can we assume that all JSON-LD documents will be added to this repository, or do we need to validate documents embedded in web pages etc as well?
my reflex on this would be to use standard Python stuff for the lower levels, like:
- json.tool to verify the low-level syntax (as I suspect issues will be there already)
- RDFLib.parse to check if things are actually representing a workable knowledge graph

and then don't be shy to manage our own SHACL AP description for optimal control of:
with that in place we can simply slam in RDFLib/pyshacl to do the validation
in fact, it also keeps things standard enough that non-Python implementations of this workflow could be considered at any time (less lock-in?)
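A minimal sketch of that layered check, to make the idea concrete. It assumes rdflib 6 or later (which bundles a JSON-LD parser) and pyshacl are installed; the function name `validate_jsonld` and the return shape are hypothetical, not something that exists in this repository. Note that a document with a remote `@context` will likely make rdflib fetch that context over the network while parsing.

```python
import json


def validate_jsonld(text, shapes_ttl):
    """Run the three layers in order and stop at the first one that fails.

    Returns (stage, ok, detail), where stage is "json", "rdf" or "shacl".
    """
    # Layer 1: plain JSON syntax (json.tool relies on the same parser).
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return ("json", False, str(err))

    # Imported lazily so the syntax check works without the RDF stack.
    from rdflib import Graph
    from pyshacl import validate

    # Layer 2: does the document parse into a graph with any triples?
    try:
        data = Graph().parse(data=text, format="json-ld")
    except Exception as err:  # parser errors come in several types
        return ("rdf", False, str(err))
    if len(data) == 0:
        return ("rdf", False, "document parsed but produced no triples")

    # Layer 3: SHACL validation against our own shapes (Turtle text).
    shapes = Graph().parse(data=shapes_ttl, format="turtle")
    conforms, _report_graph, report_text = validate(data, shacl_graph=shapes)
    return ("shacl", bool(conforms), report_text)
```

Stopping at the first failing layer keeps the feedback short: a broken comma is reported as a JSON problem rather than as a pile of SHACL violations.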
with respect to getting hold of the rdf - per question of @pieterprovoost:
I have created a minimal proof of concept for validating JSON-LD documents hosted in this repository.
- The code lives in the scripts and shacl subfolders.
- On push, a script checks all JSON-LD documents in the datasets folder and performs JSON and SHACL validation. Results from the validation are added to the Jekyll website in a separate branch, reports.
- From the reports branch, the Jekyll website with validation reports gets deployed at https://lab.marcobolo-project.eu/dataset-catalogue/.

cool work
would be great to add explicit sh:message and sh:severity -- getting those from the SHACL validation result into the generated page could help us guide people to what they actually need to change ... --> but maybe something to address in scope of #4
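For what it's worth, a rough sketch of how those values could be pulled out of the report. In a SHACL validation report, the shape's sh:message shows up as sh:resultMessage and its sh:severity as sh:resultSeverity on each sh:ValidationResult. The helper name `summarize_report` and the row layout are hypothetical; the input is assumed to be the results graph that pyshacl returns (an rdflib Graph).

```python
def summarize_report(report_graph):
    """Return one dict per sh:ValidationResult in a SHACL report graph."""
    from rdflib.namespace import RDF, SH  # lazy import, needs rdflib

    def text(node):
        # Missing optional values become empty strings for the page.
        return "" if node is None else str(node)

    rows = []
    for result in report_graph.subjects(RDF.type, SH.ValidationResult):
        severity = report_graph.value(result, SH.resultSeverity)
        rows.append({
            "focus": text(report_graph.value(result, SH.focusNode)),
            "path": text(report_graph.value(result, SH.resultPath)),
            # sh:Violation, sh:Warning or sh:Info, cut down to the local name
            "severity": text(severity).rsplit("#", 1)[-1],
            "messages": [str(m) for m in
                         report_graph.objects(result, SH.resultMessage)],
        })
    return rows
```

Each row could then become one line in the generated page, with the severity deciding how loudly it is shown.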
Results are presented in tabular format now, and presentation can be further improved once we get some more extensive / realistic validation results.
Is this issue to be closed?
@pieterprovoost @marc-portier would you know of any off-the-shelf validators we can deploy in a GitHub Action to make sure the JSON-LD/schema.org files MBO participants create are in good shape?
Something that essentially runs https://validator.schema.org/ on the collection as it emerges?