CLARIAH / clariah-plus

This is the project planning repository for the CLARIAH-PLUS project. It groups all technical documents and discussions pertaining to CLARIAH-PLUS in a central place and should facilitate findability, transparency and project planning, for the project as a whole.

Implement codemeta validation component for tool discovery, against clariah requirements #50

Closed: proycon closed this issue 1 year ago

proycon commented 2 years ago

The following was already suggested by @ddeboer in #32.

Create a SHACL shape to validate CLARIAH-specific codemeta.json files.

This sounds good to me and would be a valuable component that can be invoked from the harvester (#33). @ddeboer Is this something you'd want to take up implementing? (I don't have prior expertise with SHACL myself so having that would be a plus).
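To make the idea of a validation step callable from the harvester more concrete, here is a minimal sketch. It is plain Python and not SHACL: it only imitates a minimum-cardinality constraint (comparable to `sh:minCount`) on a parsed codemeta.json document. The `REQUIRED_PROPERTIES` table and the `validate_codemeta` function are hypothetical names for illustration, and the listed properties are an assumption, not the actual CLARIAH requirements.

```python
import json

# Hypothetical requirement set: property name -> minimum number of values.
# The real CLARIAH requirements are not specified in this thread.
REQUIRED_PROPERTIES = {
    "name": 1,
    "codeRepository": 1,
    "license": 1,
    "author": 1,
}


def count_values(value):
    """Count the values of a property; a missing (None) value counts as 0."""
    if value is None:
        return 0
    if isinstance(value, list):
        return len(value)
    return 1


def validate_codemeta(doc, required=REQUIRED_PROPERTIES):
    """Return a list of violation messages; an empty list means no violations."""
    violations = []
    for prop, min_count in required.items():
        found = count_values(doc.get(prop))
        if found < min_count:
            violations.append(
                f"{prop}: expected at least {min_count} value(s), found {found}"
            )
    return violations


# Illustrative input: 'license' is missing and 'author' is an empty list.
example = json.loads("""
{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "name": "example-tool",
  "codeRepository": "https://example.org/example-tool",
  "author": []
}
""")

report = validate_codemeta(example)
for line in report:
    print(line)
```

A harvester could call such a function per harvested file and log or reject entries with a non-empty report. A SHACL-based implementation would express the same constraints declaratively as shapes rather than in code.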

ddeboer commented 2 years ago

Sure! Do we want the SHACL in this repository or separately? And if the latter, where?

I suggest combining this issue with #32: we can use SHACL for both the ontology definition and the validation rules. This is more efficient and maintainable than adding OWL just for the ontology. For more on this approach, see https://www.topquadrant.com/shacl-blog/. An example is the NDE Dataset requirements, which use SHACL only.

proycon commented 2 years ago

> Sure! Do we want the SHACL in this repository or separately? And if the latter, where?

In a separate one, yes. Perhaps we can use CLARIAH/tool-discovery for that?

> I suggest combining this issue with #32: we can use SHACL for both the ontology definition and the validation rules. This is more efficient and maintainable than adding OWL just for the ontology. For more on this approach, see https://www.topquadrant.com/shacl-blog/. An example is the NDE Dataset requirements, which use SHACL only.

This validation issue is indeed very closely related to #32, so combining them as much as possible would make sense. The approach you propose looks interesting.

As I mentioned earlier, #32 has two aspects: 1) the CLARIAH-specific vocabulary and 2) the generic vocabulary that is currently missing from codemeta/schema.org but that we need for expressing certain aspects of metadata. The latter is being discussed in codemeta/codemeta#271, and we seem to be on the verge of forming a small 'task force' for it in a separate repository (wider than CLARIAH), with the intention of eventually merging it into codemeta if the community there agrees. You're welcome to join in there too, of course, so we can align all these components properly.

proycon commented 2 years ago

@ddeboer By the way, there is CLARIAH funding available to you for this (see also https://github.com/CLARIAH/tool-discovery/pull/2). I had initially reserved 68 hours via NDE for this in the planning (which might be too conservative?). You (or Enno?) might want to contact @tvermaut about arranging the bureaucratic side of this, as I don't know much about that.