datacite / schema

DataCite Metadata Schema Repository
https://schema.datacite.org

Provide "official" and automatically updated upon releases jsonschema serializations of the schema #149

Open yarikoptic opened 5 hours ago

yarikoptic commented 5 hours ago

In our software (https://github.com/dandi/dandi-schema/) we used a particular "version" of the https://github.com/datacite/schema/blob/master/source/json/kernel-4.3/datacite_4.3_schema.json file. We later discovered that the file receives no further changes and, as https://github.com/datacite/schema/tree/master/source/json#readme announces (BTW, thanks for adding that information), the newer files have moved to a third-party repository: https://github.com/inveniosoftware/datacite/tree/master/datacite/schemas .

Unfortunately, neither that README nor the commits that added that file describe the reasoning. It is also not obvious in the target repo how and when those JSON serializations are updated. The process does not seem to be automated either: for example, there is only 1 commit for https://github.com/inveniosoftware/datacite/commits/master/datacite/schemas/datacite-v4.5.json even though there were a number of 4.5 patch releases:

❯ git tag | grep '^4.5'
4.5.0
4.5.1
4.5.2
4.5.3

I would appreciate if json serializations of the schema were

For ideas on how to orchestrate this, one example of similar automation is our dandischema:

So I think it would be great if a CI workflow automatically exported new (updated MAJOR.MINOR) versions of the jsonschema to a single canonical location under the /datacite/ organization. Let me know if you need help preparing such a workflow.
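
To make the MAJOR.MINOR idea concrete, here is a minimal sketch of the tag-to-file mapping such a workflow might use. It is only an illustration under assumptions: the function names `latest_patch_per_minor` and `schema_filename` are hypothetical, and the `datacite-vMAJOR.MINOR.json` naming is borrowed from the inveniosoftware repository rather than anything DataCite has committed to.

```python
import re

# Matches plain MAJOR.MINOR.PATCH tags such as "4.5.3"; anything else is ignored.
TAG_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def latest_patch_per_minor(tags):
    """Map each MAJOR.MINOR series to its highest PATCH tag (hypothetical helper)."""
    latest = {}
    for tag in tags:
        match = TAG_RE.match(tag.strip())
        if not match:
            continue  # skip tags that are not MAJOR.MINOR.PATCH
        major, minor, patch = (int(part) for part in match.groups())
        key = (major, minor)
        if key not in latest or patch > latest[key]:
            latest[key] = patch
    return {
        f"{major}.{minor}": f"{major}.{minor}.{patch}"
        for (major, minor), patch in sorted(latest.items())
    }


def schema_filename(series):
    """Build an assumed output file name for one MAJOR.MINOR series."""
    return f"datacite-v{series}.json"


if __name__ == "__main__":
    tags = ["4.5.0", "4.5.1", "4.5.2", "4.5.3"]
    for series, tag in latest_patch_per_minor(tags).items():
        print(f"{schema_filename(series)} would be exported from tag {tag}")
```

A CI job could feed the output of `git tag` into `latest_patch_per_minor` and regenerate one file per series, so patch releases refresh the existing file instead of creating a new one.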

tmorrell commented 4 hours ago

Hi @yarikoptic! Speaking as one of the invenio datacite maintainers (and with no role at DataCite), I don't believe DataCite uses a jsonschema internally. The official schema is XML, and DataCite hasn't (as far as I know) expressed interest in maintaining a jsonschema.

In terms of https://github.com/inveniosoftware/datacite releases, it's an entirely volunteer-driven process, so releases come whenever folks have time. We usually only target major DataCite releases. If you look at the patch releases you mentioned (https://github.com/datacite/schema/releases), they are mostly about maintaining the schema examples and website and aren't really changes to the schema itself. So there wouldn't be a need for a corresponding release of https://github.com/inveniosoftware/datacite .