CONP-PCNO / schema

DATS JSON schemas
https://datatagsuite.github.io/docs/html/dats.html

check whether we are using the identifier schema correctly #8

Open surchs opened 3 years ago

surchs commented 3 years ago

The DATS identifier schema has two main fields:

"identifier": {
  "description": "A code uniquely identifying an entity locally to a system or globally.",
  "type" : "string"
},
"identifierSource": {
  "description": "The identifier source represents information about the organisation/namespace responsible for minting the identifiers. It must be provided if the identifier is provided.",
  "type" : "string"
}

Currently, we are using "identifier" to store the non-dereferenceable string name of an identifier and "identifierSource" to store a dereferenceable IRI that points to the identifier. Here is an example.

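I believe that the original intent of DATS was to do the opposite, i.e. to store the dereferenceable IRI in "identifier" and information about the organisation/namespace that minted it in "identifierSource".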

Apart from the description in the schema, this also fits with the fact that "identifier" used to require a URI format (see datatagsuite/schema@f02264de4dc6bf879aa374c5981d4e4003942f2d).
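To make the two readings concrete, here is a small Python sketch. The values are placeholders (the example.org URL and the "Example Taxonomy" namespace are made up for illustration, not taken from any CONP dataset), and `is_dereferenceable` is a hypothetical helper that only checks whether a string looks like an http(s) URL.

```python
from urllib.parse import urlparse


def is_dereferenceable(value):
    """Rough check: does the string look like an http(s) URL?"""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# How we currently fill the fields (placeholder values):
# the bare code goes in "identifier", the IRI in "identifierSource".
current_usage = {
    "identifier": "9606",
    "identifierSource": "https://example.org/taxonomy/9606",
}

# How I read the DATS description (placeholder values):
# the IRI goes in "identifier", the minting namespace in "identifierSource".
intended_usage = {
    "identifier": "https://example.org/taxonomy/9606",
    "identifierSource": "Example Taxonomy",
}

for label, entry in (("current", current_usage), ("intended", intended_usage)):
    print(label, is_dereferenceable(entry["identifier"]))
# prints:
# current False
# intended True
```

A URL-format requirement on "identifier", like the one the schema used to have, would reject the first entry and accept the second.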

We should discuss this and open an issue in the conp-dataset repo.

cmadjar commented 3 years ago

Based on today's discussion at the CONP dev call, we will move the URL of the species taxonomy from the identifierSource field to the identifier field.

@cmadjar will send a PR that modifies the validator to check that the URL is provided in identifier and not identifierSource so we can get a list of affected datasets.
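A rough sketch of what such a check could look like is below. This is not the actual conp-dataset validator: `looks_like_url` and `find_misplaced_urls` are hypothetical names, the sample record uses placeholder values, and the check assumes a DATS file already loaded as nested dicts and lists.

```python
from urllib.parse import urlparse


def looks_like_url(value):
    """Return True if value is a string that parses as an http(s) URL."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def find_misplaced_urls(node, path="$"):
    """Walk a DATS-like structure and collect the paths of objects whose
    "identifierSource" holds a URL while "identifier" does not."""
    problems = []
    if isinstance(node, dict):
        if looks_like_url(node.get("identifierSource")) and not looks_like_url(
            node.get("identifier")
        ):
            problems.append(path)
        for key, value in node.items():
            problems.extend(find_misplaced_urls(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            problems.extend(find_misplaced_urls(item, f"{path}[{index}]"))
    return problems


# Placeholder record: the first species entry uses the old layout,
# the second one the layout agreed on at the dev call.
dataset = {
    "title": "demo",
    "isAbout": [
        {
            "name": "Homo sapiens",
            "identifier": {
                "identifier": "9606",
                "identifierSource": "https://example.org/taxonomy/9606",
            },
        },
        {
            "name": "Mus musculus",
            "identifier": {
                "identifier": "https://example.org/taxonomy/10090",
                "identifierSource": "Example Taxonomy",
            },
        },
    ],
}

print(find_misplaced_urls(dataset))
# prints: ['$.isAbout[0].identifier']
```

Running something like this over every DATS file would give the list of affected datasets. Entries where both fields hold a URL are not flagged, since a namespace can legitimately be given as a URL.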

@emmetaobrien will go through the list of affected datasets and update those that we can modify ourselves.

cmadjar commented 3 years ago

@surchs I created a new issue in conp-dataset. Would you mind taking a look at it to see whether I forgot anything?

https://github.com/CONP-PCNO/conp-dataset/issues/712