crs4 / rocrate-validator

A Python package to validate RO-Crates
Apache License 2.0

Validation should fail if required properties are not present on Root Data Entity #14

Closed: elichad closed this issue 1 month ago

elichad commented 2 months ago

Tested on version 0.2.1.

From "Direct properties of the Root Data Entity" in RO-Crate 1.1:

The Root Data Entity MUST have the following properties:

  • @type: MUST be Dataset
  • @id: MUST end with / and SHOULD be the string ./
  • name: SHOULD identify the dataset to humans well enough to disambiguate it from other RO-Crates
  • description: SHOULD further elaborate on the name to provide a summary of the context in which the dataset is important.
  • datePublished: MUST be a string in ISO 8601 date format and SHOULD be specified to at least the precision of a day, MAY be a timestamp down to the millisecond.
  • license: SHOULD link to a Contextual Entity in the RO-Crate Metadata File with a name and description. MAY have a URI (eg for Creative Commons or Open Source licenses). MAY, if necessary be a textual description of how the RO-Crate may be used.

Currently, the validation passes even when the last four properties are not present. Only @type and @id cause failures if they are missing or incorrect. Instead, the validation should fail if any of the required properties are missing.

Minimal example of a crate that reproduces the issue:

{
  "@context": [
    "https://w3id.org/ro/crate/1.1/context"
  ],
  "@graph": [
    {
      "@id": "ro-crate-metadata.json",
      "@type": "CreativeWork",
      "conformsTo": {
        "@id": "https://w3id.org/ro/crate/1.1"
      },
      "about": {
        "@id": "./"
      }
    },
    {
      "@id": "./",
      "@type": "Dataset"
    }
  ]
}
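To make the expected failure concrete, here is a minimal sketch of the presence check described above, run against the same example crate. It is a standalone illustration, not rocrate-validator's implementation: the helper names (`find_root_entity`, `missing_root_properties`) are hypothetical, and the sketch assumes the metadata descriptor has the `@id` `ro-crate-metadata.json`, as in the example.

```python
import json

# Property names taken from the RO-Crate 1.1 list quoted above.
REQUIRED_ROOT_PROPERTIES = (
    "@type", "@id", "name", "description", "datePublished", "license",
)


def find_root_entity(metadata: dict) -> dict:
    """Locate the Root Data Entity through the descriptor's `about` link.

    Assumption: the descriptor's @id is "ro-crate-metadata.json".
    """
    entities = {e.get("@id"): e for e in metadata.get("@graph", [])}
    descriptor = entities.get("ro-crate-metadata.json")
    if descriptor is None:
        raise ValueError("metadata descriptor not found in @graph")
    root_id = descriptor.get("about", {}).get("@id")
    if root_id not in entities:
        raise ValueError("Root Data Entity not found in @graph")
    return entities[root_id]


def missing_root_properties(metadata: dict) -> list:
    """Return the required properties absent from the Root Data Entity."""
    root = find_root_entity(metadata)
    return [prop for prop in REQUIRED_ROOT_PROPERTIES if prop not in root]


# The crate from the issue, inlined so the sketch is self-contained.
EXAMPLE_CRATE = json.loads("""
{
  "@context": ["https://w3id.org/ro/crate/1.1/context"],
  "@graph": [
    {
      "@id": "ro-crate-metadata.json",
      "@type": "CreativeWork",
      "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
      "about": {"@id": "./"}
    },
    {"@id": "./", "@type": "Dataset"}
  ]
}
""")

print(missing_root_properties(EXAMPLE_CRATE))
# -> ['name', 'description', 'datePublished', 'license']
```

The sketch only tests for presence; it does not check the per-property SHOULD constraints (for example, whether `datePublished` is a valid ISO 8601 string).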

ilveroluca commented 2 months ago

By default, requirement severity is set to REQUIRED -- i.e., MUST. If you want the "SHOULD" requirements to be tested you need to specify --requirement-severity RECOMMENDED.

Maybe the severity on the presence of datePublished needs to be raised to REQUIRED:

datePublished: MUST be a string in ISO 8601 date format ...
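For readers unfamiliar with severity thresholds, the following sketch illustrates the idea behind the behaviour described here: only checks at or above the selected severity are run. The `Severity` enum, the `CHECKS` table, and the severities assigned to each check are assumptions made for illustration; they do not reproduce rocrate-validator's internal classes or its actual check definitions.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Illustrative severity levels, ordered from weakest to strongest."""
    OPTIONAL = 1      # MAY
    RECOMMENDED = 2   # SHOULD
    REQUIRED = 3      # MUST


# Hypothetical check table, assumed to mirror the behaviour reported
# in this issue (presence checks for the last four properties sitting
# below REQUIRED).
CHECKS = {
    "@type is Dataset": Severity.REQUIRED,
    "@id ends with /": Severity.REQUIRED,
    "name is present": Severity.RECOMMENDED,
    "description is present": Severity.RECOMMENDED,
    "datePublished is present": Severity.RECOMMENDED,
    "license is present": Severity.RECOMMENDED,
}


def checks_to_run(threshold: Severity) -> list:
    """Select the checks whose severity is at or above the threshold."""
    return [name for name, sev in CHECKS.items() if sev >= threshold]


# With a REQUIRED threshold, only the two MUST checks are selected.
print(checks_to_run(Severity.REQUIRED))
# With a RECOMMENDED threshold, the presence checks are included too.
print(checks_to_run(Severity.RECOMMENDED))
```

Under this model, raising a check's severity to REQUIRED is what makes it run under the default threshold.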

elichad commented 2 months ago

Yes, but as I read the first line of the quote, all of the properties MUST be present, regardless of whether they meet the SHOULD requirements listed for each item.

kikkomep commented 1 month ago

Hi @elichad. I integrated some changes into the develop branch that make the properties you mentioned REQUIRED, so now the validation should fail if they are not present.

Let us know how it goes.

elichad commented 1 month ago

Just tested the changes - I left one comment about the message for datePublished https://github.com/crs4/rocrate-validator/pull/16/files#r1795409365 but aside from that all good!

kikkomep commented 1 month ago

ok, fixed in https://github.com/crs4/rocrate-validator/commit/32265ed48687819bbb2eaa5d548032d099b385ad