softwarepub / hermes

Implementation of the HERMES workflow
https://docs.software-metadata.pub

Enable "quality assessment" for metadata #69

Open sdruskat opened 1 year ago

sdruskat commented 1 year ago

It should be possible to assess the quality of the metadata against a specific (configurable?) standard of valid metadata.

This assessment should assert things like adherence to a sensible default.

Example: Asserting that the metadata includes authorship information, although the publication repository may not require it.

This doesn't need to be tied to a single step; rather, it is a configuration that either or both of the validate and curate steps can call.
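A minimal sketch of what such a shared assessment could look like, assuming the metadata is available as a plain dict. All names here (`assess_quality`, `DEFAULT_REQUIRED_FIELDS`, and the `validate`/`curate` wrappers) are hypothetical and not part of the existing hermes code:

```python
from typing import Any, Dict, List, Optional

# Hypothetical "sensible default": fields we want to see even if the
# publication repository does not require them.
DEFAULT_REQUIRED_FIELDS = ["name", "author"]


def assess_quality(
    metadata: Dict[str, Any],
    required_fields: Optional[List[str]] = None,
) -> List[str]:
    """Return a list of findings; an empty list means nothing was found."""
    if required_fields is None:
        required_fields = DEFAULT_REQUIRED_FIELDS
    findings = []
    for field in required_fields:
        value = metadata.get(field)
        # Treat missing keys and empty values (None, "", []) alike.
        if value is None or value == "" or value == []:
            findings.append(f"missing or empty field: {field}")
    return findings


def validate(metadata: Dict[str, Any]) -> None:
    """Hypothetical validate step: fail hard on any finding."""
    findings = assess_quality(metadata)
    if findings:
        raise ValueError("; ".join(findings))


def curate(metadata: Dict[str, Any]) -> List[str]:
    """Hypothetical curate step: hand findings to a human for review."""
    return assess_quality(metadata)
```

The point of the sketch is only that the assessment lives in one place and both steps decide for themselves how to react to the findings.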

sdruskat commented 1 year ago

Edit: this overrides the earlier, out-of-date notes.

Notes from the discussion today:

- Collation only unifies the data, i.e., removes clear duplicates, flags bad data types, etc.
- ~~Validation Curation and/or the validate step in Collation then looks at the unified model and raises semantic conflicts, e.g., similar author names across two people data points that could or could not be the same person.~~

This has an impact on the implementation, so we need to check whether we have already breached this separation, or whether we are fine and can follow up with the implementation.

Should also be recorded in an ADR.
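To make the separation concrete, here is a rough sketch, assuming author names arrive as a flat list. `collate_authors` only unifies (drops exact duplicates, flags non-string entries), while `find_similar_authors` illustrates the kind of semantic check the notes mention with similar author names. Both function names and the 0.8 threshold are made up for illustration, and which step the second check belongs to is left open here:

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Any, List, Tuple


def collate_authors(authors: List[Any]) -> Tuple[List[str], List[Any]]:
    """Unify only: drop exact duplicates, flag entries with a bad data type."""
    unified, flagged, seen = [], [], set()
    for entry in authors:
        if not isinstance(entry, str):
            flagged.append(entry)  # bad data type, no attempt to repair it
            continue
        if entry not in seen:
            seen.add(entry)
            unified.append(entry)
    return unified, flagged


def find_similar_authors(
    authors: List[str], threshold: float = 0.8
) -> List[Tuple[str, str]]:
    """Semantic check: report name pairs that may or may not be the same person."""
    conflicts = []
    for a, b in combinations(authors, 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            conflicts.append((a, b))
    return conflicts


unified, flagged = collate_authors(
    ["Jane Doe", "Jane Doe", 42, "Jane M. Doe", "John Smith"]
)
conflicts = find_similar_authors(unified)
```

Note that the semantic check never merges anything; it only raises candidates for someone to decide on.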

sdruskat commented 1 year ago

Implementation could be done in stages:

  1. Dummy configuration
  2. Manual configuration based on fields
  3. Targeting schemas (could be schemas that we define down the road, e.g., based on software citation principles and others)
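The three stages could be sketched roughly like this, assuming a config dict with a `mode` key selects the stage. The config keys, the function names, and the toy "schema" (a mapping from field name to expected Python type) are all placeholders, not an existing hermes interface; a real stage 3 would more likely target a proper schema format:

```python
from typing import Any, Callable, Dict, List

Metadata = Dict[str, Any]
Check = Callable[[Metadata], List[str]]


def dummy_check(config: Dict[str, Any]) -> Check:
    """Stage 1: accepts everything, useful for wiring up the step."""
    return lambda metadata: []


def fields_check(config: Dict[str, Any]) -> Check:
    """Stage 2: manually configured list of required fields."""
    required = config.get("required_fields", [])

    def check(metadata: Metadata) -> List[str]:
        return [f"missing field: {f}" for f in required if f not in metadata]

    return check


def schema_check(config: Dict[str, Any]) -> Check:
    """Stage 3: check against a schema (here a toy field -> type mapping)."""
    schema = config.get("schema", {})

    def check(metadata: Metadata) -> List[str]:
        findings = []
        for field, expected in schema.items():
            if field not in metadata:
                findings.append(f"missing field: {field}")
            elif not isinstance(metadata[field], expected):
                findings.append(
                    f"wrong type for {field}: expected {expected.__name__}"
                )
        return findings

    return check


STAGES = {"dummy": dummy_check, "fields": fields_check, "schema": schema_check}


def build_check(config: Dict[str, Any]) -> Check:
    """Pick the check for the configured mode; defaults to the dummy stage."""
    return STAGES[config.get("mode", "dummy")](config)
```

With this shape, moving from one stage to the next only changes the configuration, not the code that calls `build_check`.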