pkiraly / qa-catalogue

QA catalogue – a metadata quality assessment tool for library catalogue records (MARC, PICA)
GNU General Public License v3.0

Align validation error types/options with Avram #383

Open nichtich opened 11 months ago

nichtich commented 11 months ago

The Avram specification contains a list of validation rules to check (AR1 to AR24, of which only AR1 to AR16 are mandatory) and a list of validation options to enable or disable selected error types. These roughly correspond to QA catalogue validation error types. I'd like to align the error type codes used in QA catalogue with those in the Avram specification. The QA catalogue error type codes that have a counterpart in Avram are:

| code | category | Avram rule | Avram option |
|---|---|---|---|
| undefinedField | DATAFIELD | AR2 | ignore_unknown_fields |
| nonrepeatableField | DATAFIELD | AR3 | ignore_nonrepeatable_fields (TBD) |
| undefinedSubfield | SUBFIELD | AR7 | ignore_unknown_subfields |
| nonrepeatableSubfield | SUBFIELD | AR8 | ignore_nonrepeatable_subfields (TBD) |
| missingSubfield | DATAFIELD | AR9 | ignore_missing_subfields (TBD) |
| nonEmptyIndicator | INDICATOR | AR11 | ignore_indicators (TBD) |
| hasInvalidValue | INDICATOR | AR11 | ignore_indicators (TBD) |
| patternMismatch | SUBFIELD | AR12 | ignore_patterns (TBD) |
| hasInvalidValue | CONTROLFIELD | AR12-AR13 | ignore_values |
| hasInvalidValue | SUBFIELD | AR12-AR14 | ignore_values |
| controlValueContainsInvalidCode | CONTROLFIELD | AR14 | ignore_codes |
| invalidReference | SUBFIELD | AR14 | ignore_codes |

Avram further has AR4 (missingField); the other Avram rules seem to be already covered by QA catalogue.
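To make the proposed correspondence easier to discuss, here is a minimal Python sketch of the table as a lookup structure. It is not code from QA catalogue or from any Avram validator: the names `AvramMapping`, `MAPPING`, `lookup` and `is_ignored` are hypothetical, the option names are copied from the table as listed (including those marked TBD, which may change), and expanding a range such as AR12-AR14 into individual rules is my reading of the table.

```python
from typing import NamedTuple, Optional, Set, Tuple


class AvramMapping(NamedTuple):
    """Hypothetical record for one row of the correspondence table."""
    rules: Tuple[str, ...]  # Avram rules, with ranges expanded (assumption)
    option: str             # Avram validation option as listed in the table
    settled: bool           # False where the table marks the option as TBD


# Keyed by (error code, category): hasInvalidValue occurs in three
# categories and maps to different Avram rules in each of them.
MAPPING = {
    ("undefinedField", "DATAFIELD"): AvramMapping(("AR2",), "ignore_unknown_fields", True),
    ("nonrepeatableField", "DATAFIELD"): AvramMapping(("AR3",), "ignore_nonrepeatable_fields", False),
    ("undefinedSubfield", "SUBFIELD"): AvramMapping(("AR7",), "ignore_unknown_subfields", True),
    ("nonrepeatableSubfield", "SUBFIELD"): AvramMapping(("AR8",), "ignore_nonrepeatable_subfields", False),
    ("missingSubfield", "DATAFIELD"): AvramMapping(("AR9",), "ignore_missing_subfields", False),
    ("nonEmptyIndicator", "INDICATOR"): AvramMapping(("AR11",), "ignore_indicators", False),
    ("hasInvalidValue", "INDICATOR"): AvramMapping(("AR11",), "ignore_indicators", False),
    ("patternMismatch", "SUBFIELD"): AvramMapping(("AR12",), "ignore_patterns", False),
    ("hasInvalidValue", "CONTROLFIELD"): AvramMapping(("AR12", "AR13"), "ignore_values", True),
    ("hasInvalidValue", "SUBFIELD"): AvramMapping(("AR12", "AR13", "AR14"), "ignore_values", True),
    ("controlValueContainsInvalidCode", "CONTROLFIELD"): AvramMapping(("AR14",), "ignore_codes", True),
    ("invalidReference", "SUBFIELD"): AvramMapping(("AR14",), "ignore_codes", True),
}


def lookup(code: str, category: str) -> Optional[AvramMapping]:
    """Return the Avram counterpart of an error type, or None if unmapped."""
    return MAPPING.get((code, category))


def is_ignored(code: str, category: str, enabled_options: Set[str]) -> bool:
    """True if one of the enabled Avram options would suppress this error type."""
    mapping = lookup(code, category)
    return mapping is not None and mapping.option in enabled_options


if __name__ == "__main__":
    print(lookup("hasInvalidValue", "SUBFIELD"))
    print(is_ignored("undefinedField", "DATAFIELD", {"ignore_unknown_fields"}))
```

Keying on the pair of code and category rather than on the code alone is deliberate, since the same code can correspond to different Avram rules depending on where the error occurs.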

Questions:

nichtich commented 10 months ago

I've released Avram specification 0.9.3 with validation options renamed to better match the error types and options of QA catalogue. Differences still to be resolved:

In addition, Avram defines validation options that QA catalogue does not support yet. Supporting them is optional unless QA catalogue promises full support of arbitrary Avram schemas.