ebi-ait / checklist

Template repository for checklists
Apache License 2.0

error messages from json schema validation are vague #37

Closed amnonkhen closed 1 week ago

amnonkhen commented 4 weeks ago

A usable error message makes clear both which document attribute is problematic and what the problem is. If the messages do not meet that bar, the conversion process needs to be improved.

In the ENA JSON schema we used arrays, which made the errors difficult to understand. This issue might be less critical when using the BioSamples format, which does not have arrays.

tasks:

amnonkhen commented 2 weeks ago

Wkt8 commented 2 weeks ago

Collected a list of error messages produced by Biovalidator with the BioSamples schemas.

BioSamples Error Messages for Invalid:

Mandatory Field

characteristics/geographic location (country and/or sea)
must have required property 'geographic location (country and/or sea)'

Invalid Data Types (every data type I found in the schema was a string, so I couldn't really test this)

Invalid Enum Value

characteristics/environmental_sample/0/text
must be equal to one of the allowed values: ["No","Yes"]

Invalid Pattern

characteristics/collection date/0/text
must match pattern "(^[12][0-9]{3}(-(0[1-9]|1[0-2])(-(0[1-9]|[12][0-9]|3[01])(T[0-9]{2}:[0-9]{2}(:[0-9]{2})?Z?([+-][0-9]{1,2})?)?)?)?(/[0-9]{4}(-[0-9]{2}(-[0-9]{2}(T[0-9]{2}:[0-9]{2}(:[0-9]{2})?Z?([+-][0-9]{1,2})?)?)?)?)?$)|(^not collected$)|(^not provided$)|(^restricted access$)|(^missing: control sample$)|(^missing: sample group$)|(^missing: synthetic construct$)|(^missing: lab stock$)|(^missing: third party data$)|(^missing: data agreement established pre-2023$)|(^missing: endangered species$)|(^missing: human-identifiable$)"

Invalid JSON (e.g. deleting commas, messing with brackets or removing parent fields)

Unable to parse the JSON in the input fields.
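
One way the messages listed above could be made easier to read is to post-process each (path, message) pair. The following is only a minimal Python sketch, not Biovalidator's code: the helper names `field_name` and `describe_error` and the `max_len` parameter are hypothetical, and it assumes error paths always look like `characteristics/<attribute>` optionally followed by `/<index>/text`, as in the examples above. Because an attribute name can itself contain a slash ("country and/or sea"), the sketch strips the known prefix and suffix instead of splitting on `/`.

```python
import re


def field_name(path: str) -> str:
    """Extract the attribute name from an error path (assumed layout)."""
    prefix = "characteristics/"
    name = path[len(prefix):] if path.startswith(prefix) else path
    # Drop a trailing '/<index>/text' segment if one is present.
    return re.sub(r"/\d+/text$", "", name)


def describe_error(path: str, message: str, max_len: int = 80) -> str:
    """Combine the attribute name with a message capped at max_len characters."""
    if len(message) > max_len:
        # Very long messages (e.g. a full regex) are cut short for readability.
        message = message[: max_len - 3] + "..."
    return f"Field '{field_name(path)}': {message}"


if __name__ == "__main__":
    print(describe_error(
        "characteristics/environmental_sample/0/text",
        'must be equal to one of the allowed values: ["No","Yes"]',
    ))
```

Truncating the pattern message hides the regex, so a real fix would probably replace it with a short description of the expected format; that wording would have to come from the checklist itself.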

Wkt8 commented 2 weeks ago

The Invalid JSON message could be improved by adding line numbers
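
To illustrate what a parse error with line numbers could look like, here is a small Python sketch. It is not how Biovalidator works internally; it only shows that a JSON parser can expose the error position. The function name `parse_with_location` is hypothetical, while `json.JSONDecodeError` and its `msg`, `lineno`, and `colno` attributes come from the Python standard library.

```python
import json


def parse_with_location(text: str):
    """Return (document, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        # lineno and colno are 1-based positions reported by the parser.
        return None, (
            f"Unable to parse the JSON: {err.msg} "
            f"(line {err.lineno}, column {err.colno})"
        )


if __name__ == "__main__":
    # A comma is missing at the end of the second line.
    broken = '{\n  "name": "sample"\n  "characteristics": {}\n}'
    document, error = parse_with_location(broken)
    print(error)
```

Note that the reported position is where the parser gave up, which for a missing comma is the start of the following line rather than the line where the comma belongs.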

amnonkhen commented 1 week ago

Need to reevaluate error messages after fixing the conversion to include field aliases (ticket #42).

theisuru commented 1 week ago

New JSON Schema is here. Use *-BSD.json files for BioSamples Schema.

theisuru commented 1 week ago

We can add new invalid samples in this directory. We will add them to unit/integration tests later.

Wkt8 commented 1 week ago

Added new invalid samples to the directory that Isuru shared. There are three files, one each for the 'mandatory field', 'enum', and 'pattern' errors.

The data type error could not be tested because every data type in the schema is a string. The JSON parse error is the same as above: when tested in Biovalidator, the message is broad and gives no detail about what is wrong or where.

Unable to parse the JSON in the input fields.

Wkt8 commented 1 week ago

Completed. @amnonkhen, please see the file below and let me know if this is what you are looking for; it is very similar to the previous task. https://github.com/ebi-ait/checklist-converter/blob/ck-37-wei-add-specified-errors-to-sample/src/test/resources/samples/ERC000011/error_messages.md