Closed amnonkhen closed 1 week ago
Collected a list of error messages from using Biovalidator and BioSamples schemas.

BioSamples Error Messages for Invalid:
Mandatory Field
characteristics/geographic location (country and/or sea)
must have required property 'geographic location (country and/or sea)'
Invalid Data Types: every data type I found in the schema was a string, so I couldn't test this case.
Invalid Enum Value
characteristics/environmental_sample/0/text
must be equal to one of the allowed values: ["No","Yes"]
Invalid Pattern
> - characteristics/collection date/0/text
> must match pattern "(^[12][0-9]{3}(-(0[1-9]|1[0-2])(-(0[1-9]|[12][0-9]|3[01])(T[0-9]{2}:[0-9]{2}(:[0-9]{2})?Z?([+-][0-9]{1,2})?)?)?)?(/[0-9]{4}(-[0-9]{2}(-[0-9]{2}(T[0-9]{2}:[0-9]{2}(:[0-9]{2})?Z?([+-][0-9]{1,2})?)?)?)?)?$)|(^not collected$)|(^not provided$)|(^restricted access$)|(^missing: control sample$)|(^missing: sample group$)|(^missing: synthetic construct$)|(^missing: lab stock$)|(^missing: third party data$)|(^missing: data agreement established pre-2023$)|(^missing: endangered species$)|(^missing: human-identifiable$)"
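To make it easier to see which values trigger the pattern error above, here is a minimal sketch that checks candidate collection dates against the same pattern using only Python's standard library. The names `DATE_RANGE`, `MISSING_TERMS` and `matches_collection_date` are made up for this illustration and are not part of Biovalidator or the BioSamples schemas; the regex itself is copied from the error message, split across lines for readability.

```python
import re

# Date / date-range part of the pattern, copied from the error message above.
DATE_RANGE = (
    r"(^[12][0-9]{3}(-(0[1-9]|1[0-2])(-(0[1-9]|[12][0-9]|3[01])"
    r"(T[0-9]{2}:[0-9]{2}(:[0-9]{2})?Z?([+-][0-9]{1,2})?)?)?)?"
    r"(/[0-9]{4}(-[0-9]{2}(-[0-9]{2}"
    r"(T[0-9]{2}:[0-9]{2}(:[0-9]{2})?Z?([+-][0-9]{1,2})?)?)?)?)?$)"
)

# Literal "missing value" terms that the pattern also accepts.
MISSING_TERMS = [
    "not collected",
    "not provided",
    "restricted access",
    "missing: control sample",
    "missing: sample group",
    "missing: synthetic construct",
    "missing: lab stock",
    "missing: third party data",
    "missing: data agreement established pre-2023",
    "missing: endangered species",
    "missing: human-identifiable",
]

# Rebuild the full alternation: date pattern | (^term$) | (^term$) ...
COLLECTION_DATE = re.compile(
    DATE_RANGE + "".join(f"|(^{term}$)" for term in MISSING_TERMS)
)


def matches_collection_date(value: str) -> bool:
    """Return True if value satisfies the collection date pattern."""
    # JSON Schema 'pattern' is an unanchored search, hence re.search.
    return COLLECTION_DATE.search(value) is not None


if __name__ == "__main__":
    for candidate in ["2023-05-17", "2020/2021", "17/05/2023", "not collected"]:
        print(candidate, matches_collection_date(candidate))
```

A day-first value such as `17/05/2023` is rejected, which is the kind of input used for the 'pattern' invalid sample.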
Invalid JSON (e.g. deleting commas, messing with brackets or removing parent fields)
Unable to parse the JSON in the input fields.
The Invalid JSON message could be improved by adding line numbers
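As a rough illustration of the line-number suggestion, the sketch below uses Python's standard `json` module, whose `JSONDecodeError` carries `msg`, `lineno` and `colno`. This is not how Biovalidator parses input; the function name `describe_json_error` and the sample document are assumptions made for this example only.

```python
import json
from typing import Optional


def describe_json_error(text: str) -> Optional[str]:
    """Return a message with line and column if text is not valid JSON, else None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno are 1-based positions reported by the parser.
        return (
            "Unable to parse the JSON in the input fields: "
            f"{err.msg} (line {err.lineno}, column {err.colno})"
        )
    return None


if __name__ == "__main__":
    # Hypothetical sample with a comma deleted after the "name" entry.
    broken = '{\n  "name": "sample1"\n  "release": "2024-01-01"\n}'
    print(describe_json_error(broken))
```

The same idea would apply in any parser that exposes the error position: keep the existing message and append the location.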
Need to reevaluate error messages after fixing the conversion to include field aliases (ticket #42).
We can add new invalid samples in this directory. We will add them to unit/integration tests later.
Added new invalid samples in this directory that Isuru shared. Three files are there, corresponding to errors for 'mandatory field', 'enum' and 'pattern'.
Data types could not be tested because every data type in the schema was string. The JSON error is as above: broad and lacking detail when tested in biovalidator.
Unable to parse the JSON in the input fields.
Completed: @amnonkhen, see here if this is what you are looking for; it is very similar to the previous task. https://github.com/ebi-ait/checklist-converter/blob/ck-37-wei-add-specified-errors-to-sample/src/test/resources/samples/ERC000011/error_messages.md
Usable error messages mean that both the problematic document attribute and the problem itself are clear. If not, the conversion process needs to be improved.
In the ENA JSON schema we used arrays, which made the errors difficult to understand. This issue might be less critical when using the BioSamples format, which does not have arrays.
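One possible way to make the attribute clearer, sketched under assumptions: the paths collected earlier in this thread (for example `characteristics/environmental_sample/0/text`) still contain a numeric index and a `text` leaf, so a small post-processing step could reduce them to the attribute name. The function `readable_error` is hypothetical and not part of the converter or Biovalidator.

```python
def readable_error(path: str, message: str) -> str:
    """Turn a validator path plus message into a single attribute-focused line."""
    # Drop empty segments and numeric array indices such as "0".
    parts = [p for p in path.split("/") if p and not p.isdigit()]
    # Drop the "characteristics" wrapper and the trailing "text" leaf, if present.
    if parts and parts[0] == "characteristics":
        parts = parts[1:]
    if parts and parts[-1] == "text":
        parts = parts[:-1]
    # Re-join with "/" so names that contain a slash, like "and/or", are restored.
    attribute = "/".join(parts) or "(document root)"
    return f"Attribute '{attribute}': {message}"


if __name__ == "__main__":
    print(
        readable_error(
            "characteristics/environmental_sample/0/text",
            'must be equal to one of the allowed values: ["No","Yes"]',
        )
    )
```

Because field names such as `geographic location (country and/or sea)` contain a slash themselves, the segments are re-joined rather than taking only the last one.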
tasks: