frictionlessdata / frictionless-py

Data management framework for Python that provides functionality to describe, extract, validate, and transform tabular data
https://framework.frictionlessdata.io
MIT License

Unexpected field-error for a boolean "example" with "trueValues" or "falseValues" properties #1610

Closed amelie-rondot closed 3 weeks ago

amelie-rondot commented 9 months ago

Overview

Running frictionless validate, I got this invalid validation report:

frictionless validate data.csv --schema schema.json
──────────────────────────────────────────────────────────────────────────────────────────────────────────── Dataset ─────────────────────────────────────────────────────────────────────────────────────────────────────────────
               dataset               
┏━━━━━━┳━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┓
┃ name ┃ type  ┃ path     ┃ status  ┃
┡━━━━━━╇━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━┩
│ data │ table │ data.csv │ INVALID │
└──────┴───────┴──────────┴─────────┘
───────────────────────────────────────────────────────────────────────────────────────────────────────────── Tables ─────────────────────────────────────────────────────────────────────────────────────────────────────────────
                                                 data                                                  
┏━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Row  ┃ Field ┃ Type        ┃ Message                                                                ┃
┡━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ None │ None  │ field-error │ Field is not valid: example value "no" for field "IsTrue" is not valid │
└──────┴───────┴─────────────┴────────────────────────────────────────────────────────────────────────┘

The error concerns the example property of the boolean field 'IsTrue' in the schema used for the validation: frictionless rejects its value, "no", as not valid.

I expected a valid validation report. I did not expect a field-error for this example value, since it matches a value listed in the field's falseValues property, as described for the boolean type in the TableSchema documentation.

Resources used to reproduce the problem

Note

Replacing the value of the optional example property of the schema field 'IsTrue' with 'false' instead of 'no' solves the problem. With this new schema, the validation report is valid:

frictionless validate data.csv --schema schema.json
──────────────────────────────────────────────────────────────────────────────────────────────────────────── Dataset ─────────────────────────────────────────────────────────────────────────────────────────────────────────────
              dataset               
┏━━━━━━┳━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┓
┃ name ┃ type  ┃ path     ┃ status ┃
┡━━━━━━╇━━━━━━━╇━━━━━━━━━━╇━━━━━━━━┩
│ data │ table │ data.csv │ VALID  │
└──────┴───────┴──────────┴────────┘

As specified in the TableSchema boolean type documentation: "The boolean field can be customised with these additional properties: trueValues: [ "true", "True", "TRUE", "1" ] falseValues: [ "false", "False", "FALSE", "0" ] format: no options (other than the default)."

But in this case, frictionless does not seem to apply the falseValues property to the example property, and it returns an unexpected field-error in the validation report.
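To make the expected behaviour concrete, here is a minimal sketch in plain Python of the check I would expect for an example value on a boolean field. It is not frictionless's actual implementation. The helper name is_valid_boolean_example is hypothetical, and the descriptor below is only an illustrative guess at the 'IsTrue' field (in particular, the trueValues entry "yes" is an assumption). The default lists are the ones quoted from the documentation above.

```python
# Defaults quoted from the TableSchema boolean type documentation.
DEFAULT_TRUE_VALUES = ["true", "True", "TRUE", "1"]
DEFAULT_FALSE_VALUES = ["false", "False", "FALSE", "0"]


def is_valid_boolean_example(field: dict) -> bool:
    """Hypothetical helper: check a boolean field's example against the
    field's own trueValues/falseValues, falling back to the defaults."""
    true_values = field.get("trueValues", DEFAULT_TRUE_VALUES)
    false_values = field.get("falseValues", DEFAULT_FALSE_VALUES)
    example = field.get("example")
    if example is None:
        return True  # no example given, nothing to check
    if isinstance(example, bool):
        return True  # already a native boolean
    return example in true_values or example in false_values


# Illustrative descriptor (assumed, not the schema attached to this issue).
field = {
    "name": "IsTrue",
    "type": "boolean",
    "trueValues": ["yes"],  # assumption
    "falseValues": ["no"],
    "example": "no",
}

# Same field without the custom values, so the defaults apply.
field_without_custom = {
    key: value
    for key, value in field.items()
    if key not in ("trueValues", "falseValues")
}

print(is_valid_boolean_example(field))                 # True
print(is_valid_boolean_example(field_without_custom))  # False
```

With the custom falseValues taken into account, the example "no" is accepted. It is only rejected when the default lists are used, which looks like what happens in the report above.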

amelie-rondot commented 9 months ago

For more context, frictionless validate is used in the Validata.fr project. We want to upgrade that project to frictionless-py v5, but if we do so now, the field-error described in this issue will block many of the schemas it uses.