frictionlessdata / frictionless-py

Data management framework for Python that provides functionality to describe, extract, validate, and transform tabular data
https://framework.frictionlessdata.io
MIT License

pattern-constraint violation on field with primaryKey raises CastError #300

Closed by cbenz 5 years ago

cbenz commented 5 years ago

Original bug report: https://git.opendatafrance.net/validata/validata-core/issues/7

When I try to validate this file against this schema, I get a stack trace ending in a CastError:

$ goodtables validate --schema https://raw.githubusercontent.com/etalab/schema.data.gouv.fr/4acda7bb7d1904617d87ddb1fe31ee3503a8c61b/decp-dpa/schema.json https://git.opendatafrance.net/validata/validata-core/uploads/c6b57d0ed336b74383886e330d5da81f/test.csv
Traceback (most recent call last):
  File "/home/cbenz/.local/share/virtualenvs/validata/bin/goodtables", line 10, in <module>
    sys.exit(cli())
  File "/home/cbenz/.local/share/virtualenvs/validata/lib/python3.7/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/cbenz/.local/share/virtualenvs/validata/lib/python3.7/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/cbenz/.local/share/virtualenvs/validata/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/cbenz/.local/share/virtualenvs/validata/lib/python3.7/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/cbenz/.local/share/virtualenvs/validata/lib/python3.7/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/cbenz/.local/share/virtualenvs/validata/lib/python3.7/site-packages/goodtables/cli.py", line 104, in validate
    report = goodtables.validate(sources, **options)
  File "/home/cbenz/.local/share/virtualenvs/validata/lib/python3.7/site-packages/goodtables/validate.py", line 85, in validate
    report = inspector.inspect(source, **options)
  File "/home/cbenz/.local/share/virtualenvs/validata/lib/python3.7/site-packages/goodtables/inspector.py", line 84, in inspect
    table_warnings, table_report = task.get()
  File "/usr/lib64/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
  File "/usr/lib64/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/cbenz/.local/share/virtualenvs/validata/lib/python3.7/site-packages/goodtables/inspector.py", line 237, in __inspect_table
    errors += (check_func(row_cells) or [])
  File "/home/cbenz/.local/share/virtualenvs/validata/lib/python3.7/site-packages/goodtables/checks/required_constraint.py", line 30, in required_constraint
    valid = valid and field.cast_value(value) is not None
  File "/home/cbenz/.local/share/virtualenvs/validata/lib/python3.7/site-packages/tableschema/field.py", line 100, in cast_value
    ).format(field=self, name=name, value=value))
tableschema.exceptions.CastError: Field "siretAcheteur" has constraint "pattern" which is not satisfied for value "792483364"

I think that this error should be caught, like it is in type_or_format_error.py.
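To illustrate what "caught" could mean here, below is a minimal sketch that turns the exception into a reported error instead of a crash. It does not use the real libraries: `CastError` and `Field` are stripped-down stand-ins for `tableschema.exceptions.CastError` and `tableschema.Field`, `check_cell` is a hypothetical helper, and the 14-digit pattern is an assumption about the schema, not copied from it.

```python
import re


class CastError(Exception):
    """Stand-in for tableschema.exceptions.CastError."""


class Field:
    """Hypothetical, stripped-down stand-in for tableschema.Field."""

    def __init__(self, name, pattern=None):
        self.name = name
        self.pattern = pattern

    def cast_value(self, value):
        # Raise on a pattern violation, as seen in the traceback above.
        if self.pattern and not re.fullmatch(self.pattern, value):
            raise CastError(
                'Field "%s" has constraint "pattern" which is not '
                'satisfied for value "%s"' % (self.name, value))
        return value


def check_cell(field, value):
    """Return a list of error dicts instead of letting CastError escape."""
    try:
        field.cast_value(value)
    except CastError as exception:
        return [{"field": field.name, "message": str(exception)}]
    return []


# The pattern below is assumed for illustration only.
field = Field("siretAcheteur", pattern=r"[0-9]{14}")
errors = check_cell(field, "792483364")
```

With this shape, the 9-digit value ends up as one entry in `errors` and validation can continue with the next cell.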

roll commented 5 years ago

@cbenz Thanks. TBH, I'm now trying to understand what these lines (commented out below) do at all for the required-constraint check:

        # Check constraint
        valid = field.test_value(value, constraints=['required'])
        # if field.descriptor.get('primaryKey'):
        #     valid = valid and field.cast_value(value) is not None

I'll try to remove them entirely if possible because, as I understand it, the valid = field.test_value(value, constraints=['required']) line already checks for None values.
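A rough sketch of that reasoning, using stand-ins rather than the real tableschema code: `cast_value` and `passes_constraints` below are simplified, hypothetical models of `Field.cast_value` and `Field.test_value`, and the pattern is again an assumption. The point is only that a test restricted to the required constraint already rejects missing values, and never raises on a pattern violation.

```python
import re


class CastError(Exception):
    """Stand-in for tableschema.exceptions.CastError."""


MISSING_VALUES = [""]  # assumed default: empty string means "missing"


def cast_value(value, required=False, pattern=None, constraints=True):
    """Simplified model of a cast that enforces the selected constraints."""
    if value in MISSING_VALUES:
        value = None
    # constraints=True means "check everything"; a list narrows the checks.
    active = ["required", "pattern"] if constraints is True else list(constraints)
    if "required" in active and required and value is None:
        raise CastError("required constraint not satisfied")
    if ("pattern" in active and pattern and value is not None
            and not re.fullmatch(pattern, value)):
        raise CastError("pattern constraint not satisfied")
    return value


def passes_constraints(value, **options):
    """Simplified model of test_value: True if the cast does not raise."""
    try:
        cast_value(value, **options)
    except CastError:
        return False
    return True


options = {"required": True, "pattern": r"[0-9]{14}"}

# Missing value: the required-only test already fails, no extra check needed.
missing_ok = passes_constraints("", constraints=["required"], **options)

# Pattern violation: the required-only test passes and nothing is raised.
short_ok = passes_constraints("792483364", constraints=["required"], **options)
```

Under these assumptions `missing_ok` is false and `short_ok` is true, while an unrestricted `cast_value("792483364", **options)` raises, which matches the crash reported above.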

roll commented 5 years ago

@cbenz Please try goodtables@2.2.1