Overview
For now, `table.read` fails with a `tableschema.Error` on the first cast error. We'd like to be able to get all errors from the `table.read()` call.

Here is an example of how it's implemented for `tabulator` with a `force_parse` option: https://github.com/frictionlessdata/tabulator-py#force-parse. We could, for example, use a `force_cast` option.

From @spilio
It would be really useful if there were a configuration option for the `Table.read()` method that allowed non-blocking execution, i.e. one that would not stop on the first error but would instead return a collection of all errors from validating the full stream.

There are scenarios in which it is really difficult to have users fix one row at a time. It also adds effort and complexity on the integrator's side: the only way to mimic such behavior is a recursive approach, which is inefficient because it involves opening the stream multiple times and resuming reading from the line after the last failed one.
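
To make the requested semantics concrete, here is a minimal, library-independent sketch in Python. It does not use the actual tableschema API: `read_rows`, `CastError`, the `(name, caster)` field pairs, and the way `force_cast` is wired in are all hypothetical names chosen for illustration. The idea is simply that one pass over the stream either raises on the first cast error (today's behavior) or collects every error and keeps going.

```python
from typing import Any, Callable, Iterable, List, Sequence, Tuple


class CastError(Exception):
    """Hypothetical stand-in for a single cell that failed to cast."""

    def __init__(self, row_number: int, field: str, value: Any) -> None:
        super().__init__(
            f"row {row_number}: cannot cast {value!r} for field {field!r}"
        )
        self.row_number = row_number
        self.field = field
        self.value = value


def read_rows(
    rows: Iterable[Sequence[Any]],
    fields: Sequence[Tuple[str, Callable[[Any], Any]]],
    force_cast: bool = False,
) -> Tuple[List[List[Any]], List[CastError]]:
    """Cast every row in a single pass (hypothetical helper).

    With force_cast=False the first cast error is raised immediately.
    With force_cast=True errors are collected and reading continues;
    rows containing at least one bad cell are left out of the result.
    Rows are numbered from 1, counting data rows only.
    """
    cast_rows: List[List[Any]] = []
    errors: List[CastError] = []
    for row_number, row in enumerate(rows, start=1):
        cast_row: List[Any] = []
        row_ok = True
        for (name, caster), value in zip(fields, row):
            try:
                cast_row.append(caster(value))
            except (TypeError, ValueError) as exc:
                error = CastError(row_number, name, value)
                if not force_cast:
                    raise error from exc
                errors.append(error)
                row_ok = False
        if row_ok:
            cast_rows.append(cast_row)
    return cast_rows, errors


# Example data: the second and third rows each contain one bad cell.
fields = [("id", int), ("price", float)]
rows = [["1", "9.5"], ["x", "2.0"], ["3", "oops"]]

good, errors = read_rows(rows, fields, force_cast=True)
for error in errors:
    print(error)
```

With `force_cast=True` the stream is read once and all problems can be reported to the user together, which avoids re-opening the stream after each failure. Whether invalid rows should be dropped, returned as-is, or returned alongside their errors is a design choice this sketch leaves open.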