Closed roll closed 7 years ago
@roll
In the original code that spawned these libs, fail fast was used to stop iteration over a table on first error, as opposed to collecting all errors and returning all. This is about work speed, and is not just about a different API. Consider a table with 1GB of data and errors in the 100th row, and then other errors in 1 millionth row and beyond. With fail fast, the iteration will stop and raise almost immediately.
I don't know if the fail_fast flag is now applied in other places, but, for table iteration in jts, dp and good tables, it makes perfect sense IMHO.
@pwalsh
Yes but now we have iter
concept where fail_fast seems not applicable. This thing is not fully clear for me for now. There is also - https://github.com/frictionlessdata/tabulator-py/issues/98
Some thinking needed)
But I suppose fail_fast in descriptor validation is not really important concept. We just need to find the best API to show errors. JS and Ruby libs just get all errors as a list. It seems handy.
in descriptor validation it is not relevant at all, agreed.
in table iteration, I don't see why iter
concept makes any difference (in the original implementation it was the same).
@pwalsh Let's sync on this when we will be on next tabulator/jts iteration. I've added issue after working on https://github.com/frictionlessdata/datapackage-js/issues/13 - to have impl in sync
COVERED in separate issues
Overview
For now
fail_fast
is used in:Both are not about work speed because even without
fail_fast
row or descriptor will be processed very fast. So it's only for provide different API based on this flag.E.g. if
fail_fast=True
validate will raiseValidationError
otherwiseMultipleInvalid
witherrors
attribute. This API seems complicated. JavaScript library doesn't use this flag: https://github.com/frictionlessdata/jsontableschema-js#validateSo we could simplify API always raising list of errors for this functions (e.g. in
ValidationErrors
andCastErrors
exceptions).