Open tovrstra opened 3 years ago
I checked out the issue on np.isclose
. I agree that the default atol
should be zero, because it is a potential source of bugs and major confusion. (See NumPy issue for details)
Looking further at the issue, there are a few ways to perform the comparison:
# np.isclose
abs(a - b) <= (atol + rtol * abs(b))
# math.isclose (symmetric form)
abs(a - b) <= max(rtol * max(abs(a), abs(b)), atol)
# third option
abs(a - b) <= max(rtol * abs(b), atol)
Any of these would be acceptable, as long as the default atol is zero. Some are just slightly nicer than others.
For our use case, the third option might be conceptually ideal, because it avoids the weird sum used innp.isclose
and it reflects correctly that b is somehow special, because it is the expected value. The symmetric form is the nicest when the the compared numbers have somehow equal status. The first form may occasionally confuse the user, but that risk is small.
I'd be pragmatic and not care too much about the three different forms, just impose a default atol
of zero, and use np.isclose
.
I'm, copying the relevant bits from #72 below:
It is possible and fairly easy to support the functionality of
dataframe_regression
,num_regression
andndarrays_regression
in one regression fixture. E.g. to be used as follows:Future extensions could include support for HDF5 or user-defined container formats. The name
array_regression
is intended to be general, i.e. no reference to the dtype, leaving room for future support of more dtypes.This could also be a good opportunity to revise default tolerances in the new API as mentioned by @tadeu:
With the introduction of a new API like array_regression, it would be a good moment to improve the default tolerances. Currently we use the same defaults of
numpy.isclose
, and having anatol
that is not 0.0 by default is confusing (see numpy/numpy#10161 for example, we might even consider something different than isclose).Other changes can be considered too, obviously. (E.g. NPZ as default, because it supports more array shapes.) Feel free to comment.
For code reuse, it could be preferable to rewrite fixtures
dataframe_regression
,num_regression
using the same underlyingArrayRegressionFixture
class. This is probably not strictly necessary.The
ndarrays_regression
fixture can eventually be removed because it was not part of any release yet (at the moment).