Open TomNicholas opened 1 month ago
+1 for the direction.
Presumably the code in #4 will eventually end up in cubed-xarray, but you are keeping it here while the design evolves?
Yes, indeed, it helps to have multiple examples while writing up the actual tests. So far we also expect to move to `xarray` and archive this repository once we're confident that the structure of the testing framework is okay.
By the way, the tests in #4 already exposed some issues in `cubed` / `cubed-xarray` (not sure which, could also be `xarray`).
Hi @zac-HD and @tomwhite :wave:
@keewis and I spent today trying to design a duck array test suite for xarray using hypothesis. This (temporary) repo contains the sketch of what we're trying to do, with the eventual aim being that this sort of code lives upstream in xarray and in downstream duck array libraries like pint/dask/sparse/cubed etc.
The problem we are trying to solve is that we want to create a test suite that can be used to test xarray's wrapping of any duck array type, including cubed and pint as representative examples. We want this test suite to:
1. […] `scipy.skew` via a new reduction method `da.skew`, we would want to be able to add a test for this to the test suite for duck array libraries just by changing code in the xarray repository.
2. […] `cubed-xarray` and `pint-xarray`). That way xarray devs don't have to maintain all the tests, failures are first reported downstream, and it's on the devs of those packages to report any failures upstream in xarray if they think it's actually xarray's fault.
3. […] `TestDatasetReductions` class and that automatically runs many different reductions on many different xarray objects.
4. […]
5. […] `.magnitude`, and `cubed/dask` requires calling `.compute`)
6. `cubed.mean()` currently doesn't support taking means of integers (because it's not actually required by the array API standard), but `xarray.DataArray.mean()` expects this to be possible. So we want the test suite to test means of integers but give cubed's downstream tests the opportunity to mark that case as an expected failure.

(4) is the reason why we are using hypothesis, and why we made the hypothesis strategies for generating arbitrary `xarray.Variable` objects.

(4), (5) and (6) are the most difficult parts of this to achieve simultaneously. We need a lot of control over test cases upstream in xarray, but also give a lot of control to the downstream tester to override things.
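
To make the upstream/downstream split concrete, here is a minimal sketch of the subclassing pattern, not the code in `main` or #4. Everything in it is hypothetical: `ReductionTests`, its `create` / `to_numpy` hooks, the `expected_failures` set, and the toy `LazyArray` are made-up names for illustration. To keep it dependency-light it uses one fixed NumPy input per dtype instead of hypothesis strategies, and a plain outcome dictionary instead of pytest's xfail machinery.

```python
import numpy as np


class ReductionTests:
    """Hypothetical upstream base class; downstream libraries subclass it."""

    # Reductions and dtypes to exercise; this list would be extended upstream.
    reductions = ("sum", "mean", "max")
    dtypes = ("int64", "float64")
    # (reduction, dtype) pairs a downstream library marks as expected failures.
    expected_failures = frozenset()

    def create(self, data):
        """Hook: build the duck array under test from a NumPy array."""
        return np.asarray(data)

    def to_numpy(self, result):
        """Hook: convert a result back to NumPy for comparison."""
        return np.asarray(result)

    def run(self):
        """Run every case and return {(reduction, dtype): outcome}."""
        outcomes = {}
        for name in self.reductions:
            for dtype in self.dtypes:
                data = np.arange(6, dtype=dtype).reshape(2, 3)
                expected = getattr(np, name)(data)
                try:
                    result = getattr(self.create(data), name)()
                    np.testing.assert_allclose(self.to_numpy(result), expected)
                    passed = True
                except Exception:
                    passed = False
                marked = (name, dtype) in self.expected_failures
                if passed:
                    outcomes[(name, dtype)] = "xpass" if marked else "pass"
                else:
                    outcomes[(name, dtype)] = "xfail" if marked else "fail"
        return outcomes


class LazyArray:
    """Toy stand-in for a lazy duck array (not a real library)."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def compute(self):
        return self._data

    def sum(self):
        return LazyArray(self._data.sum())

    def max(self):
        return LazyArray(self._data.max())

    def mean(self):
        # Mimics a library that does not support means of integers.
        if self._data.dtype.kind in "iu":
            raise TypeError("mean of integer arrays is not supported")
        return LazyArray(self._data.mean())


class LazyReductionTests(ReductionTests):
    """What a downstream package would write: hooks plus xfail marks."""

    expected_failures = frozenset({("mean", "int64")})

    def create(self, data):
        return LazyArray(data)

    def to_numpy(self, result):
        return result.compute()


outcomes = LazyReductionTests().run()
for case, outcome in sorted(outcomes.items()):
    print(case, outcome)
```

In this sketch the downstream subclass only overrides the two hooks and declares the integer-mean case as an expected failure, while the list of reductions stays under upstream control. A real version would presumably draw inputs from hypothesis strategies and report expected failures through the test runner.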
We would appreciate it if you could take a look at what we have done (both in `main` and in #4) and tell us if you think we are headed in a good direction or not?

xref https://github.com/pydata/xarray/pull/4972 https://github.com/pydata/xarray/pull/6908 https://github.com/cubed-dev/cubed-xarray/issues/20