Closed lukegre closed 3 years ago
I'll give this a look now and see what the deal is...
Okay, just putting some notes here as I work through this.
Replicating a broken test... this breaks it:
import numpy as np
import xarray as xr

import esmtools


def gridded_da_float():
    """Mock data of gridded time series in float time."""
    # Wrapper so fixture can be called multiple times.
    # https://alysivji.github.io/pytest-fixures-with-function-arguments.html
    data = np.random.rand(60, 3, 3)
    da = xr.DataArray(data, dims=["time", "lat", "lon"])
    # Annual resolution time axis for 60 years.
    da["time"] = np.arange(1900, 1960)
    return da
# false - float - false
data = gridded_da_float().chunk()
x = data["time"]
y = data.isel(lat=0, lon=0)
X, _ = esmtools.stats._convert_time_and_return_slope_factor(x, "time")
Y = y
def _rm_poly(x, y, order, nan_policy):
    # Debug print: shows what apply_ufunc actually passes in for x.
    print(x)
    fit = esmtools.stats._polyfit(x, y, order, nan_policy)
    return y - fit
dim = "time"
test = xr.apply_ufunc(
    _rm_poly,
    X,
    Y,
    2,  # order
    "none",  # nan_policy
    vectorize=True,
    dask="parallelized",
    input_core_dims=[[dim], [dim], [], []],
    output_core_dims=[[dim]],
    output_dtypes=[float],
)
Setting dask='allowed' fixes it but causes other tests to break. The weird thing is that with dask='parallelized' here and dask arrays, it passes empty arrays to the _rm_poly func. The print statement in _rm_poly shows an empty list for parallelized and a full list for allowed. I haven't seen this behavior before. I don't know why it's doing this.
See https://github.com/bradyrx/esmtools/pull/102. This should fix it. Looks like there was some dask or xarray update that stays true to vectorize. So if you pass a single time series in with input_core_dims=['time'] and vectorize=True, it doesn't know how to split it up.
With a gridded product, it passes in each grid cell separately. The fix just detects whether the input is gridded. This seems like a kludge but should be fine.
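For anyone reading later, the "detect whether it's gridded" check could look roughly like the sketch below. This is not the code from the PR: should_vectorize is a hypothetical helper name, and it works on a plain tuple of dimension names (what da.dims would give you) so that it runs without xarray or dask installed.

```python
def should_vectorize(dims, core_dim):
    """Return True when `dims` has dimensions besides `core_dim`.

    Hypothetical helper, not part of esmtools. A gridded array such as
    ("time", "lat", "lon") has extra dimensions to loop over, while a
    single time series ("time",) has none.
    """
    if core_dim not in dims:
        raise ValueError(f"{core_dim!r} not found in dims {tuple(dims)}")
    return any(d != core_dim for d in dims)


# Gridded product: extra dimensions exist, so looping per grid cell makes sense.
gridded = should_vectorize(("time", "lat", "lon"), "time")
# Single time series: only the core dimension, nothing to loop over.
single = should_vectorize(("time",), "time")
print(gridded, single)
```

The result would then be passed as the vectorize argument of xr.apply_ufunc instead of hard-coding vectorize=True, assuming the rest of the call stays as in the reproduction above.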
Overview
Tests for stats.rm_poly fail when x and y are given as input arrays and y is a dask array. Encountered on PR #99, resulting in 12 failed tests.
The test conditions below result in failure:
esmtools/tests/test_stats_poly.py::test_rm_poly_against_time_dask[<any>-<any>-False]
For more info see: https://travis-ci.com/github/bradyrx/esmtools/builds/198414251
The tests pass if y.load() is forced. So somehow dask is the culprit, though not sure how to feasibly solve the problem.
Traceback