Closed jkoell closed 1 month ago
I get an error that seems to suggest that the positional arguments are passed as chunks, while the keyword argument passes the entire DataArray at once.
Yes exactly. You'll need to pass it positionally.
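To see the difference for yourself, a small probe function can record what it receives on each call. This is only a sketch: `probe`, the synthetic dataset, and the chunk size of 10 are made up for illustration (chosen to match the shapes in the error message), and it assumes `dask` is installed.

```python
import numpy as np
import xarray as xr

seen = []  # one (type of x1, shape of x1, type of x2, shape of x2) tuple per call


def probe(x1, *, x2):
    # Hypothetical stand-in for a function with a keyword-only array argument.
    seen.append((type(x1).__name__, x1.shape, type(x2).__name__, x2.shape))
    return x1


dims = ("time", "lat", "lon")
data = xr.Dataset(
    {"x1": (dims, np.ones((100, 5, 8))), "x2": (dims, np.ones((100, 5, 8)))}
)
data_ch = data.chunk({"time": 10})

out = xr.apply_ufunc(
    probe,
    data_ch["x1"],                   # positional: split into blocks
    kwargs={"x2": data_ch["x2"]},    # keyword: forwarded to probe unchanged
    dask="parallelized",
    output_dtypes=[float],
)
out.compute()
print(set(seen))
```

Each recorded call should show `x1` as a NumPy block of shape (10, 5, 8) and `x2` as the full DataArray of shape (100, 5, 8), which is the mismatch behind the broadcast error.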
Ok, thanks for the quick response!
Is there any way to get around passing them positionally? As I mentioned above, these are keyword-only arguments, i.e. I can't pass them positionally. Obviously in my simple example I could just rewrite the squared_sum function to accept them as positional arguments rather than keyword-only. But in my actual code I'm using a repo developed by someone else, so I would prefer not to change the underlying function.
You can use a tiny wrapper or adapter function to translate between positional and keyword args
That worked, thanks! Below is the solution for my minimal example, in case anyone is searching for the problem in the future and comes across this thread:
# Adapter: takes x2 positionally so apply_ufunc chunks it, then forwards it as a keyword
def ss_wrapper(x1, x2):
    return squared_sum(x1, x2=x2)

out2 = xr.apply_ufunc(ss_wrapper, data_ch['x1'], data_ch['x2'], dask='parallelized')
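For anyone who wants to run the wrapper approach end to end, here is a self-contained sketch. The body of `squared_sum` and the synthetic dataset are assumptions, since the original minimal example is not reproduced in this thread; `output_dtypes` is passed explicitly only to stay compatible with older xarray versions.

```python
import numpy as np
import xarray as xr


# Assumed stand-in for the third-party function: x2 is keyword-only.
def squared_sum(x1, *, x2):
    return x1**2 + x2**2


# Adapter: accepts x2 positionally and forwards it as a keyword.
def ss_wrapper(x1, x2):
    return squared_sum(x1, x2=x2)


rng = np.random.default_rng(0)
dims = ("time", "lat", "lon")
data = xr.Dataset(
    {
        "x1": (dims, rng.random((100, 5, 8))),
        "x2": (dims, rng.random((100, 5, 8))),
    }
)
data_ch = data.chunk({"time": 10})  # chunked along time only

out2 = xr.apply_ufunc(
    ss_wrapper,
    data_ch["x1"],
    data_ch["x2"],  # now positional, so it is chunked like x1
    dask="parallelized",
    output_dtypes=[float],
)

result = out2.compute()
expected = data["x1"] ** 2 + data["x2"] ** 2
print(np.allclose(result.values, expected.values))
```

Because both arrays are passed positionally, each call to `ss_wrapper` receives matching blocks, and the result stays lazy until `compute()` is called.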
What happened?
I've been trying to make my code more memory efficient, and as part of that am trying to work more with chunked arrays and apply_ufunc. I'm using a function that does calculations on my dataset and returns new variables with the same dimensions. The function has several inputs with dims [time, lat, lon] that are taken as positional arguments, and some optional ones with the same dimensions that are taken as keyword-only arguments. When I input data that are chunked in time, I get an error that seems to suggest that the positional arguments are passed as chunks, while the keyword argument passes the entire DataArray at once. E.g. in the minimal example below, the error is
operands could not be broadcast together with shapes (10, 5, 8) (100, 5, 8)
I couldn't find anything in the apply_ufunc documentation or the apply_ufunc tutorial that discussed this specific problem.
What did you expect to happen?
I expected all data to be passed as chunks, and to get back a chunked DataArray of the same size as the input.
Minimal Complete Verifiable Example
Anything else we need to know?
I also tried setting dask_gufunc_kwargs={'allow_rechunk': True}, but still receive the same error.
Environment