Open jeremiah-corrado opened 1 month ago
looking at the docstring of the chunk manager base class, it appears this is not used for all reductions: https://github.com/pydata/xarray/blob/d26144db962d9d620c7f7c4e7398c2f6b3bf6d47/xarray/namedarray/parallelcompat.py#L386-L390
In other words, this seems to be expected behavior? I wonder if var.reduce(np.mean)
would go through the reduction method, though.
In other words, this seems to be expected behavior?
Reminding myself of how this works, I think yes, this is expected. The array type knows how to do a chunked mean
on a chunked array without any further guidance from xarray, so it's fine to just use the array API to call the correct implementation of mean
.
I wonder if
var.reduce(np.mean)
would go through the reduction method, though.
It won't (see NamedArray.reduce
)
but I also don't think it needs to either. reduce
is really a special case of the more general reduction
primitive. reduction
accepts the func
kwarg that reduce
accepts, but also accepts combine_func
and aggregate_func
. Basically these allow tree-reductions made of alternating steps, rather than just the same function applied over and over. This comes from dask.array.reduction
, and is needed in the chunkmanager because it is used in xarray (only once though apparently, in the implementation .first
and .last
methods).
If arkouda chose not to implement ArkoudaChunkManager.reduction
I think the only thing that would break would be the .first
and .last
methods. .reduce
would in general work, but only if the function you passed into .reduce
knew how to handle the Arkouda Array type.
So actually other than documenting / testing this a bit better I believe there is not actually a bug to be fixed here?
I think the only thing that would break would be the .first and .last methods. .reduce would in general work, but only if the function you passed into .reduce knew how to handle the Arkouda Array type.
Okay that makes sense. Thank you both for digging into the implementation/docs to figure that out. I'll work on implementing reduction
properly in Arkouda's chunk manager so that those methods work.
I believe there is not actually a bug to be fixed here?
I think that's correct
What happened?
The arkouda-xarray chunk manager (found here) is designed to allow Arkouda's distributed arrays to operate as chunked arrays in XArray. (Note Arkouda's array's also implement the array API, which was being used for XArray interoperability before the chunk manager was created).
When executing various operations on XArray DataArrays or DataSets that are backed by the Arkouda chunk manager, I noticed that they are not being routed through the chunk manager itself. For example, calling
myArkoudaBackedArray.mean(axis="some-axis")
does not get routed throughArkoudaManager.reduction
. However, the operation is successful and produces the correct result, indicating that the array-api implementation ofmean
is being used instead (i.e., the arkouda-backed array is being treated as a duck array).https://github.com/jeremiah-corrado/arkouda-xarray-seasonal-avgs-example/blob/main/README.md contains a full example where I observed the above behavior. It includes instructions to get Arkouda set up for XArray interoperability.
pinging @TomNicholas, since we discussed this the other day.
MVCE confirmation
Relevant log output
No response
Anything else we need to know?
No response
Environment