`ValueError` when passing any `dim` to `mosaic`

aazuspan commented 2 years ago

Because stackstac.mosaic defaults to axis=0, passing any dim causes DataArray.reduce to raise ValueError: cannot supply both 'axis' and 'dim' arguments.

Reproducible with:

import xarray as xr
import stackstac

ds = xr.tutorial.open_dataset("air_temperature").to_array()
stackstac.mosaic(ds, dim="variable")

Passing axis=None is a workaround, but that's not documented or immediately apparent from the error.

I'm happy to make a PR for this. My thought is to set axis = None if dim is not None else axis to ensure both aren't passed, and update the docstring to make it clear that dim overrides axis.

gjoseph92 commented 2 years ago

Thanks for the catch. Having dim override axis seems reasonable enough. A PR would be great! If you do, mind testing in the existing tests in test_mosaic.py? It was definitely an oversight to not test the dim= parameter in test_fuzz_mosaic. If I'd done that, we would have caught it earlier. Maybe something like:

diff --git a/stackstac/tests/test_mosaic.py b/stackstac/tests/test_mosaic.py
index 9378f99..00e630f 100644
--- a/stackstac/tests/test_mosaic.py
+++ b/stackstac/tests/test_mosaic.py
@@ -46,9 +46,14 @@ def test_mosaic_dtype_error(dtype: np.dtype):
     st_stc.raster_dtypes,
     st_np.array_shapes(max_dims=4, max_side=5),
     st.booleans(),
+    st.booleans(),
 )
 def test_fuzz_mosaic(
-    data: st.DataObject, dtype: np.dtype, shape: Tuple[int, ...], reverse: bool
+    data: st.DataObject,
+    dtype: np.dtype,
+    shape: Tuple[int, ...],
+    reverse: bool,
+    use_dim: bool,
 ):
     """
     See if we can break mosaic.
@@ -65,7 +70,8 @@ def test_fuzz_mosaic(
     chunkshape = data.draw(
         st_np.array_shapes(
             min_dims=arr.ndim, max_dims=arr.ndim, max_side=max(arr.shape)
-        ), label="chunkshape"
+        ),
+        label="chunkshape",
     )
     axis = data.draw(st.integers(-arr.ndim + 1, arr.ndim - 1), label="axis")

@@ -73,8 +79,13 @@ def test_fuzz_mosaic(
     split_every = data.draw(st.integers(1, darr.numblocks[axis]), label="split_every")
     xarr = xr.DataArray(darr)

+    if use_dim:
+        kwargs = dict(dim=xarr.dims[axis])
+    else:
+        kwargs = dict(axis=axis)
+
     result = mosaic(
-        xarr, axis=axis, reverse=reverse, nodata=fill_value, split_every=split_every
+        xarr, reverse=reverse, nodata=fill_value, split_every=split_every, **kwargs
     )
     assert result.dtype == arr.dtype
     result_np = mosaic(xr.DataArray(arr), axis=axis, reverse=reverse, nodata=fill_value)

aazuspan commented 2 years ago

Yeah, I'll add that coverage. I haven't worked with hypothesis before, but I've been looking for a reason to try it out. Thanks for outlining the changes.

gjoseph92 / stackstac

`ValueError` when passing any `dim` to `mosaic` #148