lemma-osu / sknnr-spatial

https://sknnr-spatial.readthedocs.io

Allow for more flexible application of ufuncs across images #27

Closed aazuspan closed 2 months ago

aazuspan commented 3 months ago

This was proposed by @grovduck to leverage the raster processing workflow in the ImageWrapper classes (flattening, filling NaNs, building NoData masks, applying a function, unflattening, and masking NoData) outside of estimator methods. The initial goal was to calculate weighted attributes from a pre-computed neighbor image using Dask, but ideally it would work for any function that takes and returns 2D NumPy arrays of shape (pixel, band).
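For illustration, the workflow listed above can be sketched in plain NumPy. This is only a rough sketch, not the ImageWrapper code: the function name `apply_across_bands`, its parameters, the band-last `(y, x, band)` layout, and the choice to mask a pixel when any band is missing are all assumptions made for the example.

```python
import numpy as np


def apply_across_bands(image, func, nodata=None, fill_value=0.0):
    """Apply `func`, which maps (pixel, band) -> (pixel, band), to a (y, x, band) array.

    Hypothetical helper: the name, signature, and band-last layout are assumptions.
    """
    height, width, n_bands = image.shape

    # Flatten the spatial dimensions: (y, x, band) -> (pixel, band)
    flat = image.reshape(-1, n_bands).astype(np.float64)

    # Build the NoData mask: a pixel is masked if any band is NaN or equals `nodata`
    mask = np.isnan(flat).any(axis=1)
    if nodata is not None:
        mask |= (flat == nodata).any(axis=1)

    # Fill masked pixels so that `func` never sees NaN or NoData values
    filled = np.where(mask[:, None], fill_value, flat)

    # Apply the function and copy the result so it can be masked safely
    result = np.array(func(filled), dtype=np.float64)

    # Mask NoData pixels in the output, then unflatten back to (y, x, band)
    result[mask] = np.nan
    return result.reshape(height, width, -1)


# Usage: sum the bands of a 2 x 2 image with 3 bands and one missing value
image = np.arange(12, dtype=float).reshape(2, 2, 3)
image[0, 1, 2] = np.nan
out = apply_across_bands(image, lambda a: a.sum(axis=1, keepdims=True))
```

The number of output bands is whatever `func` returns, which is why the final reshape leaves the last dimension open.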

We should be able to tackle this by creating a higher-order function (HOF) like apply_ufunc_across_bands that takes 1) an image (NDArray, xr.Dataset, or xr.DataArray), 2) a compatible ufunc, and 3) some metadata arguments to define the output image (e.g. dimension sizes, data types), and uses xarray.apply_ufunc to parallelize that operation across chunks. An HOF in ImageWrapper would wrap the ufunc so that each NumPy chunk is pre-processed, passed to the ufunc, and post-processed.
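A minimal sketch of what that could look like for the xr.DataArray case follows. Only the name apply_ufunc_across_bands and the use of xarray.apply_ufunc come from the proposal; the signature, the `_process_chunk` helper, the `"band"` input dimension, and the `"variable"` output dimension are assumptions for illustration. With a Dask-backed input, `"band"` would need to sit in a single chunk because it is a core dimension.

```python
import numpy as np
import xarray as xr


def _process_chunk(chunk, band_func, nodata=None):
    """Pre-process, apply `band_func`, and post-process one NumPy chunk.

    Hypothetical helper. xarray moves the core dimension last, so `chunk`
    arrives as (..., band).
    """
    spatial_shape = chunk.shape[:-1]
    flat = chunk.reshape(-1, chunk.shape[-1]).astype(np.float64)

    # Mask pixels where any band is NaN or NoData, and fill them before applying
    mask = np.isnan(flat).any(axis=1)
    if nodata is not None:
        mask |= (flat == nodata).any(axis=1)
    filled = np.where(mask[:, None], 0.0, flat)

    result = np.array(band_func(filled), dtype=np.float64)
    result[mask] = np.nan
    return result.reshape(*spatial_shape, -1)


def apply_ufunc_across_bands(image, band_func, output_bands, nodata=None):
    """Apply a (pixel, band) -> (pixel, band) function to a DataArray (assumed signature)."""
    return xr.apply_ufunc(
        _process_chunk,
        image,
        kwargs={"band_func": band_func, "nodata": nodata},
        input_core_dims=[["band"]],
        # A new dimension name lets the band count differ between input and output
        output_core_dims=[["variable"]],
        dask="parallelized",
        output_dtypes=[np.float64],
        dask_gufunc_kwargs={"output_sizes": {"variable": output_bands}},
    )


# Usage with an in-memory (band, y, x) DataArray containing one missing value
data = np.arange(12, dtype=float).reshape(3, 2, 2)
data[0, 0, 1] = np.nan
da = xr.DataArray(data, dims=("band", "y", "x"))
out = apply_ufunc_across_bands(da, lambda a: a.sum(axis=1, keepdims=True), output_bands=1)
```

The metadata arguments (`output_bands` and the output dtype) are what let Dask build the output lazily without calling the function. NDArray and xr.Dataset inputs would need their own handling, which this sketch leaves out.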

Compared to the current approach of pre- and post-processing each image type natively (i.e. xr.DataArrays are flattened as xr.DataArrays), being able to process everything at the NumPy level would simplify things substantially. We would of course still need some type-specific behavior like parsing NoData values, assigning coordinate names, etc.

Performance is an important open question here that may determine whether this is worth pursuing. A quick test suggested that prediction with a small array was substantially quicker with the proposed approach than with the current implementation, but we'll need to see how that scales and whether it applies to other operations. If it ends up slowing things down, we may need to consider other ways of exposing the processing API.

grovduck commented 3 months ago

@aazuspan, this is awesome to see! Sorry I was out of the office today - I can engage on this next week.