Open perrygeo opened 9 years ago
Work started in multiband
branch
Still need:
band_num
to band
change[1, 2, 3]
array.mean(axis=(-1, -2)) # Works; the mean of the last two axes
masked.mean(axis=(-1, -2)) # TypeError: tuple indices must be integers, not tuple
There is an open PR (https://github.com/numpy/numpy/pull/5706) that serves as a proof of concept fix. This would need to be applied to all the functions used by the stats:
min, max, mean, count, std, sum, median, unqiue, percentile
Maybe looping through the bands is the only viable approach for now - even if we could make those changes and get them merged, it will be a while before they make it into mainstream releases.
+1 for reopening this one.
For my use case it would be crucial to have some multiband feature. For instance, I would like to perform stand based statistics on vector values, e.g. pixel brightness (combination of RGB values), pixel hue, etc. These cannot be independently calculated by bands and combined later, the transformation must be before aggregation.
(Returning vector values would be also nice, but this could be worked around, calculating vector, and return selected vector component.)
zonalstats
was originally optimized for large vector feature collections against a single raster band (i.e. thousands of polygons against a DEM raster)There is another use case which
zonalstats
doesn't handle so well: many rasters/bands against a single geometry (i.e. the value at a single point for thousands of climate model simulations or a multispectral image). You can loop through and call zonal stats on each raster/band but this incurs the overhead of the rasterio.open each time. A more efficient mechanism might be to stack all the data into a multi-band raster and open it once, slice the 3D array and don't apply the aggregate functions to the 3rd dim - return an array rather than a scalar statistic.I'll have to do some research to see how much refactoring/code clutter this would cause. Otherwise, it might just be a new
zonalstats_multiband
function.