natcap / pygeoprocessing

Geoprocessing operations for Python

Proposal: options for `zonal_statistics` rasterization algorithm #312

Open emlys opened 1 year ago

emlys commented 1 year ago

how it works now

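zonal_statistics rasterizes the aggregate polygons. GDAL provides two options for this, controlled by ALL_TOUCHED: with ALL_TOUCHED=FALSE (the default), a pixel is assigned to a polygon only if its center point falls inside the polygon; with ALL_TOUCHED=TRUE, every pixel the polygon touches is assigned to it.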

zonal_statistics uses ALL_TOUCHED=False. Some aggregate polygons can end up with no pixels at all (if they don't cover any pixel's center point). zonal_statistics handles this case like so:
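
To make the empty-polygon case concrete, here is a toy sketch of the two selection rules. It assumes axis-aligned rectangular polygons on a grid of unit-sized pixels; `covered_pixels` is a hypothetical helper written for this comment and is not how GDAL rasterizes internally.

```python
def covered_pixels(rect, n_cols, n_rows, all_touched=False):
    """Return the set of (col, row) pixels selected for a rectangle.

    rect is (xmin, ymin, xmax, ymax) in pixel units. Pixel (col, row)
    spans [col, col + 1] x [row, row + 1], so its center is at
    (col + 0.5, row + 0.5). Illustration only, not GDAL's algorithm.
    """
    xmin, ymin, xmax, ymax = rect
    selected = set()
    for row in range(n_rows):
        for col in range(n_cols):
            if all_touched:
                # Select the pixel if its area overlaps the rectangle at all.
                hit = (col < xmax and col + 1 > xmin and
                       row < ymax and row + 1 > ymin)
            else:
                # Select the pixel only if its center is inside the rectangle.
                cx, cy = col + 0.5, row + 0.5
                hit = xmin <= cx <= xmax and ymin <= cy <= ymax
            if hit:
                selected.add((col, row))
    return selected


# A polygon smaller than one pixel that misses the pixel center.
small_polygon = (0.6, 0.6, 0.9, 0.9)
center_rule = covered_pixels(small_polygon, 3, 3, all_touched=False)
touch_rule = covered_pixels(small_polygon, 3, 3, all_touched=True)
```

With the center-point rule the small polygon gets no pixels, while the all-touched rule assigns it the one pixel it sits in.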

for each polygon with no pixels:
    calculate the polygon's bounding box
    read in the raster data within that bounding box
    calculate stats based on that window of data
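
The fallback above could be sketched roughly as follows. This is a simplified stand-in, not the actual pygeoprocessing code: it assumes a north-up GDAL-style geotransform, represents the raster as a plain list of rows, and the helper names `bbox_to_window` and `window_stats` are made up for illustration.

```python
import math


def bbox_to_window(bbox, geotransform, n_cols, n_rows):
    """Convert a georeferenced bounding box to a pixel window.

    Assumes a north-up geotransform of the form
    (origin_x, pixel_width, 0, origin_y, 0, pixel_height), pixel_height < 0.
    Returns (col_start, row_start, col_stop, row_stop), stops exclusive.
    """
    xmin, ymin, xmax, ymax = bbox
    origin_x, pixel_w, _, origin_y, _, pixel_h = geotransform
    col_start = max(0, math.floor((xmin - origin_x) / pixel_w))
    col_stop = min(n_cols, math.ceil((xmax - origin_x) / pixel_w))
    # pixel_h is negative, so the top edge (ymax) maps to the first row.
    row_start = max(0, math.floor((ymax - origin_y) / pixel_h))
    row_stop = min(n_rows, math.ceil((ymin - origin_y) / pixel_h))
    return col_start, row_start, col_stop, row_stop


def window_stats(raster, window, nodata=None):
    """Compute simple stats over every valid pixel in the window."""
    col_start, row_start, col_stop, row_stop = window
    values = [
        value
        for row in raster[row_start:row_stop]
        for value in row[col_start:col_stop]
        if value != nodata
    ]
    if not values:
        return {'count': 0, 'sum': 0, 'min': None, 'max': None}
    return {'count': len(values), 'sum': sum(values),
            'min': min(values), 'max': max(values)}


# 3x3 raster with 10-unit pixels and its top-left corner at (0, 30).
raster = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
geotransform = (0, 10, 0, 30, 0, -10)
window = bbox_to_window((8, 8, 12, 12), geotransform, 3, 3)
stats = window_stats(raster, window)
```

Note that the window covers every pixel the bounding box touches, so polygons handled this way are effectively treated with an all-touched rule applied to their bounding box rather than to their actual shape.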

proposed changes

Add a kwarg all_touched=False to zonal_statistics. Pass this value to gdal.RasterizeLayer. Remove the special handling of unset polygons.
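
A minimal sketch of what passing the flag through might look like. The helpers `rasterize_options` and `rasterize_aggregate_layer` are hypothetical names, and the band list and burn value are placeholders; only `gdal.RasterizeLayer` and its `ALL_TOUCHED` option come from GDAL itself.

```python
def rasterize_options(all_touched=False):
    """Build the option list for gdal.RasterizeLayer (hypothetical helper)."""
    return ['ALL_TOUCHED=%s' % ('TRUE' if all_touched else 'FALSE')]


def rasterize_aggregate_layer(raster, layer, all_touched=False):
    """Rasterize `layer` onto band 1 of `raster` (illustrative only).

    `raster` is an open gdal.Dataset and `layer` an ogr.Layer. The burn
    value of 1 is a placeholder, not what zonal_statistics burns.
    """
    # Imported here so the option helper can be used without GDAL installed.
    from osgeo import gdal
    gdal.RasterizeLayer(
        raster, [1], layer, burn_values=[1],
        options=rasterize_options(all_touched))


default_options = rasterize_options()
touched_options = rasterize_options(all_touched=True)
```

Keeping the default at False would preserve the current rasterization for existing callers, although polygons that previously fell back to the bounding-box handling would need a defined result once that handling is removed.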



dcdenu4 commented 1 year ago

James shared a link for the gdal rasterization source:

dcdenu4 commented 1 year ago

Dave shared some background on what QGIS is doing: It is discussed in this SO post:

davemfish commented 1 year ago

emlys commented 1 year ago

We determined that this needs some more information to make a decision -

davemfish commented 1 year ago

Here's another zonal stats library to keep an eye on. I think python bindings are in the works.

Their readme also includes a comparison of other implementations.