OSGeo / gdal

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
https://gdal.org
Other
4.78k stars 2.5k forks source link

gdal_rasterize ALL_TOUCHED option seems to do the opposite of what I'd expect? #10649

Closed GeoSander closed 1 week ago

GeoSander commented 2 weeks ago

What is the bug?

Perhaps this is not a bug, and I may be missing something here (in which case something may actually be missing from the documentation), but here's my problem:

When running gdal_rasterize using the -at flag (ALL_TOUCHED), I would expect to get a raster that typically has more burned pixels than if I were to omit that flag. However, what I observe is the opposite: without the -at flag, I get more burned pixels (i.e. larger clusters) than without it.

I performed the same rasterization in QGIS (Processing toolbox) to demonstrate the issue:

image

The white area has no data, the yellow area has a value of 0 (which is not no data in my case), and the red line pattern area has a value of 250. Note that the yellow and red areas are multipolygons: in this case, there are only 2 features in this dataset.

The blue pixels show the result of running gdal_rasterize with the -at flag. The green area shows the pixels (in addition to the blue ones) that are the result of running gdal_rasterize without the -at flag (so cell center only).

Steps to reproduce the issue

Here is a zipped GeoPackage that contains a polygon feature layer (spray_area). The polygon layer is projected (EPSG:28992) and only contains 2 multipart features.

Try this command for the default option (cell center):

gdal_rasterize -a spray_rate -a_nodata 255 -ot Byte -tr 0.25 0.25 -tap -l spray_area values.gpkg cell_center.tif

Then try this command for the ALL_TOUCHED option:

gdal_rasterize -a spray_rate -a_nodata 255 -ot Byte -tr 0.25 0.25 -at -tap -l spray_area values.gpkg all_touched.tif

Versions and provenance

GDAL version 3.9.1 on Windows 11 (from OSGeo4W).

Additional context

I was thinking: because both features are adjacent in my case (i.e. fully touch each other), there may be a burn value conflict at the edges. In that case, is there a rule that states how gdal_rasterize determines the pixel value to burn? For example, does the value of the first feature it encounters win?

GeoSander commented 2 weeks ago

Spam bots..?

rouault commented 2 weeks ago

Spam bots..?

likely... We had a similar issue a few years ago in another repository

rouault commented 2 weeks ago

Locking conversation due to spam bombing...

rouault commented 2 weeks ago

is there a rule that states how gdal_rasterize determines the pixel value to burn? For example, does the value of the first feature it encounters win?

This is actually the contrary. The last encountered features wins, so here as the features appear in the order 250, 0, this is 0 which wins, which perfectly explains the results you get

rouault commented 2 weeks ago

(unlocking)

In that case I will have to order my features prior to running gdal_rasterize, I guess.

You could also use -sql "SELECT * FROM spray_area ORDER BY spray_rate"

GeoSander commented 2 weeks ago

Thanks for your help @rouault!

GeoSander commented 2 weeks ago

Hi @rouault, unfortunately I need to reopen this issue. I tried ordering the features using the SQL statement, but it seems to have no effect (whether I am using ASC or DESC). No errors either though. The areas with value 0 still "win" and the area with value 250 becomes too small (smaller than without -at).

Just to see if -at does work at all, I removed the 0 polygon completely. In that case, the 250 area did grow larger, so that fits.

Any ideas on how to proceed? I was thinking of calling the command multiple times using -add, but not sure if that actually works and it's also not very efficient (there can potentially be a lot of polygons in the dataset with different values).

jratike80 commented 2 weeks ago

Perhaps you should initialize the raster for example into 255 with https://gdal.org/en/latest/programs/gdal_rasterize.html#cmdoption-gdal_rasterize-init. Maybe all cells are now initialized into zero and burning with the 0 polygon does effectively nothing.

GeoSander commented 2 weeks ago

@jratike80 but that's not what I see. The 0 polygon takes precedence over the 250 polygon when -at is applied, so it is definitely burned. Nevertheless, I could try your suggestion to see if it matters.

jratike80 commented 2 weeks ago

What I meant is that if you burn value 0 over initial value 0 then nothing changes, even if values 0 are definetely burned. If you initialize the raster into anything else than 0 or 250 you can sort the pixels into 3 categories: burnt by 0 polygon, burnt by 250 polygon, or not hit at all. But my thinking may be wrong.

jratike80 commented 1 week ago

I had a test with these commands:

gdal_rasterize -a spray_rate -a_nodata 255 -ot Byte -tr 0.25 0.25 -at -tap -init 126 -sql "SELECT * FROM spray_area ORDER BY spray_rate asc" values.gpkg all_touched_asc.tif
gdal_rasterize -a spray_rate -a_nodata 255 -ot Byte -tr 0.25 0.25 -at -tap -init 126 -sql "SELECT * FROM spray_area ORDER BY spray_rate desc" values.gpkg all_touched_desc.tif
gdal_rasterize -a spray_rate -a_nodata 255 -ot Byte -tr 0.25 0.25 -tap -init 126 -sql "SELECT * FROM spray_area ORDER BY spray_rate asc " values.gpkg cell_center_asc.tif
gdal_rasterize -a spray_rate -a_nodata 255 -ot Byte -tr 0.25 0.25 -tap -init 126 -sql "SELECT * FROM spray_area ORDER BY spray_rate desc " values.gpkg cell_center_desc.tif

I had a visul look at some areas and for me the results look as expected. In the all_touched_desc there are more 0 pixels, in all_touched_asc there are more 250 pixels.

The -init value was not important, the geometries cover the whole sprayed area and all the pixels get burned. I think it is still good to have a distinct background value.

Ascending order (blue=0, green:250, red=init):: kuva

Descending order kuva

GeoSander commented 1 week ago

Never mind @jratike80, I figured it out eventually. But thanks for your help anyway!

I am an idiot: I forgot to remove the -l <layername> option, which meant that the -sql option was always being ignored... šŸ¤¦

Made this PR to update the docs and warn for that, while also mentioning that the sort order is important when -at is active. May prevent others from running into trouble...