OSGeo / gdal

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
https://gdal.org
Other
4.85k stars 2.53k forks source link

RasterIO sampling Offset and Size subpixel alignment / AREA_OR_POINT: “Area” vs “Point” #5498

Closed connorjak closed 6 months ago

connorjak commented 2 years ago

Expected behavior and actual behavior.

I use GDAL C++ to load GeoTIFF files, and use RasterIO to copy portions of them into a set of caches. These are used for heightmaps, and the edges of each cache square are meant to contain the same points as the corresponding cache square next to it. Mismatches result in visible discontinuities in my 3D graphics. So, I expect when I set the Offset and Size XY input for TL (using advanced arg floating point window) to 0.003 and 2.05, and 2.053 and 2.05 for BR, the resulting copied data should match exactly at X.

Corners of sampling areas: +
+-------+-------+ 0.003
|       |       |
|  TL   |   TR  |
|       |       |
+-------X-------+ 2.053
|       |       |
|  BL   |   BR  |
|       |       |
+-------+-------+ 4.103

TL: Offset 0.003, Size 2.05
BR: Offset 2.053, Size 2.05

As can be seen here https://github.com/OSGeo/gdal/blob/v3.4.2/gdal/gcore/rasterio.cpp#L775, I believe the AREA_OR_POINT: "Area" assumption is used for any RasterIO function call, which would mean the input points get skewed to the point where my above expectations are not met. I attempted to modify the AREA_OR_POINT metadata elsewhere, but I was not able to figure out the scripting to accomplish this.

GDAL has these operations going on inside RasterIO:

// nBufXSize, nBufYSize, and psExtraArg are input
dfXOff = psExtraArg->dfXOff;
dfYOff = psExtraArg->dfYOff;
dfXSize = psExtraArg->dfXSize;
dfYSize = psExtraArg->dfYSize;
// ...
const double dfSrcXInc = dfXSize / static_cast<double>( nBufXSize );
const double dfSrcYInc = dfYSize / static_cast<double>( nBufYSize );
// ...
const double EPS = 1e-10;
// ...
const double dfSrcXStart = 0.5 * dfSrcXInc + dfXOff + EPS;
// ...
double dfSrcX = dfSrcXStart;
for( int iBufXOff = 0;
     iBufXOff < nBufXSize;
     iBufXOff++, dfSrcX += dfSrcXInc )
{
    // Copy each pixel

I was able to resolve this issue by shifting my sampling coordinates, after much trial and error:

PixelCoord shifted_pixToSampleStart = pixToSampleStart;
PixelCoord shifted_pixToSampleEnd = pixToSampleEnd;

double dfXSize = shifted_pixToSampleEnd.U - shifted_pixToSampleStart.U;
double dfYSize = shifted_pixToSampleEnd.V - shifted_pixToSampleStart.V;
double dfSrcXInc = dfXSize / static_cast<double>(interimdata_U_res-1);
double dfSrcYInc = dfYSize / static_cast<double>(interimdata_V_res-1);
// -1 here to get the after-shifting full size in raster dataset pixel coordinates
// interimdata is the buffer I'm sampling into

shifted_pixToSampleStart.U += -0.5;
shifted_pixToSampleStart.V += -0.5;

shifted_pixToSampleEnd.U += dfSrcXInc - 0.5;
shifted_pixToSampleEnd.V += dfSrcYInc - 0.5;

GDALRasterIOExtraArg arg;
INIT_RASTERIO_EXTRA_ARG(arg);
arg.eResampleAlg = GDALRIOResampleAlg::GRIORA_Bilinear;
arg.bFloatingPointWindowValidity = 1; //TRUE
arg.dfXOff = shifted_pixToSampleStart.U;
arg.dfYOff = shifted_pixToSampleStart.V;
arg.dfXSize = shifted_pixToSampleEnd.U - shifted_pixToSampleStart.U;
arg.dfYSize = shifted_pixToSampleEnd.V - shifted_pixToSampleStart.V;

auto rasterIO_err = poBand->RasterIO(GF_Read,
//...

There should probably be an option in GDALRasterIOExtraArg to change from PIXEL_IS_AREA to PIXEL_IS_POINT sampling.

Steps to reproduce the problem.

  1. Load raster dataset
  2. RasterIO in edge-aligned squares as above
    • With bilinear resampling
    • Using floating point window
  3. Compare edge / corner values

Operating system

Windows 10 Pro 21H2 64 bit 19044.1586

GDAL version and provenance

vcpkg 3.4.1, exported 2022/02/11

connorjak commented 2 years ago

Mostly-related reading:

jratike80 commented 2 years ago

Does gdalinfo report your GeoTIFFs as PixelIsAreaa or PixelIsPoint? If you have wrong metadata in GeoTIFF then see here https://lists.osgeo.org/pipermail/gdal-dev/2019-December/051342.html how it can be fixed.

connorjak commented 2 years ago

Does gdalinfo report your GeoTIFFs as PixelIsAreaa or PixelIsPoint? If you have wrong metadata in GeoTIFF then see here https://lists.osgeo.org/pipermail/gdal-dev/2019-December/051342.html how it can be fixed.

Thanks for the link, I'll try out the scripts from there and see if I can change the behavior. From the browsing through RasterIO so far I haven't found code that takes AREA_OR_POINT into account, but maybe it's in an implementation besides the ones I looked through.

My geotiffs are AREA_OR_POINT=Area.

I guess there are two issues here:

  1. I haven't successfully changed a dataset to AREA_OR_POINT.
  2. Maybe having an option for specifying Point or Area alignment on GDALRasterIOExtraArg would be beneficial.
jratike80 commented 2 years ago

AREA_OR_POINT=Area is wrong for your data. Fix the metadata with gdal_edit -mo AREA_OR_POINT=Point point.tif --config GTIFF_POINT_GEO_IGNORE YES

You will notice that gdalinfo reports now different extents.

But do you want that point X in your sketch should be included in all four tiles TL, TR, BL, BR

connorjak commented 2 years ago

Yes, the edge pixels of each of the tiles should be coinciding. I use these pixels as vertex height values for a 3D mesh per cached tile.

Trying those commands...

jratike80 commented 2 years ago

One may think that you want to get tiles that overlap by one pixel row and therefore you must take that into account when you request the data. But the point vs. polygon dualism in expressing point data as raster pixels which on the other are small polygons which cover an area is difficult. Read some considerations https://github.com/opengeospatial/ogcapi-coverages/issues/92.

connorjak commented 2 years ago

Looks like gdal_edit is a python script, and with my environment (QGIS-installed OSGeo4W Shell) there are some missing/problematic environment variables. I am having similar problems to this issue, but I get 'python' is not recognized as an internal or external command, operable program or batch file, etc when trying their workarounds.

I'm confident I could get gdal_edit working from the command line after some finagling, but that doesn't match my current workflow. My automation for doing preprocessing work on all my datasets is already in python, using osgeo.gdal. Is there an equivalent for these metadata-editing commands in this python library? An alternative I saw suggested was to copy the gdal_edit.py source code and either call it in a subprocess or modify it to call as a python function.

Read some considerations https://github.com/opengeospatial/ogcapi-coverages/issues/92.

My needs match up with the PixelIsPoint assumption as stated in the issue you linked. I am already getting acceptable accuracy from my workaround, but there still is a benefit to removing the workaround for correctness and simplicity.

Another page I came across:

jratike80 commented 2 years ago

Do you still believe that you have found a bug in GDAL or are the main issues a) you have source data that has wrong AREA_OR_POINT metadata and b) you don't know how to fix it?

I am just a somewhat experienced GDAL user who knows nothing about programming but you seem to do. See the source code https://github.com/OSGeo/gdal/blob/master/swig/python/gdal-utils/osgeo_utils/gdal_edit.py and modify for your needs. Another option is to contact the DEM provider and ask is they would like to change the PIXEL_OR_AREA metadata in their product.

connorjak commented 2 years ago

I was suspecting a bug in GDAL (not reacting to AREA_OR_POINT) but now I'll need to try it again with the proper metadata to check if there is a real bug. Thanks for your help in finding the scripts for modifying the metadata.

I also have a feature request for an option to change between PixelIsArea and PixelIsPoint coordinate mapping for specifying RasterIO window extents with an additional argument in RasterIO's GDALRasterIOExtraArg.

jratike80 commented 2 years ago

The pixel_is_point/pixel_is_area support that was added to GDAL 11 years ago is based on the assumption that the metadata in GeoTIFF is correct https://trac.osgeo.org/gdal/wiki/rfc33_gtiff_pixelispoint. I know that there are other GDAL drivers that take care of the half a pixel shifts if the format requires that. I don't know if any other format than GeoTIFF can support either points or areas. If I understand right the additional argument would be needed only for handling bad source data but let's wait if developers have something to say.

If what you need is an overlap of one pixel row between adjacent tiles, have you ever considered to do something like minX(second tile) = maxX(first tile) - pixelXsize?

rouault commented 2 years ago

There should probably be an option in GDALRasterIOExtraArg to change from PIXEL_IS_AREA to PIXEL_IS_POINT sampling.

The whole raster model of GDAL is written for the PIXEL_IS_AREA / TopLeftOrigin convention.For example, when reading a GeoTIFF file encoded in PixelIsPoint (Pixel Center) we shift it by a half-pixel to go to the PixelIsArea one. If the user works in term of PixelsPoint/PixelCenter, it is expected that they have to apply this shift themselves, like you figured out. Adding a new field to GDALRasterIOExtraArg would have implications, because drivers can potentially handle themselves the resampling (e.g the VRT driver also manipulates the floating point window, so it might be impacted).

rouault commented 6 months ago

closing that one as I don't think there's any action required on it