locationtech / geotrellis

GeoTrellis is a geographic data processing engine for high performance applications.
http://geotrellis.io
Other
1.32k stars 360 forks source link

RDD reproject does not correctly apply resampleMethod #3541

Open jdries opened 1 month ago

jdries commented 1 month ago

Describe the bug

I can consistently reproduce a case where reprojecting an RDD to a lower resolution does not compute correct pixel values. The output selects one of the pixels, rather than respecting the sample method.

It happens for any method that uses the 'cs' argument here: https://github.com/locationtech/geotrellis/blob/master/raster/src/main/scala/geotrellis/raster/resample/Resample.scala#L102 So Sum, Average, min, Mode, ... are all impacted.

I believe the bug is in this line: https://github.com/locationtech/geotrellis/blob/master/raster/src/main/scala/geotrellis/raster/reproject/RasterRegionReproject.scala#L268

Because cellsize is supposed to be the target cellsize, while on that line, the source cellsize seems to be passed on. Hence, any subsequent computation in the resamplemethod just select one pixel where they are supposed to work on multiple pixels.

To Reproduce

Create an rdd, and use the 'reproject' method with any of the above mentioned resample methods. Make sure to resample to a lower resolution (so larger cellsize), so that multiple pixels are normally selected.

Expected behavior

resample methods should behave as advertised.

Environment

latest geotrellis version

pomadchin commented 1 month ago

Thank you! It makes sense, do you want to make a PR to address it?

I’m 🌴 atm, so won’t be able to assist quickly, but we can include it into the upcoming release.