geosolutions-it / jai-ext

Java Advanced Imaging Open Source Replacement Wannabe
Apache License 2.0

MosaicOp excessive memory usage when mosaicking many little images #285

Closed aaime closed 1 year ago

aaime commented 2 years ago

While looking into a high memory pressure issue, I've found that the way the mosaic operation works causes excessive memory allocation.

Imagine a case where there are many small input images (hundreds) forming a seamless mosaic. The request in question asks to mosaic all the little input images, which have overviews, so only a handful of pixels are read from each input image.

However, when starting the calculation, mosaic sets up a Raster[] of inputs, each one grabbed from the source image using getData(), passing the entire output area as the raster area. So instead of being a 2x2 image, each raster is a 1024x768 image. With a few hundred of those, the memory allocated for these rasters quickly becomes excessive. Mosaic works like this because it keeps all rasters at the same location in raster space, while reading only the pixels actually needed would require keeping an offset for each raster.
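To put rough numbers on this, here is a back-of-the-envelope sketch comparing the two approaches. The 1024x768 output and the 2x2 inputs come from the description above; the 300 sources, 3 bands and 1 byte per sample are assumptions picked for illustration, and the class and method names are hypothetical, not part of jai-ext.

```java
import java.awt.Rectangle;

public class MosaicMemoryEstimate {

    /** Bytes needed to hold the pixels of one raster covering the given area. */
    public static long rasterBytes(Rectangle area, int bands, int bytesPerSample) {
        return (long) area.width * area.height * bands * bytesPerSample;
    }

    /** Current behaviour: every source gets a raster as large as the output area. */
    public static long fullAreaBytes(Rectangle dest, int sources, int bands, int bytesPerSample) {
        return sources * rasterBytes(dest, bands, bytesPerSample);
    }

    /** Alternative: each source only holds its intersection with the output area. */
    public static long intersectionBytes(
            Rectangle dest, Rectangle[] sourceBounds, int bands, int bytesPerSample) {
        long total = 0;
        for (Rectangle bounds : sourceBounds) {
            Rectangle needed = dest.intersection(bounds);
            if (!needed.isEmpty()) {
                total += rasterBytes(needed, bands, bytesPerSample);
            }
        }
        return total;
    }

    /** Lays out count square tiles of the given size in a seamless grid (illustrative only). */
    public static Rectangle[] gridOfTiles(int count, int size, int columns) {
        Rectangle[] tiles = new Rectangle[count];
        for (int i = 0; i < count; i++) {
            tiles[i] = new Rectangle((i % columns) * size, (i / columns) * size, size, size);
        }
        return tiles;
    }

    public static void main(String[] args) {
        Rectangle dest = new Rectangle(0, 0, 1024, 768);
        Rectangle[] tiles = gridOfTiles(300, 2, 100); // assumed: 300 inputs, 2x2 pixels each

        System.out.println(fullAreaBytes(dest, tiles.length, 3, 1)); // 707788800 (675 MiB)
        System.out.println(intersectionBytes(dest, tiles, 3, 1));    // 3600
    }
}
```

Under these assumptions the full-area rasters take about 675 MiB, while the pixels actually needed fit in a few kilobytes.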

This issue is also related to https://github.com/geosolutions-it/jai-ext/issues/151, where we noticed that in case of overlapping inputs, some are read unnecessarily.

When we address them, we should probably do them together, and create an intermediary object wrapping the source, one that knows how to do offsets and also delays reading the data until the first time it's actually needed (and maybe also leverages the tile cache, instead of keeping all the rasters in memory through hard references?).
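A minimal sketch of what such an intermediary object could look like, using only java.awt.image types. The class name LazySourceRaster and its methods are hypothetical, and this is not the actual fix. It reads only the intersection between the source and the requested area, does so on first access, and holds the result through a SoftReference, which stands in here for the tile cache idea. The returned Raster keeps the location of the rectangle it was read with, so callers can keep addressing pixels in output coordinates without a separate offset.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.awt.image.Raster;
import java.awt.image.RenderedImage;
import java.lang.ref.SoftReference;

/** Hypothetical wrapper: lazy, intersection-only access to one mosaic source. */
public class LazySourceRaster {
    private final RenderedImage source;
    private final Rectangle needed; // part of the output area this source covers
    private SoftReference<Raster> cache = new SoftReference<>(null);
    private int reads = 0;

    public LazySourceRaster(RenderedImage source, Rectangle destArea) {
        this.source = source;
        Rectangle bounds = new Rectangle(
                source.getMinX(), source.getMinY(), source.getWidth(), source.getHeight());
        this.needed = destArea.intersection(bounds);
    }

    /** True if this source can contribute a value to the given output pixel. */
    public boolean covers(int x, int y) {
        return needed.contains(x, y);
    }

    public Rectangle getNeededArea() {
        return new Rectangle(needed);
    }

    /** Number of times pixel data was actually read from the source. */
    public synchronized int getReadCount() {
        return reads;
    }

    /** Reads one sample, addressed in output (raster space) coordinates. */
    public synchronized int getSample(int x, int y, int band) {
        if (!covers(x, y)) {
            throw new IllegalArgumentException("Pixel outside source: " + x + "," + y);
        }
        Raster raster = cache.get();
        if (raster == null) {
            // First use, or the soft reference was cleared: read only what is needed
            raster = source.getData(needed);
            cache = new SoftReference<>(raster);
            reads++;
        }
        return raster.getSample(x, y, band);
    }

    public static void main(String[] args) {
        BufferedImage tiny = new BufferedImage(2, 2, BufferedImage.TYPE_BYTE_GRAY);
        tiny.getRaster().setSample(1, 1, 0, 42);

        LazySourceRaster lazy = new LazySourceRaster(tiny, new Rectangle(-10, -10, 1024, 768));
        System.out.println(lazy.getNeededArea()); // 2x2 area at the origin, not 1024x768
        System.out.println(lazy.getReadCount());  // 0, nothing read yet
        System.out.println(lazy.getSample(1, 1, 0)); // 42
    }
}
```

With this shape, an input that ends up fully covered by other inputs would never be read at all, which is the part that ties into #151.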

aaime commented 1 year ago

Memory issue fixed, #151 is still valid, although a bit easier to solve now with the work done to fix the memory usage.