Open-EO / openeo-geotrellis-extensions

Java/Scala extensions for Geotrellis, for use with OpenEO GeoPySpark backend.
Apache License 2.0

CDSE: set smaller tile size for processing #311

Closed jdries closed 2 weeks ago

jdries commented 1 month ago

We notice that there are some benefits to processing with a smaller tile size. We previously tried to increase it, but that was because data loading was faster for large tiles. For the rest of the processing, smaller tiles result in fewer memory issues, so perhaps we can now use a 64px tile size as the default where possible, or else derive it from the apply_neighborhood parameters.
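A rough sketch of why tile size matters for memory, assuming uncompressed square tiles with 4-byte (float32) samples. The helper name `tile_bytes` and the sample width are illustrative assumptions, not taken from the backend:

```python
def tile_bytes(tile_size: int, bands: int = 1, bytes_per_sample: int = 4) -> int:
    """Uncompressed in-memory size of one square tile, in bytes.

    Assumes tile_size x tile_size pixels per band and a fixed sample width
    (4 bytes corresponds to float32); both are illustrative assumptions.
    """
    return tile_size * tile_size * bands * bytes_per_sample


# Memory per tile grows with the square of the tile edge length.
for size in (64, 128, 256):
    print(f"{size}px tile: {tile_bytes(size) // 1024} KiB per band")
```

Under these assumptions, halving the tile edge cuts the per-tile footprint by a factor of four, so the amount of pixel data that has to be held at once per tile shrinks accordingly.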

jdries commented 1 month ago

We can do something rather general in layercatalog.py:

elif get_backend_config().default_reading_strategy == "load_per_product":
    datacubeParams.setLoadPerProduct(True)
    if "tilesize" not in feature_flags:
        # With load_per_product, tile size does not affect read performance,
        # and smaller chunks are better for memory usage.
        # "tileSize_$eq" is the JVM name of the Scala setter for tileSize.
        getattr(datacubeParams, "tileSize_$eq")(128)

Not committing this now, as it requires some follow-up. It may be even better to make this default chunk size a parameter in the backend config, or to expose it as a job option rather than a custom feature flag.
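One way the lookup order could work is sketched below: a job option wins over the feature flag, which wins over a backend-config default. This is only a sketch under stated assumptions. The function `resolve_tile_size`, the constant `DEFAULT_TILE_SIZE`, and the use of a `"tilesize"` key in job options are hypothetical; only the `"tilesize"` feature flag and the value 128 come from the snippet above.

```python
from typing import Any, Mapping

# Hypothetical backend-config default; 128 mirrors the value in the snippet above.
DEFAULT_TILE_SIZE = 128


def resolve_tile_size(
    job_options: Mapping[str, Any],
    feature_flags: Mapping[str, Any],
    backend_default: int = DEFAULT_TILE_SIZE,
) -> int:
    """Pick a tile size: job option first, then feature flag, then default.

    The "tilesize" key in job_options is an assumption made for illustration.
    """
    for source in (job_options, feature_flags):
        value = source.get("tilesize")
        if value is not None:
            size = int(value)
            if size <= 0:
                raise ValueError(f"tilesize must be positive, got {size}")
            return size
    return backend_default


# Usage: no override, feature-flag override, and job option taking precedence.
print(resolve_tile_size({}, {}))
print(resolve_tile_size({}, {"tilesize": 256}))
print(resolve_tile_size({"tilesize": 64}, {"tilesize": 256}))
```

Keeping the default as a plain parameter means a deployment such as dev/staging could supply its own value without code changes.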

jdries commented 4 weeks ago

Configured a new default tile size of 128 on dev/staging.