Open-EO / openeo-geotrellis-extensions

Java/Scala extensions for Geotrellis, for use with OpenEO GeoPySpark backend.
Apache License 2.0

sparse sampling out of memory on datacube construction #247

Closed by jdries 8 months ago

jdries commented 8 months ago

To write a test for this, we need to mock a call that returns a lot of products: https://catalogue.dataspace.copernicus.eu/resto/api/collections/Sentinel2/search.json?box=-1.3258149867294418%2C52.34731665664588%2C1.6910298219426412%2C54.32658696166628&sortParam=startDate&sortOrder=ascending&page=1&maxRecords=2000&status=ONLINE&dataset=ESA-DATASET&productType=L2A&cloudCover=[0%2C75]&startDate=2022-01-01T00%3A00%3A00Z&completionDate=2022-12-31T00%3A00%3A00Z

Then we need to perform sparse sampling, and see if we can construct a datacube within a reasonable time.

In the actual job, the RDD used 12GB, so that would need to be reduced.
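A minimal sketch of that test setup, written in Python for illustration only. The real test would live in this repository's Scala/Java test suite and use its own classes. Everything here is hypothetical: the function names, the simplified GeoJSON-like response shape (a `features` list with a `bbox` per product), and the 0.1 degree footprint size are assumptions, not the catalogue's actual schema. It builds a mocked search response with 2000 products inside the bounding box from the URL above, then keeps only the products that contain at least one sample point, which is the reduction sparse sampling should allow before the datacube is built.

```python
import json
import random

# Bounding box taken from the catalogue query above: west, south, east, north.
BBOX = (-1.3258149867294418, 52.34731665664588,
        1.6910298219426412, 54.32658696166628)


def make_mock_search_response(n_products, bbox, seed=0):
    """Build a fake, GeoJSON-like search response with n_products footprints.

    Assumption: each product is described by a small axis-aligned bbox;
    the real catalogue response is richer than this.
    """
    rng = random.Random(seed)  # fixed seed keeps the mock reproducible
    west, south, east, north = bbox
    size = 0.1  # hypothetical footprint size in degrees
    features = []
    for i in range(n_products):
        x = rng.uniform(west, east - size)
        y = rng.uniform(south, north - size)
        features.append({
            "type": "Feature",
            "id": f"mock-product-{i}",
            "bbox": [x, y, x + size, y + size],
        })
    return json.dumps({"type": "FeatureCollection", "features": features})


def products_for_samples(response_text, sample_points):
    """Return ids of products whose bbox contains at least one sample point."""
    features = json.loads(response_text)["features"]
    kept = []
    for feature in features:
        w, s, e, n = feature["bbox"]
        if any(w <= px <= e and s <= py <= n for px, py in sample_points):
            kept.append(feature["id"])
    return kept


# Two sparse sample points (lon, lat) inside the bounding box.
SAMPLE_POINTS = [(0.0, 53.0), (1.0, 54.0)]

mock_response = make_mock_search_response(2000, BBOX)
needed = products_for_samples(mock_response, SAMPLE_POINTS)
```

Filtering on the product list before any tiles are loaded is one possible way to keep the memory footprint proportional to the number of sample points rather than to the number of products returned; whether the fix for this issue works that way is not stated in the thread.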

jdries commented 8 months ago

Fixed on staging! Memory use was reduced considerably for the sparse sampling case; dense sampling might still require a bit more.