Open-EO / openeo-geopyspark-driver

OpenEO driver for GeoPySpark (Geotrellis)
Apache License 2.0

investigate high memory usage for agera5 at sentinel-2 resolution #413

Open jdries opened 1 year ago

jdries commented 1 year ago

Job id: j-f1b1efdb2c6e4fc680f1ddedde0b5f91. The user had to set executor memory very high.

Jobs were crashing when writing the actual NetCDFs. filter_spatial was used, so all the data for a single NetCDF ends up on one executor. The combined size of the NetCDFs was 25625636648 bytes, i.e. 25625636648/(1024*1024) ≈ 24439 MB (~24 GB).
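For intuition, a back-of-the-envelope sketch of the uncompressed footprint. This is a rough order-of-magnitude estimate, not driver-internal accounting: it assumes float32 cells, the 256-pixel tile size from DataCubeParameters, and that the 46 keys in the sparsity log line below are spatial keys needed for every day of the year.

```python
# Rough uncompressed footprint of the requested cube, assuming float32 cells.
# Tile size and key counts are read from the log lines below.
TILE = 256   # tile size from DataCubeParameters
BANDS = 8    # agera5 bands requested
DAYS = 366   # 2020 is a leap year, ByDay temporal layout
KEYS = 46    # spatial keys actually required (sparse cube)

tile_bytes = TILE * TILE * BANDS * 4   # one multiband tile
cube_bytes = tile_bytes * KEYS * DAYS  # all required keys, all days
print(f"per tile: {tile_bytes / 1024**2:.0f} MB")    # 2 MB
print(f"whole cube: {cube_bytes / 1024**3:.1f} GB")  # 32.9 GB
```

That is in the same ballpark as the observed NetCDF totals above, so holding samples uncompressed on an executor easily explains the memory pressure.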

relevant logging:

Creating layer for AGERA5 with load params {'temporal_extent': ('2020-01-01', '2020-12-31'), 'spatial_extent': {'west': 3.016675851392828, 'south': 44.25828932897241, 'east': 4.37861196995034, 'north': 45.15020643463023, 'crs': 'EPSG:4326'}, 'global_extent': {'west': 3.016675851392828, 'south': 44.25828932897241, 'east': 4.37861196995034, 'north': 45.15020643463023, 'crs': 'EPSG:4326'}, 'bands': ['dewpoint-temperature', 'precipitation-flux', 'solar-radiation-flux', 'temperature-max', 'temperature-mean', 'temperature-min', 'vapour-pressure', 'wind-speed'], 'properties': {}, 'aggregate_spatial_geometries': <shapely.geometry.multipolygon.MultiPolygon object at 0x7febd6690520>, 'sar_backscatter': None, 'process_types': {<ProcessType.FOCAL_SPACE: 6>}, 'custom_mask': {}, 'data_mask': None, 'target_crs': {'$schema': 'https://proj.org/schemas/v0.2/projjson.schema.json', 'type': 'GeodeticCRS', 'name': 'AUTO 42001 (Universal Transverse Mercator)', 'datum': {'type': 'GeodeticReferenceFrame', 'name': 'World Geodetic System 1984', 'ellipsoid': {'name': 'WGS 84', 'semi_major_axis': 6378137, 'inverse_flattening': 298.257223563}}, 'coordinate_system': {'subtype': 'ellipsoidal', 'axis': [{'name': 'Geodetic latitude', 'abbreviation': 'Lat', 'direction': 'north', 'unit': 'degree'}, {'name': 'Geodetic longitude', 'abbreviation': 'Lon', 'direction': 'east', 'unit': 'degree'}]}, 'area': 'World', 'bbox': {'south_latitude': -90, 'west_longitude': -180, 'north_latitude': 90, 'east_longitude': 180}, 'id': {'authority': 'OGC', 'version': '1.3', 'code': 'Auto42001'}}, 'target_resolution': [10, 10], 'resample_method': 'cubic', 'pixel_buffer': None}

Loading with params DataCubeParameters(256, {}, FloatingLayoutScheme, ByDay, 6, None, CubicConvolution, 0.0, 0.0) and bands dewpoint-temperature;precipitation-flux;solar-radiation-flux;temperature-max;temperature-mean;temperature-min;vapour-pressure;wind-speed initial layout: LayoutDefinition(Extent(501310.0, 4898170.0, 613950.0, 5000570.0),CellSize(10.0,10.0),22x20 tiles,11264x10240 pixels)
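As a quick sanity check, the pixel grid in the log line above follows directly from the extent and the 10 m target resolution:

```python
# Sanity check on the LayoutDefinition above: the logged extent at the
# 10 m target resolution reproduces the logged pixel grid exactly.
west, south, east, north = 501310.0, 4898170.0, 613950.0, 5000570.0
cols = (east - west) / 10.0    # -> 11264.0
rows = (north - south) / 10.0  # -> 10240.0
print(f"{cols:.0f}x{rows:.0f} pixels")  # 11264x10240, as logged
```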

Cube partitioner index: SparseSpaceTimePartitioner 1656 true

Datacube is sparse: true, requiring 46 keys out of 420.
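For illustration, a minimal PySpark sketch of the mechanism described above: grouping tiles by geometry id (which is effectively what filter_spatial-driven sample writing does) pulls every tile of one sample onto a single executor before the NetCDF is written. The keys and payloads here are toys, not the actual driver internals.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[2]").getOrCreate()
sc = spark.sparkContext

# toy tiles keyed by (geometry_id, day); payloads stand in for raster tiles
tiles = sc.parallelize(
    [((g, d), b"tile-bytes") for g in range(4) for d in range(366)]
)

# re-keying by geometry_id and grouping moves every tile that belongs to one
# sample (one output NetCDF) onto a single executor -- the memory hotspot
per_sample = tiles.map(lambda kv: (kv[0][0], kv[1])).groupByKey()
print(sorted(per_sample.mapValues(lambda it: sum(1 for _ in it)).collect()))
# [(0, 366), (1, 366), (2, 366), (3, 366)]
```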

jdries commented 1 year ago

Not entirely sure yet whether this is the only problem, but it would be good to reduce the memory usage of writing NetCDF samples. Perhaps we can consider creating compressed tiles and decompressing them only right before they go into the NetCDF, allowing us to write in a more streaming manner. Another option would be a file format like Zarr.
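A minimal sketch of the compressed-tiles idea, assuming tiles arrive on the executor as zlib-compressed float32 blobs keyed by (time index, band index, tile row, tile col); write_sample_netcdf and the tuple layout are hypothetical, not existing driver API:

```python
import zlib
import numpy as np
import netCDF4

def write_sample_netcdf(path, tiles, shape, tile_size=256):
    """Write one sample tile by tile, so only a single decompressed tile
    is in memory at a time instead of the whole materialised array."""
    t_len, b_len, y_len, x_len = shape
    ds = netCDF4.Dataset(path, "w")
    ds.createDimension("t", t_len)
    ds.createDimension("band", b_len)
    ds.createDimension("y", y_len)
    ds.createDimension("x", x_len)
    var = ds.createVariable("data", "f4", ("t", "band", "y", "x"), zlib=True)
    for t, b, row, col, blob in tiles:
        # decompress just-in-time, write, and let the tile be garbage collected
        tile = np.frombuffer(zlib.decompress(blob), dtype="f4")
        tile = tile.reshape(tile_size, tile_size)
        var[t, b,
            row * tile_size:(row + 1) * tile_size,
            col * tile_size:(col + 1) * tile_size] = tile
    ds.close()
```

Zarr would give similar streaming behaviour essentially for free, since its chunked layout lets each tile be written independently without materialising the whole sample.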