Open-EO / openeo-geopyspark-driver

OpenEO driver for GeoPySpark (Geotrellis)
Apache License 2.0

`aggregate_spatial` download not following requested save_result format (e.g. GeoJSON)? #620

Closed soxofaan closed 3 weeks ago

soxofaan commented 6 months ago

A user wants to download the result of aggregate_spatial in GeoJSON format (or some other format) that allows inspecting the geometries.

example:

setup

import openeo
con = openeo.connect("openeo-dev.vito.be").authenticate_oidc()

cube = con.load_collection(
    "SENTINEL2_L2A",
    temporal_extent="2023-09",
    bands=["B02"],
)

geometry = {
    "type": "Polygon",
    "coordinates": [[[5.076, 51.22], [5.076, 51.216], [5.084, 51.217], [5.08, 51.221], [5.076, 51.22]]],
}

Spatiotemporal

vc = cube.aggregate_spatial(geometry, reducer="mean")
vc.download("tmp.geojson", format="GeoJSON")

does not result in GeoJSON, but in our ad-hoc JSON format:

{"2023-09-02T00:00:00Z":[[2821.200693756194,2609.406342913776,2399.656095143707,2181.0545094152626,2508.486124876115,3052.7879088206146,3255.195242814668,3276.349851337958,3324.8186323092173,7501.266105054509,1988.806739345887,1690.5792864222003,156.0,8.189791873141724,null,71.0812685827552,255.0,1.0,165.0,44.0,272.9603567888999,5.960356788899901,1.0]],"2023-09-04T00:00:00Z":[[421.3726461...
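For anyone who has to consume this ad-hoc format in the meantime, here is a minimal sketch of flattening it into rows. It assumes the structure is a mapping from date to a list with one entry per geometry, each entry being a list of aggregated values; that reading is inferred from the output above and is not a documented contract. The helper name `flatten_timeseries` is hypothetical, and the sample is a shortened copy of the first date shown above.

```python
import json


def flatten_timeseries(raw: str) -> list:
    """Flatten {date: [[values...] per geometry]} into a list of row dicts.

    Assumption: the outer list has one entry per geometry and the inner
    list holds the aggregated values for that geometry.
    """
    data = json.loads(raw)
    rows = []
    for date in sorted(data):
        for geom_index, values in enumerate(data[date]):
            rows.append(
                {"date": date, "geometry_index": geom_index, "values": values}
            )
    return rows


# Shortened sample in the shape shown above (only the first two values kept).
sample = '{"2023-09-02T00:00:00Z": [[2821.200693756194, 2609.406342913776]]}'
rows = flatten_timeseries(sample)
print(rows[0]["date"], rows[0]["geometry_index"], rows[0]["values"])
```

JSON `null` entries, as seen in the real output, come through as `None` and need to be handled by the caller.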

It seems the user-provided format is completely ignored; even downloading with an invalid format works:

vc.download("result-spatiotemporal.geojson", format="foobar")
# Same download format (JSON)

Spatial

First eliminating the temporal dimension with temporal mean gives other behavior:

vc = cube.reduce_temporal("mean").aggregate_spatial(geometry, reducer="mean")
vc.download("result-spatiotemporal.geojson", format="GeoJSON")

fails with

OpenEoApiError: [501] FeatureUnsupported: Unsupported output format geojson; supported are: JSON and CSV (ref: r-2312154de4f744cc9c0f0c6274062341)

Downloading with format JSON works:

vc = cube.reduce_temporal("mean").aggregate_spatial(geometry, reducer="mean")
vc.download("result-spatiotemporal.json", format="JSON")

works and just gives a table in JSON:

[[2418.45242814668]]
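Since the JSON table carries no geometries, a client-side workaround could be to join it back onto the input geometries. The sketch below assumes that the table has one row per input geometry, in input order, and one column per band; that ordering is an assumption, not something the back-end guarantees here. The helper `to_feature_collection` is hypothetical and not part of the openeo client.

```python
import json


def to_feature_collection(geometries, table, band_names):
    """Pair each input geometry with its row of aggregated values.

    Assumption: table[i] belongs to geometries[i], and each row has one
    value per entry in band_names.
    """
    if len(geometries) != len(table):
        raise ValueError("expected one table row per geometry")
    features = []
    for geom, row in zip(geometries, table):
        features.append(
            {
                "type": "Feature",
                "geometry": geom,
                "properties": dict(zip(band_names, row)),
            }
        )
    return {"type": "FeatureCollection", "features": features}


geometry = {
    "type": "Polygon",
    "coordinates": [[[5.076, 51.22], [5.076, 51.216], [5.084, 51.217], [5.08, 51.221], [5.076, 51.22]]],
}
table = [[2418.45242814668]]  # the JSON result shown above
fc = to_feature_collection([geometry], table, ["B02"])
print(json.dumps(fc["features"][0]["properties"]))
```

The resulting dict can be written out with `json.dump` and opened in any GeoJSON viewer.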

The other suggested alternative CSV failed:

vc = cube.reduce_temporal("mean").aggregate_spatial(geometry, reducer="mean")
vc.download("result-spatiotemporal.CSV", format="CSV")

with

OpenEoApiError: [500] Internal: Server error: TypeError("send_file() got an unexpected keyword argument 'mimetypes'") (ref: r-2312151309444b60b81545adb99fcc7b)

soxofaan commented 6 months ago

related to

jdries commented 6 months ago

Is CoverageJSON perhaps the solution?

jdries commented 5 months ago

We now support GeoParquet; it would be good to check whether that solves this issue.

soxofaan commented 3 months ago

possibly related/duplicate: #201

JeroenVerstraelen commented 3 months ago

GeoJSON is implemented. Throw error for netCDF. Look into JSON output because that still uses the old structure.

VincentVerelst commented 2 months ago

This is currently a blocking issue for https://github.com/Open-EO/openeo-gfmap/issues/87

jdries commented 2 months ago

@VincentVerelst I just did a couple of experiments, also requesting csv and parquet, and I always just get the expected result. Can you give a code sample that doesn't work?

VincentVerelst commented 2 months ago

@jdries , for example j-240425ffc0c34991836f4dd8d8fb520a:

import openeo

# collection, bands, temporal_extent and geom are defined elsewhere (not shown here)
c = openeo.connect('openeo.vito.be').authenticate_oidc()

s2 = c.load_collection(
    collection_id=collection,
    bands=bands,
    temporal_extent=temporal_extent
).reduce_dimension(dimension='t', reducer='mean')

features = s2.aggregate_spatial(geometries=geom, reducer='mean')

job = features.execute_batch(
    title='Point feature extraction',
    out_format='GeoParquet'
)

gives _FeatureUnsupportedException(statuscode=501, code='FeatureUnsupported', message='Unsupported output format GeoParquet; supported are: JSON, CSV and GeoParquet', id='no-request')

geom is a FeatureCollection of Points.

jdries commented 2 months ago

@VincentVerelst can you try specifying it as 'Parquet'? That is the official name: https://openeo.vito.be/openeo/1.0/file_formats

I'll see where this error message comes from...

EDIT: pushed a fix for that error message.

Official overview of supported formats with details: https://documentation.dataspace.copernicus.eu/APIs/openEO/File_formats.html

VincentVerelst commented 2 months ago

Confirmed out_format='Parquet' works!
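To avoid guessing format names like this, one option is to check the requested name against the back-end's `/file_formats` listing before submitting a job. The sketch below is illustrative only: `resolve_output_format` is a hypothetical helper, and `sample_formats` is a trimmed stand-in for the real response, which has `input` and `output` mappings keyed by format name.

```python
def resolve_output_format(file_formats: dict, requested: str) -> str:
    """Return the official output format name matching `requested`.

    Matching is case-insensitive; unknown names raise ValueError.
    """
    outputs = file_formats.get("output", {})
    by_lower = {name.lower(): name for name in outputs}
    try:
        return by_lower[requested.lower()]
    except KeyError:
        raise ValueError(
            f"Unsupported output format {requested!r}; "
            f"supported are: {sorted(outputs)}"
        ) from None


# Hypothetical, trimmed stand-in for a /file_formats response.
sample_formats = {"input": {}, "output": {"JSON": {}, "CSV": {}, "Parquet": {}}}
print(resolve_output_format(sample_formats, "parquet"))
```

With a live connection, the listing should be available from the openeo Python client via `list_file_formats()` on the connection object; treat the exact return structure as an assumption and check it against the back-end you use.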