Open-EO / openeo-processes

Interoperable processes for openEO's big Earth observation cloud processing.
https://processes.openeo.org
Apache License 2.0

Generalize `load_geojson` #415

Closed soxofaan closed 1 year ago

soxofaan commented 1 year ago

I considered raising this in #412 or #322 but maybe it's better to keep this a separate thread:

load_geojson (as introduced in #412) is closely tied to GeoJSON (by name and supported format), which has the following problems:

Couple of lines of thought:

m-mohr commented 1 year ago

I'm fine with load_external.

I have some issues with load_inline though, as it is hard to build nice UIs for it. Right now I can easily detect in the Web Editor that a parameter is geojson and render a map for it. If we have load_inline with a file format parameter, I can only guess from the file format name whether something is GeoJSON or not; the same applies to other formats. That makes the implementation much more difficult, especially when the parameter order has you specify the data first and the format second. load_geojson makes this much simpler, at least in the UI.
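To illustrate the difference, here is a minimal Python sketch (not actual Web Editor code). It assumes a parameter schema is a dict carrying `type` and `subtype` keys; the helper names `is_geojson_param` and `guess_geojson_from_format` are hypothetical.

```python
from typing import Any


def is_geojson_param(schema: dict[str, Any]) -> bool:
    """Reliable check: the schema explicitly declares the geojson subtype."""
    return schema.get("type") == "object" and schema.get("subtype") == "geojson"


def guess_geojson_from_format(format_name: str) -> bool:
    """Heuristic a UI would need for a generic load_inline:
    guess from the free-text format name (assumed to vary per back-end)."""
    return "geojson" in format_name.strip().lower()


# With load_geojson, the UI can decide from the schema alone.
explicit_schema = {"type": "object", "subtype": "geojson"}
show_map = is_geojson_param(explicit_schema)

# With load_inline, the data parameter is generic; the UI must wait for
# the format argument and then guess from its name.
generic_schema = {"type": "object"}
show_map_generic = is_geojson_param(generic_schema) or guess_geojson_from_format("GeoJSON")
```

The second path only works once the format value is known, which is why a data-first, format-second parameter order is awkward for a UI.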

In general, we again run into the problem that we don't define file formats in openEO. The "nice" behavior of load_geojson, e.g. the properties parameter, is then not defined centrally but by each back-end, so there is usually no common behavior for importing. Arguably there shouldn't be, because we can't do that for all file formats and it should stay open to other implementations. On the other hand, it feels nice to have for simple formats such as GeoJSON that are used so often and in so many central places.

jdries commented 1 year ago

A reminder perhaps, but another issue with a specific process like load_geojson shows up in UDPs where the data to load is a parameter: the UDP designer would like a user-friendly way to allow loading vector data from different sources without a complex construction inside the UDP. This could potentially be solved by simply having a vector cube as a UDP parameter, but then we rely more on clients being able to construct the vector cube loading logic themselves.

m-mohr commented 1 year ago

@jdries Yes, a vector cube should probably be passed in. geojson as a subtype has been deprecated (except for load_geojson, if accepted).

Other than that, I propose explicit processes if we can inline the data (e.g. load_geojson, load_covjson) and then processes for loading from other sources (e.g. load_uploaded_files for the user workspace and load_http for loading from external HTTP(S) URLs).
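As a rough sketch of how a client could map data sources onto such explicit processes, the Python snippet below builds one process graph node per source kind. The process names come from the proposal above; the argument names (`data`, `paths`, `url`), the `kind` key and the helper `build_load_node` are assumptions for illustration only, not a defined API.

```python
from typing import Any


def build_load_node(source: dict[str, Any]) -> dict[str, Any]:
    """Return a process graph node for the loader matching the source kind."""
    kind = source["kind"]
    if kind == "geojson":
        # Inline data: the GeoJSON object is embedded in the process graph.
        return {"process_id": "load_geojson", "arguments": {"data": source["data"]}}
    if kind == "covjson":
        return {"process_id": "load_covjson", "arguments": {"data": source["data"]}}
    if kind == "uploaded":
        # Files previously uploaded to the user workspace.
        return {"process_id": "load_uploaded_files", "arguments": {"paths": source["paths"]}}
    if kind == "http":
        # External HTTP(S) URL.
        return {"process_id": "load_http", "arguments": {"url": source["url"]}}
    raise ValueError(f"Unsupported source kind: {kind!r}")


point = {"type": "Point", "coordinates": [5.0, 51.2]}
inline_node = build_load_node({"kind": "geojson", "data": point})
remote_node = build_load_node({"kind": "http", "url": "https://example.com/areas.geojson"})
```

In a UDP, either node could feed the same vector cube parameter, so the UDP itself would not need to know where the data came from.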