Open-EO / openeo-python-client

Python client API for OpenEO

https://open-eo.github.io/openeo-python-client/

Apache License 2.0

149 stars 38 forks source link

Refactor graph building and management for (rest) client #117

Closed soxofaan closed 4 years ago

soxofaan commented 4 years ago

There have been various issues/bugs related to how the process graph is built/managed in the python client. For example, but (probably) not only: #107 #102 #92

@jdries and I already discussed an approach to improve the situation. Quoting #102 :

as already discussed with @jdries : a proper, general fix for this (and possibly related issues) requires a bit of rethinking of how process graphs are built and serialized in the client. For example: instead of "flattening" the process graph already at build time, it would make things probably less error prone to work with straightforward graph-like data structures during build time and only do the openEO-specific serialization and flattening just before doing the openEO API requests.

This ticket is to track/label the refactor required to solve these issues

soxofaan commented 4 years ago

internal VITO ticket: EP-3251

bgoesswe commented 4 years ago

So, if I understood it correctly, it means something like: The "DataCube" (Graph) object contains "Process" (Node) objects and these contain the "from_node" property (Edge) and just at the moment, the user wants to send it to the backend it generates the process graph. All of this in the background so without changing how the user works atm? I think I like that approach, also it would be easier to debug the graph for users without them having to see the actual JSON process graph because they could use their Python debugger to look into the Process objects or just iterate over them.

soxofaan commented 4 years ago

There are two kinds of progress graphs in play now:

"flattened" style, which follows the JSON representation of the openEO API: a non-nested mapping, containing processes that reference each other through mapping keys
nested/graph style: process nodes just point directly to each other

The 0.4-style implementation only used the flattened style and did the flattening on the fly while building the whole process graph. This on-the-fly flattening causes various challenges and makes things quite hard to get right (duplication, reuse, ...).

The DataCube implementatin uses the nested style internally, which is a lot easier to work with, reason about, debug, etc. Only at the point of sending the process graph to the backend, this structure is flattened according to the openEO API.

soxofaan commented 4 years ago

126 was merged in master

lforesta commented 4 years ago

Does this fix issue #92 ? I still see multiple arrayelement nodes when doing band math like in the phenology example. I have pulled changes from master a few days ago.

soxofaan commented 4 years ago

hi @lforesta , it fixes #92 against a API v1.0 backend.

When using a backend below API 1.0, the client uses the original ImageCollectionClient class, which is largely untouched (to not break too much). With a backend using API 1.0 or higher, the client uses a new class DataCube is used instead, which has a much different approach in building the process graphs, including a fix for #92

lforesta commented 4 years ago

@soxofaan thanks for the clarification!