Open bgoesswe opened 5 years ago
This is an old issue, but it's becoming more important as time passes, since back-ends have more functionality now. It also seems to me that non-custom processes are missing in the client, e.g. linear_scale_range was not there (#96). Does the current client need a one-to-one mapping of the processes defined in openEO?
I think there are 2 separate aspects to this:
Concerning 1.: I'm not a big fan of dynamic generation of functions/methods, as this breaks some features that are important for the end user: normal discovery and documentation of methods (by looking at the source code of ImageCollectionClient, or using the code inspection features of their IDE), straightforward exceptions and backtraces when something goes wrong, a lower barrier to entry to contribute/fix things, ...
An alternative solution for 1. is to still use traditional hardcoded methods, combined with unit tests that compare the openEO API description with the available methods of ImageCollectionClient and fail when something is missing. We use this approach in the Python driver to support dedicated Python exceptions for each openEO error code (as defined in https://open-eo.github.io/openeo-api/errors/).
That being said, there are probably some ways to reduce the necessary boilerplate code and overhead to implement a process as a method in the client.
About 2.: this should be relatively straightforward to implement. However, it should be optional for now, because probably not all backends properly declare which processes they support in the capabilities endpoint (the VITO backend doesn't, for example).
There exists a way to add custom/unsupported processes: https://open-eo.github.io/openeo-python-client/#openeo.rest.imagecollectionclient.ImageCollectionClient.graph_add_process
Perhaps we need to improve documentation so that people find it more easily?
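Under the hood, a helper like that just appends a node to the JSON process graph. A rough, self-contained sketch of the mechanism, assuming a simplified dict-based graph format (this is illustrative, not the actual client code):

```python
import copy

def graph_add_process(graph, process_id, arguments):
    """Append a node for `process_id` to a dict-based process graph,
    wiring its 'data' argument to the previously added node.
    Simplified sketch of the mechanism, not the real client helper."""
    graph = copy.deepcopy(graph)
    node_id = f"{process_id}{len(graph) + 1}"
    args = dict(arguments)
    if graph:
        # reference the most recently added node as input
        last = list(graph)[-1]
        args.setdefault("data", {"from_node": last})
    graph[node_id] = {"process_id": process_id, "arguments": args}
    return graph

graph = graph_add_process({}, "load_collection", {"id": "SENTINEL2"})
graph = graph_add_process(graph, "my_custom_process", {"factor": 42})
```

Because the custom process is just another node, it composes freely with nodes produced by the client's predefined methods.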
About dynamically generated processes, I agree with Stefaan. I would however not object to someone showing how this can be done in the python client (as a separate way of building process graphs, separate from the ImageCollection class).
I'll investigate possible dynamic generation strategies in the "process_generation" branch.
So I have worked on that issue for a while now, and haven't found a suitable working solution for dynamically generating processes in the Python client other than doing it myself. Therefore, in the branch "process_generation", I created a Python tool that generates a Python file for a given backend URL (e.g. see here for EURAC). The "ProcessParser" can be used either in a Python client script or as a command-line call with arguments. How to use the generated processes can be seen in this example. The advantage is that you can use the statically defined processes of the Python client together with the generated processes available from the backend within the same program, so you do not have to choose between the two strategies. A disadvantage is that you are relying on the documentation provided by the backend. I was also thinking about doing it in a more object-oriented way by generating a new class that inherits from "ImageCollectionClient" with additional generated methods, but there I ran (at the moment) into the issue of methods with the same name, and I am not sure if this is a good way to go anyway.
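To illustrate the code-generation approach (not the actual ProcessParser, whose details live in the "process_generation" branch), one can render plain Python wrapper functions from a GET /processes style listing; the process entries and the `add_process` call in the generated code are hypothetical:

```python
# Sketch of generating Python source from a /processes-style listing.
# Illustrative only; process entries and add_process are assumptions.

PROCESSES = [
    {"id": "ndvi", "parameters": [{"name": "data"}, {"name": "red"}, {"name": "nir"}]},
    {"id": "linear_scale_range",
     "parameters": [{"name": "x"}, {"name": "inputMin"}, {"name": "inputMax"}]},
]

def render_module(processes):
    """Render a Python module with one wrapper function per process."""
    lines = []
    for proc in processes:
        params = ", ".join(p["name"] for p in proc["parameters"])
        lines.append(f"def {proc['id']}(graph, {params}):")
        lines.append(f"    return graph.add_process({proc['id']!r}, locals())")
        lines.append("")
    return "\n".join(lines)

source = render_module(PROCESSES)
# The generated text can be written to a .py file and imported alongside
# the statically defined client methods.
```

Writing real source files keeps the advantages of static code (inspectable, documentable, debuggable) while still tracking whatever the backend declares.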
Interesting work, can you create a pull request? That might help with further fine-tuning and discussion.
Some points to consider from today's discussion:
Issues with the static approach:
Issues with the dynamic approach:
In my opinion, the basic functionality (e.g. load_collection, filters) should be static for convenience reasons. Other things should be dynamically generated (e.g. custom processes). The main issue is deciding where to draw the line between what should be static and what not.
Inspired by yesterday's discussion I also played a bit with the following idea: add a property `.dynamic` to ImageCollection objects that delegates all function calls to the corresponding dynamically detected process. The full pull request (WIP) is at #118.

The basic unit test shows how it is intended to work. Assume the backend declares a process `make_larger` that takes a raster cube and a float as parameters:

```json
{
    "id": "make_larger",
    "description": "multiply a raster cube with a factor",
    "parameters": [
        {"name": "data", "schema": {"type": "object", "subtype": "raster-cube"}},
        {"name": "factor", "schema": {"type": "float"}}
    ]
}
```

You can then call it through the `dynamic` property as follows:

```python
cube = session040.load_collection("SENTINEL2")
cube = cube.dynamic.make_larger(factor=42)
```

Some notes:

- the name `.dynamic` is the best I could come up with for now; if someone has a better idea, please let me know
- by using a property `.dynamic` you can clearly separate "static" predefined methods and dynamically detected processes. Obviously, it allows having a predefined convenience function hardcoded in the client and a custom process in a backend with the same name
- this works for processes that take a raster cube as input (to which `self` of the ImageCollection instance will be bound). However, to support all kinds of processes we could also define a comparable `.dynamic` property on the Connection object

@soxofaan

> by using a property `.dynamic` you can clearly separate "static" predefined methods and dynamically detected processes. Obviously, it allows having a predefined convenience function hardcoded in the client and a custom process in a backend with the same name

In general I like this, but I don't think a user cares about whether something is dynamic or hard-coded. At best that should be completely hidden, and the call should simply be `cube.make_larger(factor=42)`.

> the name `.dynamic` is the best I could come up with for now; if someone has a better idea, please let me know

If we need such a "prefix": `custom`? `proprietary`?
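The delegation behind such a `.dynamic` property can be sketched in a few lines of plain Python. The classes below are hypothetical stand-ins (the real ImageCollection builds a process graph rather than recording calls):

```python
class _DynamicDelegator:
    """Resolves attribute access to backend-declared processes."""
    def __init__(self, cube):
        self._cube = cube

    def __getattr__(self, process_id):
        if process_id not in self._cube.backend_processes:
            raise AttributeError(f"backend has no process {process_id!r}")
        def call(**kwargs):
            # bind the cube itself as the raster-cube argument
            return self._cube.apply_process(process_id, data=self._cube, **kwargs)
        return call

class ImageCollection:
    # hypothetical stand-in: records applied processes instead of a graph
    def __init__(self, backend_processes, calls=()):
        self.backend_processes = backend_processes
        self.calls = list(calls)

    def apply_process(self, process_id, **kwargs):
        return ImageCollection(self.backend_processes,
                               self.calls + [(process_id, kwargs)])

    @property
    def dynamic(self):
        return _DynamicDelegator(self)

cube = ImageCollection(backend_processes={"make_larger"})
cube = cube.dynamic.make_larger(factor=42)
```

Because `__getattr__` is only consulted for names the delegator does not define, static methods and dynamically resolved processes can coexist without clashing.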
From the API perspective there are in general two kinds of processes:
If required, you could split predefined into two categories
Since back-ends may support a different number of processes, and these can be retrieved via the GET /processes endpoint, it would be a major improvement to generate the process functions dynamically when a back-end provider is chosen.
e.g.: https://stackoverflow.com/questions/23812760/dynamic-functions-creation-from-json-python
It is at least something I want to look into.