Open-EO / openeo-api

The openEO API specification
http://api.openeo.org
Apache License 2.0

Case insensitive handling of runtime names #409

Open soxofaan opened 3 years ago

soxofaan commented 3 years ago

https://github.com/Open-EO/openeo-api/blob/f303d65a3291d4cd74dacc0e796803bb5d6fa03b/openapi.yaml#L1627-L1635

As in #371, I think we should enforce case-insensitive handling of UDF runtime names.

m-mohr commented 3 years ago

Unfortunately, I don't think we can enforce this in openEO API v1. It seems we did not consider case-insensitivity for UDF runtimes and now we can't easily add it without breaking things. It was different for the other entities (e.g. file formats) where we only improved the wording. This would also need a change in openeo-processes (e.g. run_udf).

soxofaan commented 3 years ago

For v1 we could already add a recommendation for case-insensitive handling, which is not a breaking change and still allows alignment before v2.

m-mohr commented 3 years ago

Hmm, I think I don't really agree with that due to the reasons below:

For billing plans and web services, the API allows case-insensitive (CI) handling and also defines the places where the names can be used in a CI manner.

For file formats (and, as requested here, UDF runtimes), we can just give a general direction in the API and say that CI is allowed (or recommended). The API itself doesn't use these names in a CI manner. From a user's point of view, this is only really implemented in the processes (save_result allows CI; run_udf doesn't mention CI right now, i.e. it is case-sensitive), but it would read oddly (and break things) if the user-facing documentation said "UDF runtime name is allowed to be given in CI manner". So the only places where we could recommend this are the API and the implementation notes for processes. Users would not be informed about it, so I'm not sure how useful this recommendation would be, and it could break interoperability (in v1) a bit by having process graphs that run on one back-end and not on another one.

soxofaan commented 3 years ago

I was mainly thinking about a CI recommendation or requirement for back-end implementers. I don't think it's very useful for a regular end user, indeed.

"it could break interoperability (in v1) a bit by having process graphs that run on one back-end and not on another one."

We already have an interoperability problem: if one back-end calls its Python runtime "Python" and another back-end uses "python", you cannot use the same process graph on both. Recommending CI handling would actually improve interoperability in this case. The CI recommendation would just be a reasonable safety net to cover subtle differences between back-ends (and documentation, cookbooks, ...).
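To illustrate the safety net, here is a minimal sketch of how a back-end could resolve a requested runtime name. It is not taken from any existing implementation: the function name `resolve_runtime` and the example runtime list are hypothetical, and preferring an exact match before falling back to a case-insensitive one is just one possible design.

```python
from typing import Iterable


def resolve_runtime(requested: str, available: Iterable[str]) -> str:
    """Return the advertised runtime name that matches `requested`.

    An exact (case-sensitive) match wins. Otherwise fall back to a
    case-insensitive match, but only if it is unambiguous.
    """
    names = list(available)
    if requested in names:
        return requested
    # casefold() is Python's Unicode-aware way to compare ignoring case.
    matches = [n for n in names if n.casefold() == requested.casefold()]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise KeyError(f"Unknown UDF runtime: {requested!r}")
    raise KeyError(f"Ambiguous UDF runtime {requested!r}: matches {matches}")


# Hypothetical runtime names, as a back-end might advertise them.
runtimes = ["Python", "R"]
print(resolve_runtime("python", runtimes))  # Python
print(resolve_runtime("R", runtimes))       # R
```

With this fallback, a process graph written with "python" would still run on a back-end that advertises "Python", while names that differ only by case on the same back-end are rejected as ambiguous instead of being picked arbitrarily.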

Another solution is adding a guideline for the runtime names ("lower case only" or "must be Camel Case", ...) or explicitly listing some examples for popular runtimes (Python, R, ...).

m-mohr commented 3 years ago

Indeed. We should probably aim for a recommended naming, maybe best proposed by the UDF runtime implementation.

I've expanded the OpenAPI with a general recommendation for now:

Each runtime environment has a unique name, which is used as the property key. The name is used in processes to select the runtime environment for UDFs, so the names should be stable and meaningful. It is RECOMMENDED to use the following naming and casing:

  • For programming language environments use the names as provided in the Scriptol List of Programming Languages.
  • For docker images use the docker image identifier excluding the registry path.

Which would result in Python and R for the runtimes we support right now.
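For the Docker case, the recommendation does not spell out exactly what "excluding the registry path" covers, so the following is only one possible reading, sketched in Python: drop a leading registry host and keep the rest of the reference (including any tag). The function name `runtime_name_from_image` and the example image names are hypothetical.

```python
def runtime_name_from_image(image: str) -> str:
    """Drop a leading registry host from a Docker image reference.

    Assumption: only the registry host is removed; the repository
    path and any tag are kept as they are.
    """
    first, sep, rest = image.partition("/")
    # Docker treats the first path component as a registry host when it
    # contains a "." or ":" or is exactly "localhost".
    if sep and ("." in first or ":" in first or first == "localhost"):
        return rest
    return image


# Hypothetical image references.
print(runtime_name_from_image("registry.example.com/openeo/udf-python:3.8"))
print(runtime_name_from_image("localhost:5000/udf-r"))
print(runtime_name_from_image("openeo/udf-python"))
```

Whether a tag or an organisation prefix should also be stripped is left open here; a back-end would need to settle that to keep names stable.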

Leaving this open for consideration in API v2.