Open-EO / openeo-api

The openEO API specification
http://api.openeo.org
Apache License 2.0

version for process graphs #480

Closed jdries closed 1 year ago

jdries commented 1 year ago

As part of reproducibility & traceability, users are asking me how to record the version of the process graph itself in the output metadata. We already record the backend version and the process graph itself, but for production projects, our users also attach a version to the workflow itself. I see two locations where this can be relevant:

My initial proposal would be to have a top-level property 'version' next to the others like title and description.
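A minimal sketch of what that proposal could look like in a batch job request body. The `version` key is the hypothetical property proposed here and is not part of the openEO API specification; the collection ID, title, and description are made-up placeholders.

```python
import json

# A tiny process graph: load a collection and save the result.
# "SENTINEL2_L2A" is a placeholder collection ID, not taken from this issue.
process_graph = {
    "load": {
        "process_id": "load_collection",
        "arguments": {
            "id": "SENTINEL2_L2A",
            "spatial_extent": None,
            "temporal_extent": None,
        },
    },
    "save": {
        "process_id": "save_result",
        "arguments": {"data": {"from_node": "load"}, "format": "GTiff"},
        "result": True,
    },
}

# Job metadata with the proposed top-level "version" next to
# "title" and "description". "version" is HYPOTHETICAL: it is the
# property suggested in this issue, not an existing API field.
job_request = {
    "title": "Example workflow",
    "description": "Placeholder description of the workflow.",
    "version": "1.2.0",  # assigned by the author of the process graph
    "process": {"process_graph": process_graph},
}

# Serialize as it would be sent in a request body.
body = json.dumps(job_request, indent=2)
print(body)
```

The value is an opaque, user-chosen string in this sketch; whether it should follow a scheme such as semantic versioning is left open.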

m-mohr commented 1 year ago

Just to be clear, what version do you mean (and what does it include)?

  1. The revision of the process graph itself, with its nodes and parameters (how is it determined? Does each change of an attribute in the Web Editor, for example, count as a change? Or is it only set each time the graph is sent to the server?)
  2. Version of the processes and/or API?
  3. The number of the run of the batch job? (e.g. if you run, update, run?)
  4. Something else?

jdries commented 1 year ago

It's closest to option 1, but basically it would be defined by the user who creates the job. The main problem is that small changes to the parameters in a process graph are not necessarily version changes. For instance, a graph with the same version can be applied to different spatial extents. This interpretation is something that can only be done correctly by the author of the process graph.
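To illustrate why the version cannot simply be derived from the graph content, here is a small sketch under stated assumptions: `build_job`, `graph_digest`, and the `version` key are hypothetical names for illustration only, and the collection ID and extents are placeholders. Two jobs share the same author-assigned version while a content hash of their process graphs differs.

```python
import hashlib
import json

# Hypothetical version label chosen by the workflow author.
WORKFLOW_VERSION = "1.2.0"


def build_job(spatial_extent):
    """Build a job dict for the same workflow on a given spatial extent."""
    process_graph = {
        "load": {
            "process_id": "load_collection",
            "arguments": {
                "id": "SENTINEL2_L2A",  # placeholder collection ID
                "spatial_extent": spatial_extent,
                "temporal_extent": None,
            },
        },
        "save": {
            "process_id": "save_result",
            "arguments": {"data": {"from_node": "load"}, "format": "GTiff"},
            "result": True,
        },
    }
    return {
        "title": "Example workflow",
        "version": WORKFLOW_VERSION,  # HYPOTHETICAL field, not in the spec
        "process": {"process_graph": process_graph},
    }


def graph_digest(job):
    """SHA-256 over a canonical JSON dump of the process graph."""
    canonical = json.dumps(
        job["process"]["process_graph"], sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


job_a = build_job({"west": 5.0, "south": 51.0, "east": 5.1, "north": 51.1})
job_b = build_job({"west": 3.0, "south": 50.0, "east": 3.1, "north": 50.1})

# Same workflow version, as decided by the author ...
print(job_a["version"] == job_b["version"])        # True
# ... even though the graph content (and thus its hash) differs.
print(graph_digest(job_a) == graph_digest(job_b))  # False
```

An automatically computed digest would flag every extent change as a new revision, which is why the label is left to the author in this sketch.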

m-mohr commented 1 year ago

Aha, I see.

Some thoughts:

What do you think?

Related issue: https://github.com/Open-EO/openeo-api/issues/472

soxofaan commented 1 year ago

As already mentioned, I had the same kind of question in #472 about adding extra PG metadata (e.g. client software version).

m-mohr commented 1 year ago

Closing in favor of #472