Open tiborsimko opened 3 years ago
P.S. In the above, we might read workspace_root_path, or whatever name we shall select :wink:, in all the places.
IMO option 1 looks cleaner. I think the workspace is relevant enough to have its own section.
Currently, we support some input.options, but those are closely tied to certain workflow languages, whereas the workspace would be universal. OTOH, it's true that the CACHE option is directly related to the storage, but still, I think it'd be harder for the end user to set the workspace as an option.
Now that we have the option of using several different POSIX workspaces in which to run workflows, users should be able to configure where they would like to run a given workflow. E.g. one workflow in the default place, another workflow in their EOS home, etc. This configuration should be done in reana.yaml.

Option 1: introduce new top-level section

We can introduce a new section in reana.yaml to express the concept of workspace.

Pros: instead of just writing the POSIX path, we could store more information there, should we need it in the future. Also, the concept of workspace will stand out clearly.

Cons: we would need to amend parsing and REST API protocols due to having a new section.

An example of what this could look like:
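
A minimal sketch of option 1, assuming a hypothetical top-level workspace section with a root_path key. The section name, the key name and the EOS path are illustrative assumptions, not a settled specification:

```yaml
# reana.yaml (sketch): "workspace" and "root_path" are hypothetical names
workspace:
  # POSIX directory under which the workflow workspace would be created;
  # the path itself is only an illustration of "their EOS home"
  root_path: /eos/home/j/johndoe/workspaces
```
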
A future option could be:
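
A sketch of how the same hypothetical section might grow later, assuming an extra type key next to root_path. Both key names and the bucket location are illustrative assumptions:

```yaml
# reana.yaml (sketch): a dedicated section leaves room for extra keys
workspace:
  type: s3                                    # hypothetical storage type
  root_path: s3://example-bucket/workspaces   # illustrative location
```
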

Option 2: use existing options clause

We have the option of not changing reana.yaml and simply using existing clauses, such as parameters or options. Parameters, such as temperature=20c and mass=10g, influence the research results, whilst options, such as cache=off, leave the physics results unchanged and only influence how the workflow is orchestrated. From this point of view, the choice of workspace is more an option than a parameter, since a good reproducible analysis should not depend on where it is run. Hence we could choose options.

Pros: we only add a parameter; the REST API could use the existing vehicle.

Cons: conceptually, the notion of workspace would not stand out so clearly; the workspace configuration would be "hidden" amongst other options. Also, options can be set via CLI options (e.g. reana-client start -o foo=bar), but this cannot be done for the workspace, since it must be initialised beforehand.

Example:
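
A minimal sketch of option 2, assuming the workspace is passed as a workspace_root_path entry in the existing options clause. Placing options under inputs and the EOS path are assumptions made for illustration:

```yaml
# reana.yaml (sketch): workspace expressed as one more workflow option
inputs:
  options:
    # hypothetical option name; the path is illustrative only
    workspace_root_path: /eos/home/j/johndoe/workspaces
```
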
A future option could be:
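
A sketch of a possible non-POSIX value, assuming the storage type would be recognised from the prefix of the value. The option name and the bucket location are illustrative assumptions:

```yaml
# reana.yaml (sketch): storage type inferred from the "s3://" prefix
inputs:
  options:
    workspace_root_path: s3://example-bucket/workspaces
```
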
(The type is inferred from the beginning of the value. Or, if need be, more strings would be added, such as workspace_type: s3. This is basically "flattened" option 1 expressed via options clause.)

Notes

Regardless of which option we shall choose, there is a certain default that should be used in case the user does not set anything. This default will be set by the cluster administrator, but this will be part of another issue.