Open thesuperzapper opened 2 years ago
Related: https://github.com/elyra-ai/elyra/issues/1843 (support global environment variables)
@ptitzler this is similar to https://github.com/elyra-ai/elyra/issues/1843, but I think this proposal is easier and more useful, as it allows actual integration of generic Notebooks/Scripts into pipeline flows.
Right now, Notebooks/Scripts kind of sit on their own, as they are inherently hardcoded and can't update their behavior based on the outputs of upstream nodes.
@akchinSTC @ptitzler I really think we should prioritize this feature, as currently "notebooks" can't really be integrated into Kubeflow/Airflow pipelines in Elyra (without hard coding things like file paths).
Environment variables are probably the easiest way to "parameterize" a notebook, as a cell can contain:
import os
my_input = os.environ["MY_INPUT"]
If we let people set environment variables from upstream outputs, you could do things like chaining an "s3 download" node into a "train ML" Notebook, and parameterize the notebook with an "INPUT_DATA_PATH" environment variable that provides the path of the downloaded s3 data.
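As a rough sketch of what the notebook side could look like, a cell might read its parameters through a small helper that fails loudly when a variable is missing. The helper name `read_param` is hypothetical (not part of Elyra), and `INPUT_DATA_PATH` is just the example variable name from above.

```python
import os


def read_param(name, default=None):
    """Read a pipeline parameter from an environment variable.

    Hypothetical helper for illustration only; not an Elyra API.
    """
    value = os.environ.get(name, default)
    if value is None:
        # Fail early with a clear message instead of a bare KeyError.
        raise RuntimeError(f"required environment variable {name!r} is not set")
    return value


# Example usage inside a notebook cell; the fallback path is a placeholder
# that lets the notebook still run locally outside a pipeline.
input_data_path = read_param("INPUT_DATA_PATH", default="./data/local_sample.csv")
```

The default makes the same notebook usable both standalone and as a pipeline node, which is the main appeal of environment variables here.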
Not sure how relevant this still is, but I had started to work on adding some support for parameters at https://github.com/lresende/elyra/commit/2a4b6328159ba17fbf1a16d2cd4c57a17a9b8c17
Background
After PR https://github.com/elyra-ai/elyra/pull/2350, users can now consume the outputPath's from parent nodes as inputValues, but this feature can't be used by the generic Notebook/Script nodes.
I think the most generic way of passing these outputs to the notebook is using environment variables.
UI Implementation
We would add something similar to the "Please select an output from a parent: " prompt, but to the environment variable setter.
Argo Implementation
We can set the env field from the parent's Argo parameters using {{steps.STEP_NAME.outputs.parameters.PARAM_NAME}}:
NOTE: this example actually passes an "input" using the method proposed in https://github.com/elyra-ai/elyra/issues/2471, rather than an "output"
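To make the template syntax concrete, here is a minimal Python sketch of how a pipeline generator might build such an env entry. The function names, the step name `s3-download`, and the parameter name `output-path` are all hypothetical. How Elyra would actually emit this, and how the value gets wired through Argo, is an assumption.

```python
def argo_step_output_ref(step_name, param_name):
    """Build an Argo reference to an output parameter of an earlier step."""
    # %-formatting leaves the doubled braces untouched, so they end up
    # literally in the result, as Argo's template syntax expects.
    return "{{steps.%s.outputs.parameters.%s}}" % (step_name, param_name)


def env_entry(env_name, step_name, param_name):
    """Build a container `env` item (name/value pair) for the reference.

    Illustrative only; not an Elyra or Argo client API.
    """
    return {"name": env_name, "value": argo_step_output_ref(step_name, param_name)}


# Hypothetical example: expose the upstream download path to the notebook.
entry = env_entry("INPUT_DATA_PATH", "s3-download", "output-path")
```

Here `entry["value"]` holds the unresolved template string; Argo would substitute the real value at run time.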