Closed em-pe closed 3 years ago
We have the following options:
Environment is a group option so it stays. Image is already in the config. Output seems to be good candidate as it's tied to the environment in the first place. Cron-expression is specific but it seems to be tied to the project. dag_name is by default derived from package_name - what do you think @em-pe ? Also we have experiment_name (that's internal but related to the dag_name) that's derived from package_name - should we put it into config as well? wait_for_completion is a runtime option.
Here are my suggestions: experiment_name -> put in config, default to package name image -> put in config, allow override through optional console param output -> put in config, allow to override through console param cron expression -> put in config, default to @daily dag_name -> put in config, default to package name, allow to override through console param wait for completion -> put in config, allow override through console param
Regarding defaults, have a look at templated config loader - we could add env variable that we populate with package_name
, and instead of custom default handling we just initialise project with config file that explicitly configures all the defaults.
experiment_name: {package_name}
cron expression -> put in config, default to @daily
@em-pe We should use it only for schedule,
or also for upload-pipeline
? Assuming run-once
should ignore this config parameter.
I am a bit reluctant to put wait_for_completion
to config, as it would be hard to force case when you don't want plugin to wait (since parameter takes int describing timeout rather than bool flag).
@michalzelechowski-getindata let's use it for both, we may then get rid of the schedule
command I believe. Regarding wait for completion, let's leave it as CLI option now. In kedro-kubeflow the behavior is slightly different as there is an option to submit pipeline for execution and be done without waiting for it to complete.
@em-pe I think it makes sense for the schedule
to allow for override from CLI, in contrary to upload-pipeline
so I suggest we keep it separate for better flexibility and alignment with kedro-kubeflow.
@michalzelechowski-getindata sounds good to me
There are some configuration options required by commands that are tied to the environment. To make the work of the plugin easier, an allow storing some of the configuration in VCS we could move them to config file (
airflow-k8s.yml
).To make the configuration more robust and dynamic we could reuse TemplatedConfigLoader hook. Check how we do it in kedro-kubeflow-example for reference.
References from kedro-kubeflow:
Before implementing we may have a short discussion which options do we move to config and which of them we allow to override with console call.