Open shalberd opened 1 year ago
> So it would be bad if the pipeline editor and runtime support for Airflow were removed.
This statement should be clarified: runtime support for Airflow was not removed from the Elyra package in the ODH Elyra notebook images that are built and supported as part of ODH Core. We only restrict the Elyra PipelinesProcessor to kfp (Data Science Pipelines), since that is what ODH supports.
There is no official support for Airflow in ODH as the integration is currently an ODH Contrib component (https://github.com/opendatahub-io-contrib/airflow-on-openshift) with no guarantee that the deployment works.
> At least allow for optionally enabling it via Configmap or ENV variable, based on this
Although ODH does not officially support Airflow, you can still build a custom notebook image that has the Airflow pipelines processor enabled and import it into ODH Dashboard. Based on the offline comment by @harshad16, you can build an Elyra notebook image with the Airflow pipelines processor by modifying jupyter_elyra_config.py and building the notebook
If the Elyra Airflow notebook image works with the deployment of airflow-on-openshift in odh-contrib, then you could submit a PR for review to odh-contrib/workbench-images.
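For illustration only, here is a rough sketch of how the jupyter_elyra_config.py change mentioned above could be combined with the ENV-variable idea from the issue. It is not the config shipped in the ODH images. It assumes that Elyra's runtime restriction is set through `c.PipelineProcessorRegistry.runtimes` (please verify against the Elyra version in your image), and the variable name `NOTEBOOK_ELYRA_RUNTIMES` is hypothetical. A ConfigMap could feed that variable into the notebook pod.

```python
import os

# Hypothetical variable name; neither Elyra nor ODH defines it.
ENV_VAR = "NOTEBOOK_ELYRA_RUNTIMES"
# Runtime names discussed in this issue.
KNOWN_RUNTIMES = ("kfp", "airflow")
# Keep the current ODH behaviour (kfp only) unless told otherwise.
DEFAULT_RUNTIMES = ["kfp"]


def enabled_runtimes(environ=os.environ):
    """Parse a comma-separated runtime list, e.g. "kfp,airflow"."""
    raw = environ.get(ENV_VAR, "")
    selected = []
    for name in raw.split(","):
        name = name.strip().lower()
        # Ignore blanks, unknown names, and duplicates.
        if name in KNOWN_RUNTIMES and name not in selected:
            selected.append(name)
    # Fall back to the default when nothing valid was requested.
    return selected or list(DEFAULT_RUNTIMES)


# Jupyter injects get_config() when it loads a config file; the guard
# lets this file also be imported outside Jupyter (e.g. in a test).
if "get_config" in globals():
    c = get_config()
    # Assumption: this is the trait Elyra reads to limit processors.
    c.PipelineProcessorRegistry.runtimes = enabled_runtimes()
```

With this sketch, an image started without the variable would behave as today (kfp only), while setting `NOTEBOOK_ELYRA_RUNTIMES=kfp,airflow` on the workbench would also enable the Airflow processor.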
@LaVLaS @harshad16 Airflow itself is no problem. For now, I have started talking to the Red Hat folks about what makes it run successfully as a whole (never use the mucked-up postgres image that comes with it; use a decent way of running postgres, such as Crunchy Postgres via OLM), among other things.
https://github.com/opendatahub-io-contrib/airflow-on-openshift/issues/7#issuecomment-1599585068
Airflow was made an optional tier-2 part of ODH in the summer of 2022.
https://github.com/opendatahub-io-contrib/airflow-on-openshift
Recently, Elyra became a part of ODH via overlay. Even more recently, Elyra itself has been taken over by Red Hat (from IBM).
https://github.com/opendatahub-io/notebooks/pull/58#issuecomment-1562378131
Because ODH's top-tier focus is Kubeflow Pipelines, ODH wants Elyra to support Kubeflow Pipelines only.
Elyra has long had Airflow support in several forms:
- Airflow-specific operators: https://medium.com/ibm-data-ai/getting-started-with-apache-airflow-operators-in-elyra-aae882f80c4a
- Generic pipelines: https://medium.com/ibm-data-ai/automate-your-machine-learning-workflow-tasks-using-elyra-and-apache-airflow-adf297adc455

Airflow 2.x support is still lacking but will come; some tweaks are needed, e.g. for generic pipeline to DAG rendering, since libraries have changed :-)
So it would be bad if the pipeline editor and runtime support for Airflow were removed. At least allow for optionally enabling it via Configmap or ENV variable, based on this
Background:
We plan to use both: data science pipelines / Kubeflow Pipelines for pure ML development and Airflow for more of an ETL / data engineering set of tasks.