elyra-ai / elyra

Elyra extends JupyterLab with an AI centric approach.
https://elyra.readthedocs.io/en/stable/
Apache License 2.0
1.86k stars 344 forks source link

make generic pipelines to DAG rendering work with Airflow 2.x #3166

Open shalberd opened 1 year ago

shalberd commented 1 year ago

Describe the issue Currently, even when using generic pipeline editor with python files and notebooks, the rendered DAG is not compatible with Airflow 2..6.0 or higher, or any Airflow 2.x, for that matter :-) Broken DAG: [/opt/airflow/dags/repo/SvensWetterapprootpvc-0202162036.py] Traceback (most recent call last): File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed File "/opt/airflow/dags/repo/SvensWetterapprootpvc-0202162036.py", line 2, in <module> from airflow.contrib.kubernetes.volume_mount import VolumeMount ModuleNotFoundError: No module named 'airflow.contrib.kubernetes'

To Reproduce Steps to reproduce the behavior:

  1. Create a generic pipeline
  2. Submit it to the Airflow runtime with Airflow 2.x

Screenshots or log output If applicable, add screenshots or log output to help explain your problem.

Log Output
Paste the log output here.

Expected behavior DAg is rendered correctly for use in Airflow 2.x

Starting with jinja template https://github.com/elyra-ai/elyra/blob/main/elyra/templates/airflow/airflow_template.jinja2

and elyra/elyra/pipeline/airflow/processor_airflow.py

possible hints https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/stable/operators.html

https://airflow.apache.org/docs/apache-airflow/2.5.3/howto/upgrading-from-1-10/index.html

https://airflow.apache.org/docs/apache-airflow/2.5.3/core-concepts/dags.html

https://github.com/apache/airflow/pull/21653

Deployment information Describe what you've deployed and how:

ianonavy commented 1 year ago

My team was able to get generic pipeline DAGs to work with Airflow 2.5.3 on this commit (https://github.com/kflow-ai/elyra/commit/f9d132954e008d30145f18794aa543d97f121a5f). We don't have unit tests working locally, so this commit isn't ready for a PR. Perhaps we can work with an active maintainer to get this in proper shape for an upstream merge?

Off the top of my head, we need automated tests to work and potentially test with both earlier and later versions of Airflow. My understanding from this comment is we may actually want to take a different approach to keep Airflow 1 support at the same time.

screenshot of Airflow 2.5.3 showing a working pipeline generated from a dev build of Elyra

shalberd commented 1 year ago

Do we really need to test it with Airflow 1.x?

By the way, your code looks really good, I have almost exactly that in my local code base right now :-)

Two differences:

https://github.com/apache/airflow/pull/21653/files

days_ago will not be there anymore in Airflow 3.x, so import pendulum

instead of

from airflow.utils.dates import days_ago

and

schedule='@once'

instead of schedule_interval='@once'

and

start_date=pendulum.today('UTC').add(days=-1),

instead of

start_date=days_ago(1),

shalberd commented 1 year ago

@ianonavy yup, we will certainly work with @lresende I am also sporadically in contact with @thesuperzapper who is involved in both KF Pipelines and airflow helm charts.

shalberd commented 10 months ago

@ianonavy "Off the top of my head, we need automated tests to work and potentially test with both earlier and later versions of Airflow." we will deprecate airflow 1.x support with elyra 4.0 and I am thinking of adding more unit tests aside from the existing ones, too.

so far, I only had to modify one existing test in community elyra for it to work https://github.com/elyra-ai/elyra/pull/3167/files#diff-f4fbb5d86d69f450c9235421616ec18e28f22eb174fb97332ba1952026c0a9f6 Do you think more than that unit test change or additonal unit test adding is needed? The existing Elyra tests still all run though in PR 3167 and spcifically https://github.com/elyra-ai/elyra/actions/runs/7505710196/job/20435558496?pr=3167#step:6:506

as mentioned, we are assuming full switchover to airflow 2.x in Elyra 4.0. Discussed with and okayed by the maintainers.