airflow-helm / charts

The User-Community Airflow Helm Chart is the standard way to deploy Apache Airflow on Kubernetes with Helm. Originally created in 2017, it has since helped thousands of companies create production-ready deployments of Airflow on Kubernetes.
https://github.com/airflow-helm/charts/tree/main/charts/airflow
Apache License 2.0
634 stars, 474 forks

Airflow 2.3.0 Support #572

Closed c-thiel closed 2 years ago

c-thiel commented 2 years ago

Motivation

The Chart is currently not compatible with Airflow 2.3.0 due to changes in the Database Configuration: https://airflow.apache.org/docs/apache-airflow/stable/release_notes.html#database-configuration-moved-to-new-section-22284

c-thiel commented 2 years ago

When updating the image to 2.3.0 I get the following Error during CheckDB:

Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 5, in <module>
    from airflow.__main__ import main
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/__init__.py", line 35, in <module>
    from airflow import settings
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/settings.py", line 35, in <module>
    from airflow.configuration import AIRFLOW_HOME, WEBSERVER_CONFIG, conf  # NOQA F401
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/configuration.py", line 1345, in <module>
    conf.validate()
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/configuration.py", line 293, in validate
    self._validate_config_dependencies()
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/configuration.py", line 387, in _validate_config_dependencies
    raise AirflowConfigException(f"error: cannot use sqlite with the {self.get('core', 'executor')}")
airflow.exceptions.AirflowConfigException: error: cannot use sqlite with the KubernetesExecutor

Actually, according to https://airflow.apache.org/docs/apache-airflow/stable/release_notes.html#database-configuration-moved-to-new-section-22284, it should just throw a deprecation warning and not fail. Not sure what I am missing.
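
To illustrate the behavior the release notes describe (the old [core] location is still honored, but with a deprecation warning), here is a minimal Python sketch. It is not Airflow's actual implementation: `DEPRECATED_OPTIONS`, `env_var_name`, and `get_option` are hypothetical names, and only the option discussed in this thread is mapped.

```python
import warnings

# Hypothetical mapping of new (section, key) -> deprecated (section, key).
# Only the option discussed in this thread is listed.
DEPRECATED_OPTIONS = {
    ("database", "sql_alchemy_conn"): ("core", "sql_alchemy_conn"),
}


def env_var_name(section, key):
    """Build an Airflow-style variable name: AIRFLOW__{SECTION}__{KEY}."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def get_option(section, key, environ):
    """Return the option value, falling back to its deprecated location."""
    name = env_var_name(section, key)
    if name in environ:
        return environ[name]

    old = DEPRECATED_OPTIONS.get((section, key))
    if old is not None:
        old_name = env_var_name(*old)
        if old_name in environ:
            # Old location still works, but the caller is told to migrate.
            warnings.warn(
                f"{old_name} is deprecated; use {name} instead",
                DeprecationWarning,
                stacklevel=2,
            )
            return environ[old_name]

    # Neither location is set; a real system would apply a default here.
    return None
```

With only `AIRFLOW__CORE__SQL_ALCHEMY_CONN` set, `get_option("database", "sql_alchemy_conn", environ)` returns the old value and emits a `DeprecationWarning`. The traceback above suggests that in 2.3.0 the lookup instead fell through to the sqlite default, which the KubernetesExecutor check rejects.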

c-thiel commented 2 years ago

I traced the broken backward compatibility over to Airflow and created an Issue: https://github.com/apache/airflow/issues/23408

Nevertheless, we should use AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_CMD instead of AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD.

c-thiel commented 2 years ago

Until this is fixed, you can use the following workaround:

airflow:
  config:
    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_CMD: "bash -c 'eval \"$DATABASE_SQLALCHEMY_CMD\"'"
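
For context on why the workaround wraps the command in `bash -c`: a `_CMD` option holds a command whose standard output becomes the option's value. The Python sketch below assumes the command string is split into arguments and executed without a shell, and that surrounding whitespace is stripped from the output. `resolve_cmd_option` is a hypothetical helper for illustration, not part of Airflow or the chart.

```python
import shlex
import subprocess


def resolve_cmd_option(command, env=None):
    """Run `command` without a shell and return its stripped stdout.

    Assumption: no shell is involved, so constructs such as `$VAR` or
    `eval` only work if the command itself starts a shell (for example
    `bash -c '...'`).
    """
    result = subprocess.run(
        shlex.split(command),  # split into argv; no shell expansion happens
        capture_output=True,
        text=True,
        check=True,            # raise CalledProcessError on a non-zero exit
        env=env,
    )
    return result.stdout.strip()
```

Under that assumption, the workaround value splits into `bash`, `-c`, and `eval "$DATABASE_SQLALCHEMY_CMD"`, so bash expands the `DATABASE_SQLALCHEMY_CMD` variable from the workaround, evaluates it, and prints the connection string.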

martinrw commented 2 years ago

Thanks for this @c-thiel - your suggestion helped. My application is now up and running version 2.3.0 using the helm chart and seems to be working fine but I have one pod called "sync users" that keeps restarting.

It appears to be trying to run this script: /mnt/scripts/sync_users.py

and eventually fails with this error: sqlalchemy.exc.InvalidRequestError: Multiple classes found for path "Permission" in the registry of this declarative base. Please use a fully module-qualified path.

So maybe there is something more that also needs to be changed? I am using OAuth for user logins, so maybe that's why I'm not affected by this.

Linux-oiD commented 2 years ago

Could you also add support for the separate DagProcessorManager deployment that was introduced in 2.3.0? https://airflow.apache.org/blog/airflow-2.3.0/#dagprocessormanager-as-standalone-process-aip-43 Or should I create a separate issue for that?

denysivanov commented 2 years ago

Really need this one!

kvin007 commented 2 years ago

Looking forward to this feature

OlexandrRudenko commented 2 years ago

It will be great

mr-wolf-rsh commented 2 years ago

Stuck with this too, hoping for this feature to be released soon

Zakolesnik commented 2 years ago

Waiting for this feature

mmeza09 commented 2 years ago

Waiting for this feature

PamelaSofiaCastillo commented 2 years ago

I hope this will be released soon

JoHermoza commented 2 years ago

Hoping for this feature

CarrascoEr commented 2 years ago

Waiting for this feature

diego-santamaria commented 2 years ago

This will be really helpful.

rociofc0312 commented 2 years ago

Looking forward to this feature

chrisalo97 commented 2 years ago

Will be great to have this feature

Alexsander00 commented 2 years ago

Waiting for this feature too

GianFNoguni commented 2 years ago

Waiting for this feature!

emilymitacc commented 2 years ago

Waiting to start to use this feature

gmontero06 commented 2 years ago

Waiting for the release of this feature 🙏

LuisEspinoza212 commented 2 years ago

Really need this one

thesuperzapper commented 2 years ago

Hey all,

I plan to release chart version 8.6.1 with a fix for the main issues preventing Airflow 2.3 from working, allowing people to choose an Airflow 2.3.0 image if they want.

However, I think we should wait until chart version 8.7.0 before making Airflow 2.3.X the default version. This will give us time to implement support for Airflow 2.3 specific features, and wait for airflow itself to find and patch any critical problems with 2.3.0 before encouraging half the world to update.

PS: If anyone knows of any other Airflow 2.3 features that we currently don't support, please comment here.

Valenzione commented 2 years ago

@thesuperzapper That's great, thank you for your work! Do you have any time estimates in mind for the 2.3.0 support release date?

thesuperzapper commented 2 years ago

@Valenzione I expect to release 8.6.1 in the next few days, which will allow using airflow 2.3.0 images but not make it default.

Track progress on the 8.6.1 milestone. (NOTE: I haven't added the 2.3.0 issue fixes to that milestone yet)


Regarding the 8.7.0 release, I want at least one large feature (in addition to making 2.3.0 the default) so that people have a reason to update.

Right now, my hope is to finish the new task-aware celery autoscaler which allows safe and configurable up/downscaling of CeleryExecutor workers (including selectively removing the inactive worker Pods when downscaling). The prototype is coming along, but I haven't had much time to put the final polish on yet.


PS: 8.6.1 is not quite ready yet because I spent the last week improving the docs. The best change is that we now provide sample starting points of custom-values.yaml for each executor type: CeleryExecutor, KubernetesExecutor, and CeleryKubernetesExecutor.

I also made large updates to the Quickstart Guide and to the pages "How to manage airflow connections?", "How to load DAG definitions?", and "How to persist airflow logs?".

thesuperzapper commented 2 years ago

Hey all, I am very sorry about the delay. I have been quite bogged down this week and last (Airflow Summit and the like).

However, we now have resolutions for the main problems caused by 2.3.0, so should be ready to cut 8.6.1:

If anyone knows of other issues from 2.3+ not listed above please comment ASAP, so we can get them fixed in 8.6.1!

jurovee commented 2 years ago

@thesuperzapper I think we should give it a go. Let's test it in our environments and provide feedback if necessary.

Disclaimer: We struggle a little bit with one of the bugs in Airflow 2.2.* which should be fixed in 2.3.1 hopefully, so no pressure! 😄😄

karakanb commented 2 years ago

@juroVee not sure if it helps, but I have managed to upgrade from v2.2.5 to v2.3.2 with the chart version v8.6.0 without any changes in my values file. The breaking changes that were introduced in v2.3.0 seem to be fixed in the subsequent versions, so the upgrade itself works fine from this chart's perspective.

jurovee commented 2 years ago

@karakanb good to know! 👏 Thanks for the info. I think that 8.6.1 is almost ready and it should contain a fix for one of my issues with logs and external volumeMounts, so I think I'll just wait to have everything together + 2.3.2 in a single package 👍

thesuperzapper commented 2 years ago

@karakanb @juroVee there is still one issue with the sync-users script in airflow 2.3.2 which will require the 8.6.1 release (as they actually moved some of the imports).

NOTE: this is fixed by https://github.com/airflow-helm/charts/pull/592

thesuperzapper commented 2 years ago

Hey all, the long-awaited 8.6.1 is out now with support for Airflow 2.3.0!

I would love it if you tested it out, and shared feedback!

martinrw commented 2 years ago

Just gave it a test. I set my Dockerfile to use apache/airflow:2.3.2-python3.9 as the base image, with Helm chart version 8.6.1, and it works perfectly. Thanks!!!