meltano / airflow-ext

Meltano Airflow utility extension
Apache License 2.0

bug: Meltano stops working with Airflow 2.7.0 #45

Open sswander opened 1 year ago

sswander commented 1 year ago

Meltano Version

2.19.0

Python Version

3.10

Bug scope

Configuration (settings parsing, validation, etc.)

Operating System

macOS Ventura 13.5.1 (22G90)

Description

Airflow 2.6.3 had some vulnerabilities that were fixed in 2.7.0, so we attempted to upgrade Meltano to use this version by setting apache-airflow==2.7.0 in the pip_url.

Afterwards, with logging set to debug, calling meltano install followed by meltano invoke airflow --help returns this error:

DEBUG:root:Creating engine '<meltano.core.project.Project object at 0x10375f7c0>@postgresql://localhost/meltano_system'
DEBUG:meltano.core.project_plugins_service:{'plugin': 'airflow', 'parent': 'airflow', 'source': <DefinitionSource.LOCKFILE: 8>, 'event': 'Found plugin parent', 'level': 'debug', 'timestamp': '2023-08-31T04:42:11.773851Z'}
DEBUG:root:Invoking: ['/Users/<redacted>/.meltano/orchestrators/airflow/venv/bin/airflow', '--help']
DEBUG:root:Generated default '/Users/<redacted>/.meltano/run/airflow/airflow.cfg'
DEBUG:meltano.cli.utils:Need help fixing this problem? Visit http://melta.no/ for troubleshooting steps, or to
join our friendly Slack community.

[Errno 2] No such file or directory: '/Users/<redacted>/.meltano/run/airflow/airflow.cfg'

Debugging

https://github.com/meltano/meltano/blob/e4bdaedab02462e9e19a1bf063cbce26bc3c7581/src/meltano/core/plugin/airflow.py#L114-L116

This code seems to run airflow --help to generate airflow.cfg. Since 2.7.0, airflow --help no longer creates the airflow.cfg file. This is an intended change; the previous behaviour was accidental. See https://github.com/apache/airflow/discussions/33951#discussioncomment-6878007

The suggested fix is to use airflow config list --defaults to create airflow.cfg intentionally. See https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html
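A minimal sketch of what that fix could look like in Python, assuming airflow config list --defaults prints the default configuration to stdout, as the linked how-to describes. The function name write_default_airflow_cfg and its parameters are hypothetical and are not taken from the Meltano or airflow-ext code base.

```python
import subprocess
from pathlib import Path


def write_default_airflow_cfg(airflow_bin, cfg_path, run=subprocess.run):
    """Generate airflow.cfg explicitly instead of relying on `airflow --help`.

    Hypothetical helper: `airflow_bin` is the path to the airflow executable
    in the plugin's virtualenv, `cfg_path` is where airflow.cfg should live.
    `run` is injectable so the helper can be exercised without Airflow installed.
    """
    cfg_path = Path(cfg_path)
    # Make sure the run directory exists before writing the file.
    cfg_path.parent.mkdir(parents=True, exist_ok=True)

    # Assumption: Airflow >= 2.7.0, where `config list --defaults`
    # writes the default configuration to stdout.
    result = run(
        [str(airflow_bin), "config", "list", "--defaults"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )

    cfg_path.write_text(result.stdout, encoding="utf-8")
    return cfg_path
```

Something like write_default_airflow_cfg(".../venv/bin/airflow", ".../run/airflow/airflow.cfg") would then replace the airflow --help call. Older Airflow versions may not support the --defaults flag, so a real fix would probably need a version check or a fallback.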

Code

No response

tayloramurphy commented 1 year ago

@sswander are you using Airflow as an orchestrator plugin or utility plugin (i.e. which key is it under in your meltano.yml)? We don't recommend using the orchestrator plugin type any more and the docs should all say to use it as a utility.

Either way, the bug is still a bug since the Airflow extension which runs as a utility has the same behavior.

https://github.com/meltano/airflow-ext/blob/main/airflow_ext/wrapper.py#L132

tayloramurphy commented 1 year ago

FYI @pnadolny13. When we upgrade the version of Airflow in this extension we'll need to make this fix.

sswander commented 1 year ago

@tayloramurphy thanks for checking this! We're using it as orchestrator in meltano.yml. TIL utility is recommended over orchestrator now; for my team's and my learning, do you mind pointing me to a doc or discussion that led to this?

And to change it to utility, is it a matter of simply changing the key in meltano.yml or is there any special care we have to take?

tayloramurphy commented 1 year ago

@sswander it was mainly to help with code maintainability, as the orchestrator implementation was very tightly coupled with Meltano, which made iterating on it harder.

To migrate, you would have to uninstall the orchestrator and install the utility. There is probably a way to migrate it, but I think @edgarrmondragon or @pnadolny13 would know better.

sswander commented 1 year ago

Thanks @tayloramurphy , this is valuable info 🙏