We are using Airflow with a MySQL database and a RabbitMQ Celery broker.
The SQLAlchemy connection pool is disabled.
After upgrading from Airflow 2.2.3 to 2.3.1, we get the following error when using the Airflow web interface:
Python version: 3.7.13
Airflow version: 2.3.1
Node: %HOST%
-------------------------------------------------------------------------------
Traceback (most recent call last):
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/sqlalchemy/util/_collections.py", line 1008, in __call__
return self.registry[key]
KeyError: <greenlet.greenlet object at 0x7f18c5080a10 (otid=0x7f18c62669b0) current active started main>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask/app.py", line 2448, in wsgi_app
response = self.full_dispatch_request()
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask/app.py", line 1953, in full_dispatch_request
return self.finalize_request(rv)
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask/app.py", line 1970, in finalize_request
response = self.process_response(response)
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask/app.py", line 2269, in process_response
self.session_interface.save_session(self, ctx.session, response)
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/airflow/www/session.py", line 33, in save_session
return super().save_session(*args, **kwargs)
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_session/sessions.py", line 554, in save_session
saved_session = self.sql_session_model.query.filter_by(
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_sqlalchemy/__init__.py", line 552, in __get__
return type.query_class(mapper, session=self.sa.session())
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/sqlalchemy/orm/scoping.py", line 129, in __call__
return self.registry()
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/sqlalchemy/util/_collections.py", line 1010, in __call__
return self.registry.setdefault(key, self.createfunc())
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 4058, in __call__
return self.class_(**local_kw)
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_sqlalchemy/__init__.py", line 176, in __init__
bind = options.pop('bind', None) or db.engine
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_sqlalchemy/__init__.py", line 1000, in engine
return self.get_engine()
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_sqlalchemy/__init__.py", line 1019, in get_engine
return connector.get_engine()
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_sqlalchemy/__init__.py", line 596, in get_engine
self._engine = rv = self._sa.create_engine(sa_url, options)
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_sqlalchemy/__init__.py", line 1029, in create_engine
return sqlalchemy.create_engine(sa_url, **engine_opts)
File "<string>", line 2, in create_engine
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/sqlalchemy/util/deprecations.py", line 298, in warned
return fn(*args, **kwargs)
File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/sqlalchemy/engine/create.py", line 646, in create_engine
engineclass.__name__,
TypeError: Invalid argument(s) 'pool_size' sent to create_engine(), using configuration MySQLDialect_mysqldb/NullPool/Engine. Please check that the keyword arguments are appropriate for this combination of components.
After changing the environment variable AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_ENABLED (or the airflow.cfg parameter sql_alchemy_pool_enabled, if the configuration is not set via environment variables) from False to True, the error is resolved.
What you think should happen instead
I think we should still be able to disable SQLAlchemy pooling, since the option exists. Besides, this works in Airflow 2.2.3. Somehow the pool_size option gets passed to SQLAlchemy's create_engine function even when pooling is disabled via the environment variable.
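One way to express the expected behaviour is to drop pool-sizing arguments before they reach create_engine whenever pooling is disabled. The sketch below is only an assumption about how such filtering could look, not Airflow's actual code; `engine_kwargs` and `POOL_ONLY_KEYS` are hypothetical names, and the key list is an assumed one.

```python
# Keyword arguments that only make sense for a sized connection pool.
# This list is an assumption for the sketch; adjust it to the pool class in use.
POOL_ONLY_KEYS = {"pool_size", "max_overflow", "pool_timeout"}


def engine_kwargs(pool_enabled, options):
    """Return a copy of options that is safe to pass when pooling is off."""
    if pool_enabled:
        # Pooling is on: pass everything through unchanged.
        return dict(options)
    # Pooling is off: strip the arguments a non-sized pool would reject.
    return {k: v for k, v in options.items() if k not in POOL_ONLY_KEYS}


# Hypothetical option values, for illustration only.
opts = {"pool_size": 5, "pool_recycle": 1800}
print(engine_kwargs(False, opts))
print(engine_kwargs(True, opts))
```

With pooling disabled, pool_size is removed while pool_recycle is kept; with pooling enabled, the options are returned unchanged.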
How to reproduce
Here is our airflow.cfg (without sensitive info): airflow.cfg.zip
Here is our pip freeze output
To reproduce, run airflow webserver with the SQLAlchemy connection pool disabled.
MySQL database, RabbitMQ Celery broker.
The deployment type doesn't matter for this error: it reproduces both on a fully deployed stage (Airflow components running in containers built from our own Dockerfile) and on a developer's laptop, with a venv and the airflow webserver command.
Anything else
After further investigation, we found that create_engine receives the following options:
The pool_size and pool_recycle options weren't set by us, so they must have come from default values.
It seems that the error occurs in the create_app function (airflow/www/app.py:71), and that the pool_size parameter comes from the apply_driver_hacks method of the SQLAlchemy class (flask_sqlalchemy/__init__.py:937).
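The suspected mechanism can be sketched in plain Python. This is a simplified stand-in, not the actual flask_sqlalchemy code: the function name `apply_driver_defaults` and the default values are illustrative assumptions. The point is only that a `setdefault`-style hook adds pool options whether or not a NullPool was requested.

```python
def apply_driver_defaults(drivername, options):
    """Hypothetical stand-in for a driver hook that fills in pool defaults.

    The values below are placeholders chosen for illustration only.
    """
    if drivername.startswith("mysql"):
        # setdefault only checks whether the key is already present;
        # it does not look at which pool class the caller asked for.
        options.setdefault("pool_size", 10)
        options.setdefault("pool_recycle", 7200)
    return options


# Pooling disabled: the caller requests a non-pooling class and sets no sizes.
# The string "NullPool" is just a marker here, not the real class.
engine_options = {"poolclass": "NullPool"}
apply_driver_defaults("mysql+mysqldb", engine_options)

# pool_size is now present even though it was never configured.
print(sorted(engine_options))
```

If the resulting dictionary is then forwarded to create_engine together with a non-pooling pool class, it would explain the TypeError in the traceback.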
Apache Airflow version
2.3.1 (latest released)
Operating System
Ubuntu 20.04.4 LTS
Versions of Apache Airflow Providers
Deployment
Other Docker-based deployment