ssYkse opened this issue 4 years ago
Does the database run in the container "airflow-main-service"?
Issue solved?
Hi,
I have also observed the same issue in my setup recently. I am running Airflow 1.10.4 with MariaDB as the database, and a few of the tasks fail with the following error:
[2020-09-20 14:46:09,719] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip /usr/local/lib/python3.6/site-packages/airflow/config_templates/airflow_local_settings.py:65: DeprecationWarning: The elasticsearch_host option in [elasticsearch] has been renamed to host - the old setting has been used, but please update your config.
[2020-09-20 14:46:09,719] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip ELASTICSEARCH_HOST = conf.get('elasticsearch', 'HOST')
[2020-09-20 14:46:09,719] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip /usr/local/lib/python3.6/site-packages/airflow/config_templates/airflow_local_settings.py:67: DeprecationWarning: The elasticsearch_log_id_template option in [elasticsearch] has been renamed to log_id_template - the old setting has been used, but please update your config.
[2020-09-20 14:46:09,719] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip ELASTICSEARCH_LOG_ID_TEMPLATE = conf.get('elasticsearch', 'LOG_ID_TEMPLATE')
[2020-09-20 14:46:09,719] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip /usr/local/lib/python3.6/site-packages/airflow/config_templates/airflow_local_settings.py:69: DeprecationWarning: The elasticsearch_end_of_log_mark option in [elasticsearch] has been renamed to end_of_log_mark - the old setting has been used, but please update your config.
[2020-09-20 14:46:09,719] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip ELASTICSEARCH_END_OF_LOG_MARK = conf.get('elasticsearch', 'END_OF_LOG_MARK')
[2020-09-20 14:46:10,119] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip [2020-09-20 14:46:10,119] {settings.py:213} INFO - settings.configure_orm(): Using pool settings. pool_size=50, max_overflow=10, pool_recycle=5400, pid=94919
[2020-09-20 14:46:15,218] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip [2020-09-20 14:46:15,217] {__init__.py:51} INFO - Using executor LocalExecutor
[2020-09-20 14:46:16,627] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip [2020-09-20 14:46:16,626] {dagbag.py:90} INFO - Filling up the DagBag from /home/orbdviz/datavisualization/workspace/dags/nc_pipe.py
[2020-09-20 14:48:24,891] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip Traceback (most recent call last):
[2020-09-20 14:48:24,891] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 2285, in _wrap_pool_connect
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip return fn()
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 363, in connect
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip return _ConnectionFairy._checkout(self)
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 773, in _checkout
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip fairy = _ConnectionRecord.checkout(pool)
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 492, in checkout
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip rec = pool._do_get()
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/impl.py", line 238, in _do_get
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip return self._create_connection()
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 308, in _create_connection
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip return _ConnectionRecord(self)
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 437, in __init__
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip self.__connect(first_connect_check=True)
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 657, in __connect
[2020-09-20 14:48:24,892] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip pool.logger.debug("Error on connect(): %s", e)
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 69, in __exit__
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip exc_value, with_traceback=exc_tb,
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip raise exception
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 652, in __connect
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip connection = pool._invoke_creator(self)
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip return dialect.connect(*cargs, **cparams)
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 488, in connect
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip return self.dbapi.connect(*cargs, **cparams)
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/MySQLdb/__init__.py", line 85, in Connect
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip return Connection(*args, **kwargs)
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/lib/python3.6/site-packages/MySQLdb/connections.py", line 208, in __init__
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip super(Connection, self).__init__(*args, **kwargs2)
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip _mysql_exceptions.OperationalError: (2006, "Can't connect to MySQL server on 'ctc2hz1-02-s40.uhc.com' (115)")
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip The above exception was the direct cause of the following exception:
[2020-09-20 14:48:24,893] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip
[2020-09-20 14:48:24,894] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip Traceback (most recent call last):
[2020-09-20 14:48:24,894] {base_task_runner.py:115} INFO - Job 2087: Subtask load_raw_tables_ip File "/usr/local/bin/airflow", line 32, in <module>
The strange part is that I checked the database for a max-connections problem, but the number of connections Airflow had opened on the database was very low. When I re-run the jobs, they start working fine after 2-3 retries. Could you please guide us on how to proceed with this?
Thanks
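For anyone hitting the same intermittent error 2006 and wanting to rule out a connection-limit problem, a quick check is to compare the server's live connection count against its max_connections. Below is only a minimal sketch, using the same MySQLdb driver that appears in the traceback; the host name, credentials, and helper name are placeholders, not anything from this issue:

```python
# Hypothetical helper to sanity-check the "max connections" theory above.
# Host/credentials are placeholders -- point them at the same MariaDB instance
# that sql_alchemy_conn in airflow.cfg uses.
import MySQLdb  # same DB-API driver shown in the traceback


def connection_headroom(host, user, passwd, db):
    """Return (current_connections, max_connections) for the server."""
    conn = MySQLdb.connect(host=host, user=user, passwd=passwd, db=db)
    try:
        cur = conn.cursor()
        cur.execute("SHOW STATUS LIKE 'Threads_connected'")
        current = int(cur.fetchone()[1])
        cur.execute("SHOW VARIABLES LIKE 'max_connections'")
        maximum = int(cur.fetchone()[1])
        return current, maximum
    finally:
        conn.close()


if __name__ == "__main__":
    current, maximum = connection_headroom("mariadb-host", "airflow", "***", "airflow")
    print(f"{current} of {maximum} connections in use")
```

If the first number stays well below the second (as reported above), the connection limit is probably not the culprit, and the error looks more like a transient network or server-side timeout.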
Hi, I am running 3 containers on Kubernetes:
1) PostgreSQL
2) Airflow scheduler with a git-sync sidecar
3) Airflow webserver with a git-sync sidecar
Everything works like a charm for about 30 seconds to 3 minutes. Then I always get the same error.
I see the same error on both the scheduler and the webserver. What could the problem be? I checked the Postgres container; no issue there. I kept a port-forwarded connection to Postgres open for a few hours and never had any hangup or disconnect. I also opened a shell in the scheduler container, connected to the Postgres container from a quick Python interactive session (I forget the exact details), and let it run for 30 minutes with no problems.
But both Airflow services keep losing their connection.
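For reference, the in-container check described above can be reproduced with a short script instead of an interactive session. This is only a rough sketch of that test, assuming psycopg2 is available in the scheduler image; the service name and credentials are placeholders:

```python
# Rough reconstruction of the long-running connectivity check described above
# (the original "quick python interactive session" details were not given).
import time

import psycopg2  # assumes the driver is installed in the scheduler image

conn = psycopg2.connect(
    host="postgres-service",  # placeholder: the Postgres service/DNS name
    dbname="airflow",
    user="airflow",
    password="***",
)
conn.autocommit = True

start = time.time()
while time.time() - start < 30 * 60:  # hold the session open for ~30 minutes
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        cur.fetchone()
    time.sleep(10)

conn.close()
print("no disconnects observed")
```

If a plain connection like this stays healthy while the scheduler and webserver keep dropping theirs, that points away from raw pod-to-pod networking and toward how the long-lived Airflow processes manage their pooled connections.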