apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0
36.73k stars 14.22k forks source link

Scheduler Deadlock in tests for MySQL 5.7 #11543

Closed potiuk closed 3 years ago

potiuk commented 4 years ago

We have the first example of scheduler deadlock in the willd for MySQL for the new scheduler:

https://github.com/apache/airflow/pull/8550/checks?check_run_id=1258193958#step:6:1439

Would be great to track it down and fix before the prod release:

NOTE For now the issue has been mitigated (according to the documentation of MySQL) by applying restarts (with exponential back-off) in #12046. This should help with beta-testing of Airlfow by the users and to see if the retries on deadlock has some potential negative effects (according to the documentation - they might have if they happen frequently enough).

Some more analysis might be performed between now and 2.0.0 which might result in deciding that this issue does not need any more fixes than the current mitigation or some actual changes in the code that might result in preventing the deadlocks.

potiuk commented 4 years ago

Stacktrace for permanent storage:

=================================== FAILURES ===================================
______ TestSchedulerJob.test_scheduler_verify_pool_full_2_slots_per_task _______

self = <sqlalchemy.engine.base.Connection object at 0x7f4a588ae100>
dialect = <sqlalchemy.dialects.mysql.mysqldb.MySQLDialect_mysqldb object at 0x7f4a57fc78b0>
constructor = <bound method DefaultExecutionContext._init_compiled of <class 'sqlalchemy.dialects.mysql.mysqldb.MySQLExecutionContext_mysqldb'>>
statement = 'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expired..._dagrun, next_dagrun_create_after) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)'
parameters = ('test_scheduler_verify_pool_full_2_slots_per_task', None, 0, 0, 1, None, ...)
args = (<sqlalchemy.dialects.mysql.mysqldb.MySQLCompiler_mysqldb object at 0x7f4a58988490>, [{'concurrency': 16, 'dag_id': 'test_scheduler_verify_pool_full_2_slots_per_task', 'default_view': 'tree', 'description': None, ...}])
conn = <sqlalchemy.pool.base._ConnectionFairy object at 0x7f4a589cd400>
context = <sqlalchemy.dialects.mysql.mysqldb.MySQLExecutionContext_mysqldb object at 0x7f4a588ae9d0>

    def _execute_context(
        self, dialect, constructor, statement, parameters, *args
    ):
        """Create an :class:`.ExecutionContext` and execute, returning
        a :class:`_engine.ResultProxy`."""

        try:
            try:
                conn = self.__connection
            except AttributeError:
                # escape "except AttributeError" before revalidating
                # to prevent misleading stacktraces in Py3K
                conn = None
            if conn is None:
                conn = self._revalidate_connection()

            context = constructor(dialect, self, conn, *args)
        except BaseException as e:
            self._handle_dbapi_exception(
                e, util.text_type(statement), parameters, None, None
            )

        if context.compiled:
            context.pre_exec()

        cursor, statement, parameters = (
            context.cursor,
            context.statement,
            context.parameters,
        )

        if not context.executemany:
            parameters = parameters[0]

        if self._has_events or self.engine._has_events:
            for fn in self.dispatch.before_cursor_execute:
                statement, parameters = fn(
                    self,
                    cursor,
                    statement,
                    parameters,
                    context,
                    context.executemany,
                )

        if self._echo:
            self.engine.logger.info(statement)
            if not self.engine.hide_parameters:
                self.engine.logger.info(
                    "%r",
                    sql_util._repr_params(
                        parameters, batches=10, ismulti=context.executemany
                    ),
                )
            else:
                self.engine.logger.info(
                    "[SQL parameters hidden due to hide_parameters=True]"
                )

        evt_handled = False
        try:
            if context.executemany:
                if self.dialect._has_events:
                    for fn in self.dialect.dispatch.do_executemany:
                        if fn(cursor, statement, parameters, context):
                            evt_handled = True
                            break
                if not evt_handled:
                    self.dialect.do_executemany(
                        cursor, statement, parameters, context
                    )
            elif not parameters and context.no_parameters:
                if self.dialect._has_events:
                    for fn in self.dialect.dispatch.do_execute_no_params:
                        if fn(cursor, statement, context):
                            evt_handled = True
                            break
                if not evt_handled:
                    self.dialect.do_execute_no_params(
                        cursor, statement, context
                    )
            else:
                if self.dialect._has_events:
                    for fn in self.dialect.dispatch.do_execute:
                        if fn(cursor, statement, parameters, context):
                            evt_handled = True
                            break
                if not evt_handled:
>                   self.dialect.do_execute(
                        cursor, statement, parameters, context
                    )

/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py:1277: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sqlalchemy.dialects.mysql.mysqldb.MySQLDialect_mysqldb object at 0x7f4a57fc78b0>
cursor = <MySQLdb.cursors.Cursor object at 0x7f4a588ae2e0>
statement = 'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expired..._dagrun, next_dagrun_create_after) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)'
parameters = ('test_scheduler_verify_pool_full_2_slots_per_task', None, 0, 0, 1, None, ...)
context = <sqlalchemy.dialects.mysql.mysqldb.MySQLExecutionContext_mysqldb object at 0x7f4a588ae9d0>

    def do_execute(self, cursor, statement, parameters, context=None):
>       cursor.execute(statement, parameters)

/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py:593: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <MySQLdb.cursors.Cursor object at 0x7f4a588ae2e0>
query = b'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expire... {\\"days\\": 1, \\"seconds\\": 0, \\"microseconds\\": 0}}\', 16, 0, \'2016-01-01 00:00:00\', \'2016-01-02 00:00:00\')'
args = ("'test_scheduler_verify_pool_full_2_slots_per_task'", 'NULL', '0', '0', '1', 'NULL', ...)

    def execute(self, query, args=None):
        """Execute a query.

        query -- string, query to execute on server
        args -- optional sequence or mapping, parameters to use with query.

        Note: If args is a sequence, then %s must be used as the
        parameter placeholder in the query. If a mapping is used,
        %(key)s must be used as the placeholder.

        Returns integer represents rows affected, if any
        """
        while self.nextset():
            pass
        db = self._get_db()

        # NOTE:
        # Python 2: query should be bytes when executing %.
        # All unicode in args should be encoded to bytes on Python 2.
        # Python 3: query should be str (unicode) when executing %.
        # All bytes in args should be decoded with ascii and surrogateescape on Python 3.
        # db.literal(obj) always returns str.

        if PY2 and isinstance(query, unicode):
            query = query.encode(db.encoding)

        if args is not None:
            if isinstance(args, dict):
                args = dict((key, db.literal(item)) for key, item in args.items())
            else:
                args = tuple(map(db.literal, args))
            if not PY2 and isinstance(query, (bytes, bytearray)):
                query = query.decode(db.encoding)
            try:
                query = query % args
            except TypeError as m:
                self.errorhandler(self, ProgrammingError, str(m))

        if isinstance(query, unicode):
            query = query.encode(db.encoding, 'surrogateescape')

        res = None
        try:
            res = self._query(query)
        except Exception:
            exc, value = sys.exc_info()[:2]
>           self.errorhandler(self, exc, value)

/usr/local/lib/python3.8/site-packages/MySQLdb/cursors.py:255: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

errorclass = <class '_mysql_exceptions.OperationalError'>
errorvalue = OperationalError(1213, 'Deadlock found when trying to get lock; try restarting transaction')

    def defaulterrorhandler(connection, cursor, errorclass, errorvalue):
        """
        If cursor is not None, (errorclass, errorvalue) is appended to
        cursor.messages; otherwise it is appended to
        connection.messages. Then errorclass is raised with errorvalue as
        the value.

        You can override this with your own error handler by assigning it
        to the instance.
        """
        error = errorclass, errorvalue
        if cursor:
            cursor.messages.append(error)
        else:
            connection.messages.append(error)
        del cursor
        del connection
        if isinstance(errorvalue, BaseException):
>           raise errorvalue

/usr/local/lib/python3.8/site-packages/MySQLdb/connections.py:50: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <MySQLdb.cursors.Cursor object at 0x7f4a588ae2e0>
query = b'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expire... {\\"days\\": 1, \\"seconds\\": 0, \\"microseconds\\": 0}}\', 16, 0, \'2016-01-01 00:00:00\', \'2016-01-02 00:00:00\')'
args = ("'test_scheduler_verify_pool_full_2_slots_per_task'", 'NULL', '0', '0', '1', 'NULL', ...)

    def execute(self, query, args=None):
        """Execute a query.

        query -- string, query to execute on server
        args -- optional sequence or mapping, parameters to use with query.

        Note: If args is a sequence, then %s must be used as the
        parameter placeholder in the query. If a mapping is used,
        %(key)s must be used as the placeholder.

        Returns integer represents rows affected, if any
        """
        while self.nextset():
            pass
        db = self._get_db()

        # NOTE:
        # Python 2: query should be bytes when executing %.
        # All unicode in args should be encoded to bytes on Python 2.
        # Python 3: query should be str (unicode) when executing %.
        # All bytes in args should be decoded with ascii and surrogateescape on Python 3.
        # db.literal(obj) always returns str.

        if PY2 and isinstance(query, unicode):
            query = query.encode(db.encoding)

        if args is not None:
            if isinstance(args, dict):
                args = dict((key, db.literal(item)) for key, item in args.items())
            else:
                args = tuple(map(db.literal, args))
            if not PY2 and isinstance(query, (bytes, bytearray)):
                query = query.decode(db.encoding)
            try:
                query = query % args
            except TypeError as m:
                self.errorhandler(self, ProgrammingError, str(m))

        if isinstance(query, unicode):
            query = query.encode(db.encoding, 'surrogateescape')

        res = None
        try:
>           res = self._query(query)

/usr/local/lib/python3.8/site-packages/MySQLdb/cursors.py:252: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <MySQLdb.cursors.Cursor object at 0x7f4a588ae2e0>
q = b'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expire... {\\"days\\": 1, \\"seconds\\": 0, \\"microseconds\\": 0}}\', 16, 0, \'2016-01-01 00:00:00\', \'2016-01-02 00:00:00\')'

    def _query(self, q):
        db = self._get_db()
        self._result = None
>       db.query(q)

/usr/local/lib/python3.8/site-packages/MySQLdb/cursors.py:378: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <_mysql.connection open to 'mysql' at 55ec2a735e70>
query = b'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expire... {\\"days\\": 1, \\"seconds\\": 0, \\"microseconds\\": 0}}\', 16, 0, \'2016-01-01 00:00:00\', \'2016-01-02 00:00:00\')'

    def query(self, query):
        # Since _mysql releases GIL while querying, we need immutable buffer.
        if isinstance(query, bytearray):
            query = bytes(query)
        if self.waiter is not None:
            self.send_query(query)
            self.waiter(self.fileno())
            self.read_query_result()
        else:
>           _mysql.connection.query(self, query)
E           _mysql_exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')

/usr/local/lib/python3.8/site-packages/MySQLdb/connections.py:280: OperationalError

The above exception was the direct cause of the following exception:

self = <tests.jobs.test_scheduler_job.TestSchedulerJob testMethod=test_scheduler_verify_pool_full_2_slots_per_task>

    def test_scheduler_verify_pool_full_2_slots_per_task(self):
        """
        Test task instances not queued when pool is full.

        Variation with non-default pool_slots
        """
        dag = DAG(
            dag_id='test_scheduler_verify_pool_full_2_slots_per_task',
            start_date=DEFAULT_DATE)

        BashOperator(
            task_id='dummy',
            dag=dag,
            owner='airflow',
            pool='test_scheduler_verify_pool_full_2_slots_per_task',
            pool_slots=2,
            bash_command='echo hi',
        )

        dagbag = DagBag(dag_folder=os.path.join(settings.DAGS_FOLDER, "no_dags.py"),
                        include_examples=False,
                        read_dags_from_db=True)
        dagbag.bag_dag(dag=dag, root_dag=dag)
>       dagbag.sync_to_db()

tests/jobs/test_scheduler_job.py:2576: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
airflow/utils/session.py:65: in wrapper
    return func(*args, **kwargs)
airflow/models/dagbag.py:535: in sync_to_db
    DAG.bulk_write_to_db(self.dags.values(), session=session)
airflow/utils/session.py:61: in wrapper
    return func(*args, **kwargs)
airflow/models/dag.py:1791: in bulk_write_to_db
    session.flush()
/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py:2523: in flush
    self._flush(objects)
/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py:2664: in _flush
    transaction.rollback(_capture_exception=True)
/usr/local/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py:68: in __exit__
    compat.raise_(
/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py:178: in raise_
    raise exception
/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py:2624: in _flush
    flush_context.execute()
/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py:422: in execute
    rec.execute(self)
/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py:586: in execute
    persistence.save_obj(
/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py:239: in save_obj
    _emit_insert_statements(
/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py:1083: in _emit_insert_statements
    c = cached_connections[connection].execute(statement, multiparams)
/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py:1014: in execute
    return meth(self, multiparams, params)
/usr/local/lib/python3.8/site-packages/sqlalchemy/sql/elements.py:298: in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py:1127: in _execute_clauseelement
    ret = self._execute_context(
/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py:1317: in _execute_context
    self._handle_dbapi_exception(
/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py:1511: in _handle_dbapi_exception
    util.raise_(
/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py:178: in raise_
    raise exception
/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py:1277: in _execute_context
    self.dialect.do_execute(
/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py:593: in do_execute
    cursor.execute(statement, parameters)
/usr/local/lib/python3.8/site-packages/MySQLdb/cursors.py:255: in execute
    self.errorhandler(self, exc, value)
/usr/local/lib/python3.8/site-packages/MySQLdb/connections.py:50: in defaulterrorhandler
    raise errorvalue
/usr/local/lib/python3.8/site-packages/MySQLdb/cursors.py:252: in execute
    res = self._query(query)
/usr/local/lib/python3.8/site-packages/MySQLdb/cursors.py:378: in _query
    db.query(q)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <_mysql.connection open to 'mysql' at 55ec2a735e70>
query = b'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expire... {\\"days\\": 1, \\"seconds\\": 0, \\"microseconds\\": 0}}\', 16, 0, \'2016-01-01 00:00:00\', \'2016-01-02 00:00:00\')'

    def query(self, query):
        # Since _mysql releases GIL while querying, we need immutable buffer.
        if isinstance(query, bytearray):
            query = bytes(query)
        if self.waiter is not None:
            self.send_query(query)
            self.waiter(self.fileno())
            self.read_query_result()
        else:
>           _mysql.connection.query(self, query)
E           sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
E           [SQL: INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expired, scheduler_lock, pickle_id, fileloc, owners, description, default_view, schedule_interval, concurrency, has_task_concurrency_limits, next_dagrun, next_dagrun_create_after) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)]
E           [parameters: ('test_scheduler_verify_pool_full_2_slots_per_task', None, 0, 0, 1, None, None, None, None, None, '/opt/airflow/tests/jobs/test_scheduler_job.py', 'airflow', None, 'tree', '{"type": "timedelta", "attrs": {"days": 1, "seconds": 0, "microseconds": 0}}', 16, 0, datetime.datetime(2016, 1, 1, 0, 0), datetime.datetime(2016, 1, 2, 0, 0))]
E           (Background on this error at: http://sqlalche.me/e/13/e3q8)

/usr/local/lib/python3.8/site-packages/MySQLdb/connections.py:280: OperationalError
----------------------------- Captured stdout call -----------------------------
[2020-10-15 09:35:59,178] {dag.py:1708} INFO - Sync 1 DAGs
[2020-10-15 09:35:59,185] {dag.py:1727} INFO - Creating ORM DAG for test_scheduler_verify_pool_full_2_slots_per_task
[2020-10-15 09:35:59,192] {dag.py:2151} INFO - Setting next_dagrun for test_scheduler_verify_pool_full_2_slots_per_task to 2016-01-01 00:00:00+00:00
------------------------------ Captured log call -------------------------------
INFO     airflow.models.dag:dag.py:1708 Sync 1 DAGs
INFO     airflow.models.dag:dag.py:1727 Creating ORM DAG for test_scheduler_verify_pool_full_2_slots_per_task
INFO     airflow.models.dag:dag.py:2151 Setting next_dagrun for test_scheduler_verify_pool_full_2_slots_per_task to 2016-01-01 00:00:00+00:00
--------------------------- Captured stderr teardown ---------------------------
Process DagFileProcessor0-Process:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/airflow/airflow/jobs/scheduler_job.py", line 186, in _run_file_processor
    result_channel.send(result)
  File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
potiuk commented 4 years ago

cc: @ashb @kaxil - happy to help with analysing and solving it next week if needed

kaxil commented 4 years ago

oh yea, we retry this code 3 times in the actual code: https://github.com/apache/airflow/blob/3447b55ba57a06c3820d1f754835e7d7f9a1fc68/airflow/jobs/scheduler_job.py#L686-L701 -- maybe we can use the same re-try logic in the test too

potiuk commented 4 years ago

I'd love to get to the bottom of the problem. Retrying is really a workaround for the real problem in those cases, I believe.

ashb commented 4 years ago

I would love a better fix - we did this as it's what mysql recommend in their own docs!

potiuk commented 4 years ago

We should leave it for now but I willl change the mysql server configuration in tests and dump some logs so that when there is a deadlock next time we will know exactly where the deadlock came from - maybe then it should be easier to find out what needs to be fixed.

I think the advise in the docs is the "if you do not know how to fix it in the architecture of your DB/Queries you can use it as a workaround" type of advice.

potiuk commented 4 years ago

PR #11614 with dumping logs from all containers and setting mysql to print all the deadlock information in the logs

kaxil commented 4 years ago

Always be prepared to re-issue a transaction if it fails due to deadlock. Deadlocks are not dangerous. Just try again.

https://dev.mysql.com/doc/refman/8.0/en/innodb-deadlocks-handling.html

potiuk commented 4 years ago

Surely we can leave with retrying (likely). It really depends how frequent the deadlocks will be happening in real-live scenarios (this is exactly what the MySQL documentation says if you read a bit earlier:

but they are not dangerous unless they are so frequent that you cannot run certain transactions at all.

If we accept the deadlocks then under heavy stress the retry scenario might even lead to into the situation of continuous deadlock and either no possibility to do any transaction or very slow rate (been there, done that).

But If we can track and fix the problem, this might be a better solution. And this is exactly what we can try to do in the months to come. And it also follows the advice from the same doc (this is exactly what I did in PR #11614).

If frequent deadlock warnings cause concern, collect more extensive debugging information by enabling the innodb_print_all_deadlocks configuration option. Information about each deadlock, not just the latest one, is recorded in the MySQL error log. Disable this option when you are finished debugging.

Since it is happening in unit tests from time to time, I think we have a perfect opportunity to track this down BEFORE some of our users have to cope with it in production environment. I think it's worth trying.

potiuk commented 4 years ago

For me the most important thing is that we do not know how often it will happen in the wild, and we have a cheap way of tracking it down, so why not trying?

ashb commented 4 years ago

The ones I've seen have only been when inserting a new route in to the DAG table which is a fairly rare event, and that's the only place we have put the retry in (dagbag.sync_to_db)

potiuk commented 4 years ago

Still I think it is nice to track it down when it happens, so I think enabling deadlock reporting in CI + uploading the logs as artifacts + not retrying in unit tests (#11614) is the best way forward don't you think ? It costs us an occasional (very rare) hopefully case where we will have deadlock in uni tests + all the means to verify and if someone has time maybe even track down the root cause and fix it.

potiuk commented 3 years ago

We have the first Deadlock with more information (on my PR): https://github.com/apache/airflow/pull/11659/checks?check_run_id=1286626048

docker logs docker-compose_mysql_1
Warning: 1T12:13:45.327151Z 0 [Warning] InnoDB: New log files created, LSN=45790
Warning: 1T12:13:45.529998Z 0 [Warning] InnoDB: Creating foreign key constraint system tables.
Warning: 1T12:13:45.536107Z 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: ddfdefd0-1396-11eb-b63d-0242ac120002.
Warning: 1T12:13:45.536994Z 0 [Warning] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
Warning: 1T12:13:46.395368Z 0 [Warning] CA certificate ca.pem is self signed.
Warning: 1T12:13:46.676996Z 1 [Warning] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.
Warning: 1T12:13:48.763110Z 0 [Warning] Insecure configuration for --secure-file-priv: Data directory is accessible through --secure-file-priv. Consider choosing a different directory.
Warning: 1T12:13:48.763169Z 0 [Warning] Insecure configuration for --secure-file-priv: Location is accessible to all OS users. Consider choosing a different directory.
2020-10-21T12:13:48.763199Z 0 [Note] mysqld (mysqld 5.7.32) starting as process 78 ...
2020-10-21T12:13:48.766314Z 0 [Note] InnoDB: PUNCH HOLE support available
2020-10-21T12:13:48.766333Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2020-10-21T12:13:48.766337Z 0 [Note] InnoDB: Uses event mutexes
2020-10-21T12:13:48.766340Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2020-10-21T12:13:48.766342Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2020-10-21T12:13:48.766345Z 0 [Note] InnoDB: Using Linux native AIO
2020-10-21T12:13:48.766602Z 0 [Note] InnoDB: Number of pools: 1
2020-10-21T12:13:48.766721Z 0 [Note] InnoDB: Using CPU crc32 instructions
2020-10-21T12:13:48.768615Z 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2020-10-21T12:13:48.777325Z 0 [Note] InnoDB: Completed initialization of buffer pool
2020-10-21T12:13:48.780120Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2020-10-21T12:13:48.792424Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2020-10-21T12:13:48.799564Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2020-10-21T12:13:48.799614Z 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2020-10-21T12:13:48.912728Z 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2020-10-21T12:13:48.913728Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2020-10-21T12:13:48.913751Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2020-10-21T12:13:48.914971Z 0 [Note] InnoDB: 5.7.32 started; log sequence number 2747371
2020-10-21T12:13:48.915829Z 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2020-10-21T12:13:48.916133Z 0 [Note] Plugin 'FEDERATED' is disabled.
2020-10-21T12:13:48.917666Z 0 [Note] InnoDB: Buffer pool(s) load completed at 201021 12:13:48
2020-10-21T12:13:48.922794Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
2020-10-21T12:13:48.922812Z 0 [Note] Skipping generation of SSL certificates as certificate files are present in data directory.
Warning: 1T12:13:48.923346Z 0 [Warning] CA certificate ca.pem is self signed.
2020-10-21T12:13:48.923378Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
Warning: 1T12:13:48.924897Z 0 [Warning] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
2020-10-21T12:13:48.932039Z 0 [Note] Event Scheduler: Loaded 0 events
2020-10-21T12:13:48.932212Z 0 [Note] mysqld: ready for connections.
Version: '5.7.32'  socket: '/var/run/mysqld/mysqld.sock'  port: 0  MySQL Community Server (GPL)
Warning: Unable to load '/usr/share/zoneinfo/iso3166.tab' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/zone.tab' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/zone1970.tab' as time zone. Skipping it.
2020-10-21T12:13:51.958028Z 0 [Note] Giving 0 client threads a chance to die gracefully
2020-10-21T12:13:51.958054Z 0 [Note] Shutting down slave threads
2020-10-21T12:13:51.958058Z 0 [Note] Forcefully disconnecting 0 remaining clients
2020-10-21T12:13:51.958063Z 0 [Note] Event Scheduler: Purging the queue. 0 events
2020-10-21T12:13:51.958104Z 0 [Note] Binlog end
2020-10-21T12:13:51.958564Z 0 [Note] Shutting down plugin 'ngram'
2020-10-21T12:13:51.958577Z 0 [Note] Shutting down plugin 'partition'
2020-10-21T12:13:51.958580Z 0 [Note] Shutting down plugin 'BLACKHOLE'
2020-10-21T12:13:51.958583Z 0 [Note] Shutting down plugin 'ARCHIVE'
2020-10-21T12:13:51.958585Z 0 [Note] Shutting down plugin 'PERFORMANCE_SCHEMA'
2020-10-21T12:13:51.958612Z 0 [Note] Shutting down plugin 'MRG_MYISAM'
2020-10-21T12:13:51.958620Z 0 [Note] Shutting down plugin 'MyISAM'
2020-10-21T12:13:51.958626Z 0 [Note] Shutting down plugin 'INNODB_SYS_VIRTUAL'
2020-10-21T12:13:51.958628Z 0 [Note] Shutting down plugin 'INNODB_SYS_DATAFILES'
2020-10-21T12:13:51.958631Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLESPACES'
2020-10-21T12:13:51.958633Z 0 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN_COLS'
2020-10-21T12:13:51.958636Z 0 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN'
2020-10-21T12:13:51.958638Z 0 [Note] Shutting down plugin 'INNODB_SYS_FIELDS'
2020-10-21T12:13:51.958641Z 0 [Note] Shutting down plugin 'INNODB_SYS_COLUMNS'
2020-10-21T12:13:51.958643Z 0 [Note] Shutting down plugin 'INNODB_SYS_INDEXES'
2020-10-21T12:13:51.958645Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLESTATS'
2020-10-21T12:13:51.958648Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLES'
2020-10-21T12:13:51.958650Z 0 [Note] Shutting down plugin 'INNODB_FT_INDEX_TABLE'
2020-10-21T12:13:51.958652Z 0 [Note] Shutting down plugin 'INNODB_FT_INDEX_CACHE'
2020-10-21T12:13:51.958654Z 0 [Note] Shutting down plugin 'INNODB_FT_CONFIG'
2020-10-21T12:13:51.958656Z 0 [Note] Shutting down plugin 'INNODB_FT_BEING_DELETED'
2020-10-21T12:13:51.958659Z 0 [Note] Shutting down plugin 'INNODB_FT_DELETED'
2020-10-21T12:13:51.958661Z 0 [Note] Shutting down plugin 'INNODB_FT_DEFAULT_STOPWORD'
2020-10-21T12:13:51.958663Z 0 [Note] Shutting down plugin 'INNODB_METRICS'
2020-10-21T12:13:51.958665Z 0 [Note] Shutting down plugin 'INNODB_TEMP_TABLE_INFO'
2020-10-21T12:13:51.958667Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_POOL_STATS'
2020-10-21T12:13:51.958670Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE_LRU'
2020-10-21T12:13:51.958672Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE'
2020-10-21T12:13:51.958674Z 0 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX_RESET'
2020-10-21T12:13:51.958677Z 0 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX'
2020-10-21T12:13:51.958679Z 0 [Note] Shutting down plugin 'INNODB_CMPMEM_RESET'
2020-10-21T12:13:51.958681Z 0 [Note] Shutting down plugin 'INNODB_CMPMEM'
2020-10-21T12:13:51.958683Z 0 [Note] Shutting down plugin 'INNODB_CMP_RESET'
2020-10-21T12:13:51.958685Z 0 [Note] Shutting down plugin 'INNODB_CMP'
2020-10-21T12:13:51.958687Z 0 [Note] Shutting down plugin 'INNODB_LOCK_WAITS'
2020-10-21T12:13:51.958690Z 0 [Note] Shutting down plugin 'INNODB_LOCKS'
2020-10-21T12:13:51.958692Z 0 [Note] Shutting down plugin 'INNODB_TRX'
2020-10-21T12:13:51.958694Z 0 [Note] Shutting down plugin 'InnoDB'
2020-10-21T12:13:51.958909Z 0 [Note] InnoDB: FTS optimize thread exiting.
2020-10-21T12:13:51.959115Z 0 [Note] InnoDB: Starting shutdown...
2020-10-21T12:13:52.059511Z 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool
2020-10-21T12:13:52.059907Z 0 [Note] InnoDB: Buffer pool(s) dump completed at 201021 12:13:52
2020-10-21T12:13:53.769699Z 0 [Note] InnoDB: Shutdown completed; log sequence number 12605548
2020-10-21T12:13:53.771313Z 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2020-10-21T12:13:53.771338Z 0 [Note] Shutting down plugin 'MEMORY'
2020-10-21T12:13:53.771343Z 0 [Note] Shutting down plugin 'CSV'
2020-10-21T12:13:53.771347Z 0 [Note] Shutting down plugin 'sha256_password'
2020-10-21T12:13:53.771349Z 0 [Note] Shutting down plugin 'mysql_native_password'
2020-10-21T12:13:53.771498Z 0 [Note] Shutting down plugin 'binlog'
2020-10-21T12:13:53.772004Z 0 [Note] mysqld: Shutdown complete

Warning: 1T12:13:54.152755Z 0 [Warning] Insecure configuration for --secure-file-priv: Data directory is accessible through --secure-file-priv. Consider choosing a different directory.
Warning: 1T12:13:54.152813Z 0 [Warning] Insecure configuration for --secure-file-priv: Location is accessible to all OS users. Consider choosing a different directory.
2020-10-21T12:13:54.152842Z 0 [Note] mysqld (mysqld 5.7.32) starting as process 1 ...
2020-10-21T12:13:54.155987Z 0 [Note] InnoDB: PUNCH HOLE support available
2020-10-21T12:13:54.156007Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2020-10-21T12:13:54.156010Z 0 [Note] InnoDB: Uses event mutexes
2020-10-21T12:13:54.156013Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2020-10-21T12:13:54.156016Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2020-10-21T12:13:54.156019Z 0 [Note] InnoDB: Using Linux native AIO
2020-10-21T12:13:54.156282Z 0 [Note] InnoDB: Number of pools: 1
2020-10-21T12:13:54.156408Z 0 [Note] InnoDB: Using CPU crc32 instructions
2020-10-21T12:13:54.158290Z 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2020-10-21T12:13:54.167097Z 0 [Note] InnoDB: Completed initialization of buffer pool
2020-10-21T12:13:54.169910Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2020-10-21T12:13:54.182337Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2020-10-21T12:13:54.189985Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2020-10-21T12:13:54.190154Z 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2020-10-21T12:13:54.319985Z 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2020-10-21T12:13:54.321025Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2020-10-21T12:13:54.321052Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2020-10-21T12:13:54.322371Z 0 [Note] InnoDB: 5.7.32 started; log sequence number 12605548
2020-10-21T12:13:54.322754Z 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2020-10-21T12:13:54.323136Z 0 [Note] Plugin 'FEDERATED' is disabled.
2020-10-21T12:13:54.327481Z 0 [Note] InnoDB: Buffer pool(s) load completed at 201021 12:13:54
2020-10-21T12:13:54.330575Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
2020-10-21T12:13:54.330595Z 0 [Note] Skipping generation of SSL certificates as certificate files are present in data directory.
Warning: 1T12:13:54.331226Z 0 [Warning] CA certificate ca.pem is self signed.
2020-10-21T12:13:54.331275Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
2020-10-21T12:13:54.331951Z 0 [Note] Server hostname (bind-address): '*'; port: 3306
2020-10-21T12:13:54.332068Z 0 [Note] IPv6 is available.
2020-10-21T12:13:54.332083Z 0 [Note]   - '::' resolves to '::';
2020-10-21T12:13:54.332113Z 0 [Note] Server socket created on IP: '::'.
Warning: 1T12:13:54.333196Z 0 [Warning] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
2020-10-21T12:13:54.341203Z 0 [Note] Event Scheduler: Loaded 0 events
2020-10-21T12:13:54.341532Z 0 [Note] mysqld: ready for connections.
Version: '5.7.32'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
2020-10-21T12:13:55.025965Z 2 [Note] Got an error reading communication packets
2020-10-21T12:15:16.775454Z 16 [Note] Aborted connection 16 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:15:31.502655Z 33 [Note] Aborted connection 33 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:15:32.112554Z 37 [Note] Aborted connection 37 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:17:18.613428Z 156 [Note] Aborted connection 156 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:17:19.993993Z 169 [Note] Aborted connection 169 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:17:23.997925Z 178 [Note] Aborted connection 178 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:17:26.428805Z 179 [Note] Aborted connection 179 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:17:28.856039Z 181 [Note] Aborted connection 181 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:17:37.234787Z 183 [Note] Aborted connection 183 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:17:45.177740Z 184 [Note] Aborted connection 184 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:17:47.611966Z 186 [Note] Aborted connection 186 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:17:50.024044Z 188 [Note] Aborted connection 188 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:17:52.451034Z 190 [Note] Aborted connection 190 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:17:54.937623Z 192 [Note] Aborted connection 192 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:17:57.348497Z 194 [Note] Aborted connection 194 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:17:59.753386Z 196 [Note] Aborted connection 196 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:18:00.877236Z 198 [Note] Aborted connection 198 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:18:03.465205Z 199 [Note] Aborted connection 199 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:18:05.882347Z 201 [Note] Aborted connection 201 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:18:08.328999Z 203 [Note] Aborted connection 203 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:18:09.446403Z 205 [Note] Aborted connection 205 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:18:10.727593Z 206 [Note] Aborted connection 206 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:18:11.971021Z 208 [Note] Aborted connection 208 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:18:13.517812Z 209 [Note] Aborted connection 209 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:18:16.500688Z 210 [Note] Aborted connection 210 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:18:19.352522Z 212 [Note] Aborted connection 212 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:18:21.279280Z 213 [Note] Aborted connection 213 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:18:21.692807Z 207 [Note] InnoDB: Transactions deadlock detected, dumping detailed information.
2020-10-21T12:18:21.692825Z 207 [Note] InnoDB: 
*** (1) TRANSACTION:

TRANSACTION 12600, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 214, OS thread handle 140147069556480, query id 33701 docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau 172.18.0.3 root update
INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expired, scheduler_lock, pickle_id, fileloc, owners, description, default_view, schedule_interval, concurrency, has_task_concurrency_limits, next_dagrun, next_dagrun_create_after) VALUES ('test_start_date_scheduling', NULL, 0, 0, 1, NULL, NULL, NULL, NULL, NULL, '/opt/airflow/tests/dags/test_scheduler_dags.py', 'airflow', NULL, 'tree', '{\"type\": \"timedelta\", \"attrs\": {\"days\": 1, \"seconds\": 0, \"microseconds\": 0}}', 16, 0, '2020-10-22 12:18:21.603795', '2020-10-23 12:18:21.603795'),('test_task_start_date_scheduling', NULL, 0, 0, 1, NULL, NULL, NULL, NULL, NULL, '/opt/airflow/tests/dags/test_scheduler_dags.py', 'airflow', NULL, 'tree', '{\"type\": \"timedelta\", \"attrs\": {\"days\": 1, \"seconds\": 0, \"microseconds\": 0}}', 16, 0, '2016-0
2020-10-21T12:18:21.692864Z 207 [Note] InnoDB: *** (1) WAITING FOR THIS LOCK TO BE GRANTED:

RECORD LOCKS space id 278 page no 3 n bits 72 index PRIMARY of table `airflow`.`dag` trx id 12600 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

2020-10-21T12:18:21.692925Z 207 [Note] InnoDB: *** (2) TRANSACTION:

TRANSACTION 12605, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 207, OS thread handle 140147069286144, query id 33704 docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau 172.18.0.3 root update
INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expired, scheduler_lock, pickle_id, fileloc, owners, description, default_view, schedule_interval, concurrency, has_task_concurrency_limits, next_dagrun, next_dagrun_create_after) VALUES ('test_verify_integrity_if_dag_changed', NULL, 0, 0, 1, NULL, NULL, NULL, NULL, NULL, '/opt/airflow/tests/jobs/test_scheduler_job.py', 'airflow', NULL, 'tree', '{\"type\": \"timedelta\", \"attrs\": {\"days\": 1, \"seconds\": 0, \"microseconds\": 0}}', 16, 0, '2016-01-01 00:00:00', '2016-01-02 00:00:00')
2020-10-21T12:18:21.692948Z 207 [Note] InnoDB: *** (2) HOLDS THE LOCK(S):

RECORD LOCKS space id 278 page no 3 n bits 72 index PRIMARY of table `airflow`.`dag` trx id 12605 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

2020-10-21T12:18:21.692988Z 207 [Note] InnoDB: *** (2) WAITING FOR THIS LOCK TO BE GRANTED:

RECORD LOCKS space id 278 page no 3 n bits 72 index PRIMARY of table `airflow`.`dag` trx id 12605 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

2020-10-21T12:18:21.693028Z 207 [Note] InnoDB: *** WE ROLL BACK TRANSACTION (2)

2020-10-21T12:19:05.564894Z 280 [Note] Aborted connection 280 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:19:08.107753Z 283 [Note] Aborted connection 283 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:19:08.444137Z 285 [Note] Aborted connection 285 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:19:35.741444Z 291 [Note] Aborted connection 291 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
2020-10-21T12:19:35.741517Z 41 [Note] Aborted connection 41 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_c1e4b5c82d6d.docker-compose_defau' (Got an error reading communication packets)
                   Container docker-compose_mysql_1 logs dumped to /home/runner/work/airflow/airflow/files/container_logs_docker-compose_mysql_1_2020-10-21_319791128_tests-mysql.log
potiuk commented 3 years ago

@ashb @kaxil ^^ weekend to take a look :)

ashb commented 3 years ago

If I'm reading that trace right, the deadlock is caused by two processes trying to INSERT in to dag table at the same time.

Statement that died was inserting test_start_date_scheduling (TRANSACTION 12600)

Lock was held by a txn inserting test_verify_integrity_if_dag_changed (TRANSACTION 12605)

Given that it's not a select/lock query causing the lock, I think that retry this insert in case of deadlock might be the only option.

Also: Dear MySQL: that is not a DEAD lock. No love, DB theory people.

potiuk commented 3 years ago

I have not yet looked at this, but I will have a cloer look during the weekend.

From my past investigation It's not always straightforward

In the meantime, I also dump another case.

https://github.com/apache/airflow/runs/1294560665:

###########################################################################################
                   Dumping logs from docker-compose_mysql_1 container
###########################################################################################
Warning: 2T19:48:16.529889Z 0 [Warning] InnoDB: New log files created, LSN=45790
Warning: 2T19:48:16.776700Z 0 [Warning] InnoDB: Creating foreign key constraint system tables.
Warning: 2T19:48:16.832316Z 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 875d400c-149f-11eb-8b8e-0242ac120002.
Warning: 2T19:48:16.833133Z 0 [Warning] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
Warning: 2T19:48:17.408204Z 0 [Warning] CA certificate ca.pem is self signed.
Warning: 2T19:48:17.659702Z 1 [Warning] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.
Warning: 2T19:48:19.915428Z 0 [Warning] Insecure configuration for --secure-file-priv: Data directory is accessible through --secure-file-priv. Consider choosing a different directory.
Warning: 2T19:48:19.915523Z 0 [Warning] Insecure configuration for --secure-file-priv: Location is accessible to all OS users. Consider choosing a different directory.
2020-10-22T19:48:19.915566Z 0 [Note] mysqld (mysqld 5.7.32) starting as process 78 ...
2020-10-22T19:48:19.918410Z 0 [Note] InnoDB: PUNCH HOLE support available
2020-10-22T19:48:19.918431Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2020-10-22T19:48:19.918433Z 0 [Note] InnoDB: Uses event mutexes
2020-10-22T19:48:19.918435Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2020-10-22T19:48:19.918437Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2020-10-22T19:48:19.918439Z 0 [Note] InnoDB: Using Linux native AIO
2020-10-22T19:48:19.918653Z 0 [Note] InnoDB: Number of pools: 1
2020-10-22T19:48:19.918755Z 0 [Note] InnoDB: Using CPU crc32 instructions
2020-10-22T19:48:19.920490Z 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2020-10-22T19:48:19.928634Z 0 [Note] InnoDB: Completed initialization of buffer pool
2020-10-22T19:48:19.930799Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2020-10-22T19:48:19.943073Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2020-10-22T19:48:19.950204Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2020-10-22T19:48:19.950327Z 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2020-10-22T19:48:20.086257Z 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2020-10-22T19:48:20.086711Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2020-10-22T19:48:20.086724Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2020-10-22T19:48:20.087569Z 0 [Note] InnoDB: Waiting for purge to start
2020-10-22T19:48:20.137709Z 0 [Note] InnoDB: 5.7.32 started; log sequence number 2746702
2020-10-22T19:48:20.137942Z 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2020-10-22T19:48:20.138059Z 0 [Note] Plugin 'FEDERATED' is disabled.
2020-10-22T19:48:20.139479Z 0 [Note] InnoDB: Buffer pool(s) load completed at 201022 19:48:20
2020-10-22T19:48:20.142935Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
2020-10-22T19:48:20.142951Z 0 [Note] Skipping generation of SSL certificates as certificate files are present in data directory.
Warning: 2T19:48:20.143330Z 0 [Warning] CA certificate ca.pem is self signed.
2020-10-22T19:48:20.143361Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
Warning: 2T19:48:20.144650Z 0 [Warning] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
2020-10-22T19:48:20.151260Z 0 [Note] Event Scheduler: Loaded 0 events
2020-10-22T19:48:20.151411Z 0 [Note] mysqld: ready for connections.
Version: '5.7.32'  socket: '/var/run/mysqld/mysqld.sock'  port: 0  MySQL Community Server (GPL)
Warning: Unable to load '/usr/share/zoneinfo/iso3166.tab' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/zone.tab' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/zone1970.tab' as time zone. Skipping it.
2020-10-22T19:48:23.092691Z 0 [Note] Giving 0 client threads a chance to die gracefully
2020-10-22T19:48:23.092712Z 0 [Note] Shutting down slave threads
2020-10-22T19:48:23.092715Z 0 [Note] Forcefully disconnecting 0 remaining clients
2020-10-22T19:48:23.092719Z 0 [Note] Event Scheduler: Purging the queue. 0 events
2020-10-22T19:48:23.092752Z 0 [Note] Binlog end
2020-10-22T19:48:23.093120Z 0 [Note] Shutting down plugin 'ngram'
2020-10-22T19:48:23.093132Z 0 [Note] Shutting down plugin 'partition'
2020-10-22T19:48:23.093134Z 0 [Note] Shutting down plugin 'BLACKHOLE'
2020-10-22T19:48:23.093137Z 0 [Note] Shutting down plugin 'ARCHIVE'
2020-10-22T19:48:23.093138Z 0 [Note] Shutting down plugin 'PERFORMANCE_SCHEMA'
2020-10-22T19:48:23.093162Z 0 [Note] Shutting down plugin 'MRG_MYISAM'
2020-10-22T19:48:23.093164Z 0 [Note] Shutting down plugin 'MyISAM'
2020-10-22T19:48:23.093169Z 0 [Note] Shutting down plugin 'INNODB_SYS_VIRTUAL'
2020-10-22T19:48:23.093171Z 0 [Note] Shutting down plugin 'INNODB_SYS_DATAFILES'
2020-10-22T19:48:23.093173Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLESPACES'
2020-10-22T19:48:23.093175Z 0 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN_COLS'
2020-10-22T19:48:23.093176Z 0 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN'
2020-10-22T19:48:23.093178Z 0 [Note] Shutting down plugin 'INNODB_SYS_FIELDS'
2020-10-22T19:48:23.093179Z 0 [Note] Shutting down plugin 'INNODB_SYS_COLUMNS'
2020-10-22T19:48:23.093181Z 0 [Note] Shutting down plugin 'INNODB_SYS_INDEXES'
2020-10-22T19:48:23.093183Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLESTATS'
2020-10-22T19:48:23.093184Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLES'
2020-10-22T19:48:23.093186Z 0 [Note] Shutting down plugin 'INNODB_FT_INDEX_TABLE'
2020-10-22T19:48:23.093187Z 0 [Note] Shutting down plugin 'INNODB_FT_INDEX_CACHE'
2020-10-22T19:48:23.093189Z 0 [Note] Shutting down plugin 'INNODB_FT_CONFIG'
2020-10-22T19:48:23.093190Z 0 [Note] Shutting down plugin 'INNODB_FT_BEING_DELETED'
2020-10-22T19:48:23.093192Z 0 [Note] Shutting down plugin 'INNODB_FT_DELETED'
2020-10-22T19:48:23.093194Z 0 [Note] Shutting down plugin 'INNODB_FT_DEFAULT_STOPWORD'
2020-10-22T19:48:23.093196Z 0 [Note] Shutting down plugin 'INNODB_METRICS'
2020-10-22T19:48:23.093198Z 0 [Note] Shutting down plugin 'INNODB_TEMP_TABLE_INFO'
2020-10-22T19:48:23.093200Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_POOL_STATS'
2020-10-22T19:48:23.093202Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE_LRU'
2020-10-22T19:48:23.093203Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE'
2020-10-22T19:48:23.093205Z 0 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX_RESET'
2020-10-22T19:48:23.093207Z 0 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX'
2020-10-22T19:48:23.093208Z 0 [Note] Shutting down plugin 'INNODB_CMPMEM_RESET'
2020-10-22T19:48:23.093210Z 0 [Note] Shutting down plugin 'INNODB_CMPMEM'
2020-10-22T19:48:23.093212Z 0 [Note] Shutting down plugin 'INNODB_CMP_RESET'
2020-10-22T19:48:23.093214Z 0 [Note] Shutting down plugin 'INNODB_CMP'
2020-10-22T19:48:23.093216Z 0 [Note] Shutting down plugin 'INNODB_LOCK_WAITS'
2020-10-22T19:48:23.093217Z 0 [Note] Shutting down plugin 'INNODB_LOCKS'
2020-10-22T19:48:23.093219Z 0 [Note] Shutting down plugin 'INNODB_TRX'
2020-10-22T19:48:23.093221Z 0 [Note] Shutting down plugin 'InnoDB'
2020-10-22T19:48:23.093250Z 0 [Note] InnoDB: FTS optimize thread exiting.
2020-10-22T19:48:23.093358Z 0 [Note] InnoDB: Starting shutdown...
2020-10-22T19:48:23.193551Z 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool
2020-10-22T19:48:23.193829Z 0 [Note] InnoDB: Buffer pool(s) dump completed at 201022 19:48:23
2020-10-22T19:48:24.901298Z 0 [Note] InnoDB: Shutdown completed; log sequence number 12604863
2020-10-22T19:48:24.902537Z 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2020-10-22T19:48:24.902554Z 0 [Note] Shutting down plugin 'MEMORY'
2020-10-22T19:48:24.902559Z 0 [Note] Shutting down plugin 'CSV'
2020-10-22T19:48:24.902562Z 0 [Note] Shutting down plugin 'sha256_password'
2020-10-22T19:48:24.902564Z 0 [Note] Shutting down plugin 'mysql_native_password'
2020-10-22T19:48:24.902670Z 0 [Note] Shutting down plugin 'binlog'
2020-10-22T19:48:24.903053Z 0 [Note] mysqld: Shutdown complete

Warning: 2T19:48:25.273019Z 0 [Warning] Insecure configuration for --secure-file-priv: Data directory is accessible through --secure-file-priv. Consider choosing a different directory.
Warning: 2T19:48:25.273068Z 0 [Warning] Insecure configuration for --secure-file-priv: Location is accessible to all OS users. Consider choosing a different directory.
2020-10-22T19:48:25.273087Z 0 [Note] mysqld (mysqld 5.7.32) starting as process 1 ...
2020-10-22T19:48:25.275477Z 0 [Note] InnoDB: PUNCH HOLE support available
2020-10-22T19:48:25.275496Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2020-10-22T19:48:25.275499Z 0 [Note] InnoDB: Uses event mutexes
2020-10-22T19:48:25.275501Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2020-10-22T19:48:25.275503Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2020-10-22T19:48:25.275506Z 0 [Note] InnoDB: Using Linux native AIO
2020-10-22T19:48:25.275713Z 0 [Note] InnoDB: Number of pools: 1
2020-10-22T19:48:25.275815Z 0 [Note] InnoDB: Using CPU crc32 instructions
2020-10-22T19:48:25.277398Z 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2020-10-22T19:48:25.284153Z 0 [Note] InnoDB: Completed initialization of buffer pool
2020-10-22T19:48:25.286342Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2020-10-22T19:48:25.298559Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2020-10-22T19:48:25.304352Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2020-10-22T19:48:25.304498Z 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2020-10-22T19:48:25.429832Z 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2020-10-22T19:48:25.430265Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2020-10-22T19:48:25.430280Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2020-10-22T19:48:25.431083Z 0 [Note] InnoDB: Waiting for purge to start
2020-10-22T19:48:25.481219Z 0 [Note] InnoDB: 5.7.32 started; log sequence number 12604863
2020-10-22T19:48:25.481434Z 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2020-10-22T19:48:25.482205Z 0 [Note] Plugin 'FEDERATED' is disabled.
2020-10-22T19:48:25.484299Z 0 [Note] InnoDB: Buffer pool(s) load completed at 201022 19:48:25
2020-10-22T19:48:25.487438Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
2020-10-22T19:48:25.487454Z 0 [Note] Skipping generation of SSL certificates as certificate files are present in data directory.
Warning: 2T19:48:25.487835Z 0 [Warning] CA certificate ca.pem is self signed.
2020-10-22T19:48:25.487866Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
2020-10-22T19:48:25.488439Z 0 [Note] Server hostname (bind-address): '*'; port: 3306
2020-10-22T19:48:25.488949Z 0 [Note] IPv6 is available.
2020-10-22T19:48:25.489039Z 0 [Note]   - '::' resolves to '::';
2020-10-22T19:48:25.489146Z 0 [Note] Server socket created on IP: '::'.
Warning: 2T19:48:25.490394Z 0 [Warning] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
2020-10-22T19:48:25.496938Z 0 [Note] Event Scheduler: Loaded 0 events
2020-10-22T19:48:25.497197Z 0 [Note] mysqld: ready for connections.
Version: '5.7.32'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
2020-10-22T19:48:25.562027Z 2 [Note] Got an error reading communication packets
2020-10-22T20:00:16.184941Z 921 [Note] Aborted connection 921 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:17.542383Z 935 [Note] Aborted connection 935 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:21.137430Z 945 [Note] Aborted connection 945 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:22.114092Z 946 [Note] Aborted connection 946 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:23.384136Z 948 [Note] Aborted connection 948 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:24.393428Z 949 [Note] Aborted connection 949 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:25.641684Z 951 [Note] Aborted connection 951 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:34.445536Z 952 [Note] Aborted connection 952 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:42.282983Z 954 [Note] Aborted connection 954 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:43.259746Z 956 [Note] Aborted connection 956 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:44.529289Z 958 [Note] Aborted connection 958 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:45.533847Z 959 [Note] Aborted connection 959 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:46.799395Z 961 [Note] Aborted connection 961 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:47.811452Z 962 [Note] Aborted connection 962 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:49.091139Z 964 [Note] Aborted connection 964 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:51.372533Z 965 [Note] Aborted connection 965 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:52.313924Z 967 [Note] Aborted connection 967 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:53.538350Z 969 [Note] Aborted connection 969 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:54.508874Z 970 [Note] Aborted connection 970 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:55.757651Z 972 [Note] Aborted connection 972 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:56.688589Z 973 [Note] Aborted connection 973 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:00:59.063765Z 974 [Note] Aborted connection 974 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:01:00.035349Z 976 [Note] Aborted connection 976 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:01:01.270706Z 978 [Note] Aborted connection 978 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:01:02.235232Z 979 [Note] Aborted connection 979 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:01:03.463372Z 981 [Note] Aborted connection 981 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:01:04.393960Z 982 [Note] Aborted connection 982 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:01:06.277514Z 983 [Note] Aborted connection 983 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:01:13.338895Z 990 [Note] Aborted connection 990 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:01:17.468557Z 995 [Note] Aborted connection 995 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:01:17.720433Z 985 [Note] InnoDB: Transactions deadlock detected, dumping detailed information.
2020-10-22T20:01:17.720451Z 985 [Note] InnoDB: 
*** (1) TRANSACTION:

TRANSACTION 21364, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 996, OS thread handle 140653469202176, query id 276580 docker-compose_airflow_run_f920c240a470.docker-compose_defau 172.18.0.3 root update
INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expired, scheduler_lock, pickle_id, fileloc, owners, description, default_view, schedule_interval, concurrency, has_task_concurrency_limits, next_dagrun, next_dagrun_create_after) VALUES ('test_task_start_date_scheduling', NULL, 0, 0, 1, NULL, NULL, NULL, NULL, NULL, '/opt/airflow/tests/dags/test_scheduler_dags.py', 'airflow', NULL, 'tree', '{\"type\": \"timedelta\", \"attrs\": {\"days\": 1, \"seconds\": 0, \"microseconds\": 0}}', 16, 0, '2016-01-01 00:00:00', '2016-01-02 00:00:00'),('test_start_date_scheduling', NULL, 0, 0, 1, NULL, NULL, NULL, NULL, NULL, '/opt/airflow/tests/dags/test_scheduler_dags.py', 'airflow', NULL, 'tree', '{\"type\": \"timedelta\", \"attrs\": {\"days\": 1, \"seconds\": 0, \"microseconds\": 0}}', 16, 0, '2020-10-23 20:01:17
2020-10-22T20:01:17.720580Z 985 [Note] InnoDB: *** (1) WAITING FOR THIS LOCK TO BE GRANTED:

RECORD LOCKS space id 278 page no 3 n bits 72 index PRIMARY of table `airflow`.`dag` trx id 21364 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

2020-10-22T20:01:17.720637Z 985 [Note] InnoDB: *** (2) TRANSACTION:

TRANSACTION 21365, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 985, OS thread handle 140653470013184, query id 276581 docker-compose_airflow_run_f920c240a470.docker-compose_defau 172.18.0.3 root update
INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expired, scheduler_lock, pickle_id, fileloc, owners, description, default_view, schedule_interval, concurrency, has_task_concurrency_limits, next_dagrun, next_dagrun_create_after) VALUES ('test_scheduler_verify_priority_and_slots', NULL, 0, 0, 1, NULL, NULL, NULL, NULL, NULL, '/opt/airflow/tests/jobs/test_scheduler_job.py', 'airflow', NULL, 'tree', '{\"type\": \"timedelta\", \"attrs\": {\"days\": 1, \"seconds\": 0, \"microseconds\": 0}}', 16, 0, '2016-01-01 00:00:00', '2016-01-02 00:00:00')
2020-10-22T20:01:17.720657Z 985 [Note] InnoDB: *** (2) HOLDS THE LOCK(S):

RECORD LOCKS space id 278 page no 3 n bits 72 index PRIMARY of table `airflow`.`dag` trx id 21365 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

2020-10-22T20:01:17.720696Z 985 [Note] InnoDB: *** (2) WAITING FOR THIS LOCK TO BE GRANTED:

RECORD LOCKS space id 278 page no 3 n bits 72 index PRIMARY of table `airflow`.`dag` trx id 21365 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

2020-10-22T20:01:17.720733Z 985 [Note] InnoDB: *** WE ROLL BACK TRANSACTION (2)

2020-10-22T20:07:01.587109Z 1242 [Note] Unknown database 'foobar'
2020-10-22T20:07:01.654077Z 1245 [Note] Unknown database 'foobar'
2020-10-22T20:08:01.519675Z 1416 [Note] Aborted connection 1416 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:08:01.758318Z 1425 [Note] Aborted connection 1425 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:08:10.341602Z 1430 [Note] Aborted connection 1430 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:08:12.954263Z 1433 [Note] Aborted connection 1433 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:08:14.179306Z 1435 [Note] Aborted connection 1435 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.691830Z 1506 [Note] Aborted connection 1506 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.691830Z 1481 [Note] Aborted connection 1481 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.692096Z 1480 [Note] Aborted connection 1480 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.692410Z 1474 [Note] Aborted connection 1474 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.692492Z 1449 [Note] Aborted connection 1449 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.692618Z 1473 [Note] Aborted connection 1473 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.692764Z 1444 [Note] Aborted connection 1444 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.692780Z 1472 [Note] Aborted connection 1472 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.692967Z 1478 [Note] Aborted connection 1478 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.693082Z 1446 [Note] Aborted connection 1446 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.693098Z 1477 [Note] Aborted connection 1477 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.693229Z 1476 [Note] Aborted connection 1476 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.693323Z 1452 [Note] Aborted connection 1452 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.693362Z 1475 [Note] Aborted connection 1475 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.693514Z 1448 [Note] Aborted connection 1448 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.693575Z 1441 [Note] Aborted connection 1441 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.693839Z 1447 [Note] Aborted connection 1447 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.694051Z 1445 [Note] Aborted connection 1445 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.694174Z 1453 [Note] Aborted connection 1453 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.694210Z 1450 [Note] Aborted connection 1450 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.694229Z 9 [Note] Aborted connection 9 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.694249Z 8 [Note] Aborted connection 8 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.694261Z 7 [Note] Aborted connection 7 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.694287Z 1443 [Note] Aborted connection 1443 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.694304Z 1451 [Note] Aborted connection 1451 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.692413Z 1479 [Note] Aborted connection 1479 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
2020-10-22T20:16:24.696076Z 6 [Note] Aborted connection 6 to db: 'airflow' user: 'root' host: 'docker-compose_airflow_run_f920c240a470.docker-compose_defau' (Got an error reading communication packets)
                   Container docker-compose_mysql_1 logs dumped to /home/runner/work/airflow/airflow/files/container_logs_docker-compose_mysql_1_2020-10-22_322613028_tests-mysql.log
potiuk commented 3 years ago

Another case:

https://github.com/apache/airflow/pull/12017/checks?check_run_id=1338457193#step:6:818


____________ TestSchedulerJob.test_verify_integrity_if_dag_changed _____________

self = <sqlalchemy.engine.base.Connection object at 0x7f1fec948a90>
dialect = <sqlalchemy.dialects.mysql.mysqldb.MySQLDialect_mysqldb object at 0x7f200837d9e8>
constructor = <bound method DefaultExecutionContext._init_compiled of <class 'sqlalchemy.dialects.mysql.mysqldb.MySQLExecutionContext_mysqldb'>>
statement = 'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expired..._dagrun, next_dagrun_create_after) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)'
parameters = ('test_verify_integrity_if_dag_changed', None, 0, 0, 1, None, ...)
args = (<sqlalchemy.dialects.mysql.mysqldb.MySQLCompiler_mysqldb object at 0x7f1fecfa9e80>, [{'concurrency': 16, 'dag_id': 'test_verify_integrity_if_dag_changed', 'default_view': 'tree', 'description': None, ...}])
conn = <sqlalchemy.pool.base._ConnectionFairy object at 0x7f1fec955ba8>
context = <sqlalchemy.dialects.mysql.mysqldb.MySQLExecutionContext_mysqldb object at 0x7f1fec948160>

    def _execute_context(
        self, dialect, constructor, statement, parameters, *args
    ):
        """Create an :class:`.ExecutionContext` and execute, returning
        a :class:`_engine.ResultProxy`.

        """

        try:
            try:
                conn = self.__connection
            except AttributeError:
                # escape "except AttributeError" before revalidating
                # to prevent misleading stacktraces in Py3K
                conn = None
            if conn is None:
                conn = self._revalidate_connection()

            context = constructor(dialect, self, conn, *args)
        except BaseException as e:
            self._handle_dbapi_exception(
                e, util.text_type(statement), parameters, None, None
            )

        if context.compiled:
            context.pre_exec()

        cursor, statement, parameters = (
            context.cursor,
            context.statement,
            context.parameters,
        )

        if not context.executemany:
            parameters = parameters[0]

        if self._has_events or self.engine._has_events:
            for fn in self.dispatch.before_cursor_execute:
                statement, parameters = fn(
                    self,
                    cursor,
                    statement,
                    parameters,
                    context,
                    context.executemany,
                )

        if self._echo:
            self.engine.logger.info(statement)
            if not self.engine.hide_parameters:
                self.engine.logger.info(
                    "%r",
                    sql_util._repr_params(
                        parameters, batches=10, ismulti=context.executemany
                    ),
                )
            else:
                self.engine.logger.info(
                    "[SQL parameters hidden due to hide_parameters=True]"
                )

        evt_handled = False
        try:
            if context.executemany:
                if self.dialect._has_events:
                    for fn in self.dialect.dispatch.do_executemany:
                        if fn(cursor, statement, parameters, context):
                            evt_handled = True
                            break
                if not evt_handled:
                    self.dialect.do_executemany(
                        cursor, statement, parameters, context
                    )
            elif not parameters and context.no_parameters:
                if self.dialect._has_events:
                    for fn in self.dialect.dispatch.do_execute_no_params:
                        if fn(cursor, statement, context):
                            evt_handled = True
                            break
                if not evt_handled:
                    self.dialect.do_execute_no_params(
                        cursor, statement, context
                    )
            else:
                if self.dialect._has_events:
                    for fn in self.dialect.dispatch.do_execute:
                        if fn(cursor, statement, parameters, context):
                            evt_handled = True
                            break
                if not evt_handled:
                    self.dialect.do_execute(
>                       cursor, statement, parameters, context
                    )

/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:1277: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sqlalchemy.dialects.mysql.mysqldb.MySQLDialect_mysqldb object at 0x7f200837d9e8>
cursor = <MySQLdb.cursors.Cursor object at 0x7f1fec948630>
statement = 'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expired..._dagrun, next_dagrun_create_after) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)'
parameters = ('test_verify_integrity_if_dag_changed', None, 0, 0, 1, None, ...)
context = <sqlalchemy.dialects.mysql.mysqldb.MySQLExecutionContext_mysqldb object at 0x7f1fec948160>

    def do_execute(self, cursor, statement, parameters, context=None):
>       cursor.execute(statement, parameters)

/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py:593: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <MySQLdb.cursors.Cursor object at 0x7f1fec948630>
query = b'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expire... {\\"days\\": 1, \\"seconds\\": 0, \\"microseconds\\": 0}}\', 16, 0, \'2016-01-01 00:00:00\', \'2016-01-02 00:00:00\')'
args = ("'test_verify_integrity_if_dag_changed'", 'NULL', '0', '0', '1', 'NULL', ...)

    def execute(self, query, args=None):
        """Execute a query.

        query -- string, query to execute on server
        args -- optional sequence or mapping, parameters to use with query.

        Note: If args is a sequence, then %s must be used as the
        parameter placeholder in the query. If a mapping is used,
        %(key)s must be used as the placeholder.

        Returns integer represents rows affected, if any
        """
        while self.nextset():
            pass
        db = self._get_db()

        # NOTE:
        # Python 2: query should be bytes when executing %.
        # All unicode in args should be encoded to bytes on Python 2.
        # Python 3: query should be str (unicode) when executing %.
        # All bytes in args should be decoded with ascii and surrogateescape on Python 3.
        # db.literal(obj) always returns str.

        if PY2 and isinstance(query, unicode):
            query = query.encode(db.encoding)

        if args is not None:
            if isinstance(args, dict):
                args = dict((key, db.literal(item)) for key, item in args.items())
            else:
                args = tuple(map(db.literal, args))
            if not PY2 and isinstance(query, (bytes, bytearray)):
                query = query.decode(db.encoding)
            try:
                query = query % args
            except TypeError as m:
                self.errorhandler(self, ProgrammingError, str(m))

        if isinstance(query, unicode):
            query = query.encode(db.encoding, 'surrogateescape')

        res = None
        try:
            res = self._query(query)
        except Exception:
            exc, value = sys.exc_info()[:2]
>           self.errorhandler(self, exc, value)

/usr/local/lib/python3.6/site-packages/MySQLdb/cursors.py:255: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

errorclass = <class '_mysql_exceptions.OperationalError'>
errorvalue = OperationalError(1213, 'Deadlock found when trying to get lock; try restarting transaction')

    def defaulterrorhandler(connection, cursor, errorclass, errorvalue):
        """
        If cursor is not None, (errorclass, errorvalue) is appended to
        cursor.messages; otherwise it is appended to
        connection.messages. Then errorclass is raised with errorvalue as
        the value.

        You can override this with your own error handler by assigning it
        to the instance.
        """
        error = errorclass, errorvalue
        if cursor:
            cursor.messages.append(error)
        else:
            connection.messages.append(error)
        del cursor
        del connection
        if isinstance(errorvalue, BaseException):
>           raise errorvalue

/usr/local/lib/python3.6/site-packages/MySQLdb/connections.py:50: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <MySQLdb.cursors.Cursor object at 0x7f1fec948630>
query = b'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expire... {\\"days\\": 1, \\"seconds\\": 0, \\"microseconds\\": 0}}\', 16, 0, \'2016-01-01 00:00:00\', \'2016-01-02 00:00:00\')'
args = ("'test_verify_integrity_if_dag_changed'", 'NULL', '0', '0', '1', 'NULL', ...)

    def execute(self, query, args=None):
        """Execute a query.

        query -- string, query to execute on server
        args -- optional sequence or mapping, parameters to use with query.

        Note: If args is a sequence, then %s must be used as the
        parameter placeholder in the query. If a mapping is used,
        %(key)s must be used as the placeholder.

        Returns integer represents rows affected, if any
        """
        while self.nextset():
            pass
        db = self._get_db()

        # NOTE:
        # Python 2: query should be bytes when executing %.
        # All unicode in args should be encoded to bytes on Python 2.
        # Python 3: query should be str (unicode) when executing %.
        # All bytes in args should be decoded with ascii and surrogateescape on Python 3.
        # db.literal(obj) always returns str.

        if PY2 and isinstance(query, unicode):
            query = query.encode(db.encoding)

        if args is not None:
            if isinstance(args, dict):
                args = dict((key, db.literal(item)) for key, item in args.items())
            else:
                args = tuple(map(db.literal, args))
            if not PY2 and isinstance(query, (bytes, bytearray)):
                query = query.decode(db.encoding)
            try:
                query = query % args
            except TypeError as m:
                self.errorhandler(self, ProgrammingError, str(m))

        if isinstance(query, unicode):
            query = query.encode(db.encoding, 'surrogateescape')

        res = None
        try:
>           res = self._query(query)

/usr/local/lib/python3.6/site-packages/MySQLdb/cursors.py:252: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <MySQLdb.cursors.Cursor object at 0x7f1fec948630>
q = b'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expire... {\\"days\\": 1, \\"seconds\\": 0, \\"microseconds\\": 0}}\', 16, 0, \'2016-01-01 00:00:00\', \'2016-01-02 00:00:00\')'

    def _query(self, q):
        db = self._get_db()
        self._result = None
>       db.query(q)

/usr/local/lib/python3.6/site-packages/MySQLdb/cursors.py:378: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <_mysql.connection open to 'mysql' at 5562e5aef8b8>
query = b'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expire... {\\"days\\": 1, \\"seconds\\": 0, \\"microseconds\\": 0}}\', 16, 0, \'2016-01-01 00:00:00\', \'2016-01-02 00:00:00\')'

    def query(self, query):
        # Since _mysql releases GIL while querying, we need immutable buffer.
        if isinstance(query, bytearray):
            query = bytes(query)
        if self.waiter is not None:
            self.send_query(query)
            self.waiter(self.fileno())
            self.read_query_result()
        else:
>           _mysql.connection.query(self, query)
E           _mysql_exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')

/usr/local/lib/python3.6/site-packages/MySQLdb/connections.py:280: OperationalError

The above exception was the direct cause of the following exception:

self = <tests.jobs.test_scheduler_job.TestSchedulerJob testMethod=test_verify_integrity_if_dag_changed>

    def test_verify_integrity_if_dag_changed(self):
        # CleanUp
        with create_session() as session:
            session.query(SerializedDagModel).filter(
                SerializedDagModel.dag_id == 'test_verify_integrity_if_dag_changed'
            ).delete(synchronize_session=False)

        dag = DAG(dag_id='test_verify_integrity_if_dag_changed', start_date=DEFAULT_DATE)
        BashOperator(task_id='dummy', dag=dag, owner='airflow', bash_command='echo hi')

        scheduler = SchedulerJob()
        scheduler.dagbag.bag_dag(dag, root_dag=dag)
>       scheduler.dagbag.sync_to_db()

tests/jobs/test_scheduler_job.py:2758: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
airflow/utils/session.py:63: in wrapper
    return func(*args, **kwargs)
airflow/models/dagbag.py:526: in sync_to_db
    DAG.bulk_write_to_db(self.dags.values(), session=session)
airflow/utils/session.py:59: in wrapper
    return func(*args, **kwargs)
airflow/models/dag.py:1847: in bulk_write_to_db
    session.flush()
/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py:2523: in flush
    self._flush(objects)
/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py:2664: in _flush
    transaction.rollback(_capture_exception=True)
/usr/local/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py:69: in __exit__
    exc_value, with_traceback=exc_tb,
/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py:182: in raise_
    raise exception
/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py:2624: in _flush
    flush_context.execute()
/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/unitofwork.py:422: in execute
    rec.execute(self)
/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/unitofwork.py:589: in execute
    uow,
/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/persistence.py:245: in save_obj
    insert,
/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/persistence.py:1083: in _emit_insert_statements
    c = cached_connections[connection].execute(statement, multiparams)
/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:1011: in execute
    return meth(self, multiparams, params)
/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/elements.py:298: in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:1130: in _execute_clauseelement
    distilled_params,
/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:1317: in _execute_context
    e, statement, parameters, cursor, context
/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:1511: in _handle_dbapi_exception
    sqlalchemy_exception, with_traceback=exc_info[2], from_=e
/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py:182: in raise_
    raise exception
/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:1277: in _execute_context
    cursor, statement, parameters, context
/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py:593: in do_execute
    cursor.execute(statement, parameters)
/usr/local/lib/python3.6/site-packages/MySQLdb/cursors.py:255: in execute
    self.errorhandler(self, exc, value)
/usr/local/lib/python3.6/site-packages/MySQLdb/connections.py:50: in defaulterrorhandler
    raise errorvalue
/usr/local/lib/python3.6/site-packages/MySQLdb/cursors.py:252: in execute
    res = self._query(query)
/usr/local/lib/python3.6/site-packages/MySQLdb/cursors.py:378: in _query
    db.query(q)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <_mysql.connection open to 'mysql' at 5562e5aef8b8>
query = b'INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expire... {\\"days\\": 1, \\"seconds\\": 0, \\"microseconds\\": 0}}\', 16, 0, \'2016-01-01 00:00:00\', \'2016-01-02 00:00:00\')'

    def query(self, query):
        # Since _mysql releases GIL while querying, we need immutable buffer.
        if isinstance(query, bytearray):
            query = bytes(query)
        if self.waiter is not None:
            self.send_query(query)
            self.waiter(self.fileno())
            self.read_query_result()
        else:
>           _mysql.connection.query(self, query)
E           sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
E           [SQL: INSERT INTO dag (dag_id, root_dag_id, is_paused, is_subdag, is_active, last_scheduler_run, last_pickled, last_expired, scheduler_lock, pickle_id, fileloc, owners, description, default_view, schedule_interval, concurrency, has_task_concurrency_limits, next_dagrun, next_dagrun_create_after) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)]
E           [parameters: ('test_verify_integrity_if_dag_changed', None, 0, 0, 1, None, None, None, None, None, '/opt/airflow/tests/jobs/test_scheduler_job.py', 'airflow', None, 'tree', '{"type": "timedelta", "attrs": {"days": 1, "seconds": 0, "microseconds": 0}}', 16, 0, datetime.datetime(2016, 1, 1, 0, 0), datetime.datetime(2016, 1, 2, 0, 0))]
E           (Background on this error at: http://sqlalche.me/e/13/e3q8)

/usr/local/lib/python3.6/site-packages/MySQLdb/connections.py:280: OperationalError```
kaxil commented 3 years ago

https://github.com/apache/airflow/pull/12046 should fix the issues, we can re-open this issue if we see it crop up again

potiuk commented 3 years ago

I would still like to take a closer look at this when I have time. To see if I can find something more - I understand this is reacting to the issue rather than preventing it, I think this is good for now for the user testing to implement retries for beta's and see if deadlock has some other effects, but I think this one should remain open at leat for me to take a look. I will assign it to myself and mark it as 2.0 so that I do not forget about it.