aiidateam / aiida-core

The official repository for the AiiDA code
https://aiida-core.readthedocs.io
Other
436 stars 189 forks source link

Trouble migrating database: cannot ALTER TABLE "db_dblog" because it has pending trigger events #3795

Closed DanielMarchand closed 1 year ago

DanielMarchand commented 4 years ago

I isolated the migration database completely from my current production database, see below. I also cloned the repository, and have configured the config.json appropriately config.json.

$postgres=# CREATE DATABASE aiidadb_v1 WITH TEMPLATE aiidadev_v1a4 OWNER aiida;
CREATE DATABASE
$postgres=# GRANT ALL PRIVILEGES ON DATABASE aiidadb_v1 to aiida;
GRANT

I tried to migrate using the following commands.


(aiida_v1.1) daniel ~ $ verdi database migrate -f

Warning:
************************************************************************************************************************
Found one or multiple records that violate the integrity of the database
Violation reason: detected workflow nodes with isolated `create` links.
Performed action: nothing
Violators written to: /home/daniel/migration-paib1dbl.log
************************************************************************************************************************

We found 0 log records that correspond to legacy workflows and 0 log records to correspond to an unknown entity.
These records will be removed from the database and exported to JSON files to the current directory).
Would you like to proceed? [Y/n]: y
Exported entity logs that don't correspond to nodes to /home/daniel/no_node_entity_logs-h6rertgw.log

I eventually get this traceback:

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/daniel/virtualenvs/aiida_v1.1/bin/verdi", line 10, in <module>
    sys.exit(verdi())
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/daniel/Install/aiida_installs/v1.1/aiida-core/aiida/cmdline/commands/cmd_database.py", line 43, in database_migrate
    backend.migrate()
  File "/home/daniel/Install/aiida_installs/v1.1/aiida-core/aiida/orm/implementation/sqlalchemy/backend.py", line 47, in migrate
    self._schema_manager.migrate()
  File "/home/daniel/Install/aiida_installs/v1.1/aiida-core/aiida/backends/manager.py", line 145, in migrate
    self._migrate_database_version()
  File "/home/daniel/Install/aiida_installs/v1.1/aiida-core/aiida/backends/sqlalchemy/manager.py", line 147, in _migrate_database_version
    command.upgrade(config, 'head')
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/alembic/command.py", line 298, in upgrade
    script.run_env()
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/alembic/script/base.py", line 489, in run_env
    util.load_python_file(self.dir, "env.py")
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/alembic/util/pyfiles.py", line 98, in load_python_file
    module = load_module_py(module_id, path)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/alembic/util/compat.py", line 173, in load_module_py
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/daniel/Install/aiida_installs/v1.1/aiida-core/aiida/backends/sqlalchemy/migrations/env.py", line 75, in <module>
    run_migrations_online()
  File "/home/daniel/Install/aiida_installs/v1.1/aiida-core/aiida/backends/sqlalchemy/migrations/env.py", line 68, in run_migrations_online
    context.run_migrations()
  File "<string>", line 8, in run_migrations
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/alembic/runtime/environment.py", line 846, in run_migrations
    self.get_context().run_migrations(**kw)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/alembic/runtime/migration.py", line 518, in run_migrations
    step.migration_fn(**kw)
  File "/home/daniel/Install/aiida_installs/v1.1/aiida-core/aiida/backends/sqlalchemy/migrations/versions/ea2f50e7f615_dblog_create_uuid_column.py", line 66, in upgrade
    'db_dblog', sa.Column('uuid', postgresql.UUID(), autoincrement=False, nullable=True, default=get_new_uuid)
  File "<string>", line 8, in add_column
  File "<string>", line 3, in add_column
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/alembic/operations/ops.py", line 1927, in add_column
    return operations.invoke(op)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/alembic/operations/base.py", line 374, in invoke
    return fn(self, operation)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/alembic/operations/toimpl.py", line 132, in add_column
    operations.impl.add_column(table_name, column, schema=schema, **kw)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/alembic/ddl/impl.py", line 237, in add_column
    self._exec(base.AddColumn(table_name, column, schema=schema))
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/alembic/ddl/impl.py", line 140, in _exec
    return conn.execute(construct, *multiparams, **params)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 982, in execute
    return meth(self, multiparams, params)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/sqlalchemy/sql/ddl.py", line 72, in _execute_on_connection
    return connection._execute_ddl(self, multiparams, params)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1044, in _execute_ddl
    compiled,
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1250, in _execute_context
    e, statement, parameters, cursor, context
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
    raise value.with_traceback(tb)
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
    cursor, statement, parameters, context
  File "/home/daniel/virtualenvs/aiida_v1.1/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 588, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (psycopg2.errors.ObjectInUse) cannot ALTER TABLE "db_dblog" because it has pending trigger events

[SQL: ALTER TABLE db_dblog ADD COLUMN uuid UUID]
(Background on this error at: http://sqlalche.me/e/e3q8)

As per @sphuber suggestion, I reset my computer and tried all the steps again (dropping and re-cloning the aiidadb_v1 , as detailed above). Yet the error persists...

giovannipizzi commented 4 years ago

I think you had issues with your computer (and after a reboot, you couldn't even restart it). Maybe it's just this? If this is the case, can you close this issue?

sphuber commented 4 years ago

I think you had issues with your computer (and after a reboot, you couldn't even restart it). Maybe it's just this? If this is the case, can you close this issue?

I suggested rebooting machine as I suspected the same problem but @DanielMarchand says he did and the problem persists. Really don't now what could be causing it.

DanielMarchand commented 4 years ago

Right, after I solved the issues with my computer and rebooted it, I have the same error.

DanielMarchand commented 4 years ago

Hi Everyone,

We're currently in the process of submitting for publication and I would really like to provide an up-to-date version of the database. I have a lot of time I can invest but I might need more guidance for debugging this issue.

Maybe if I could get a couple of suggestions as to thinks I could try or look into that would be of great help.

Best,

Daniel

DanielMarchand commented 4 years ago

A couple of thoughts:

1) Is there a way I can increase the verbosity/logging level? 2) Maybe there is a way I try to isolate the specific node(s) that are responsible for this issue?

Daniel

DanielMarchand commented 4 years ago

Something that really strikes me is that the last point of contact from AiiDA prior to the crash is the migration script ea2f50e7f615_dblog_create_uuid_column.py, specifically line 66:

 59 def upgrade():
 60     """ Add an UUID column an populate it with unique UUIDs """
 61     from aiida.common.utils import get_new_uuid
 62     connection = op.get_bind()
 63     
 64     # Create the UUID column
 65     op.add_column(
 66         'db_dblog', sa.Column('uuid', postgresql.UUID(), autoincrement=False, nullable=True, defa
 67     )
 68 
 69     # Populate the uuid column
 70     set_new_uuid(connection)

wonder if maybe there are some faulty uuid's in my database, e..g duplicates? [I remember about a year ago I caused some odd bugs, and we had to do some dark sorcery to bring it back, so I"m sure any of it is possible...]

Best,

Daniel

sphuber commented 4 years ago

To check if there are duplicate UUIDs, you should be able to run verdi database integrity detect-duplicate-uuid. Can you remind me what version of aiida-core you currently have installed? And it is an sqlalchemy database, correct?

DanielMarchand commented 4 years ago

Thanks for the speedy response!

$verdi database integrity detect-duplicate-uuid
Info: no duplicate UUIDs found
Success: dry-run of integrity patch completed

So I think that is OK.

I'm trying to migrate from v1.0.0a4 to v1.1. I can update to v1.3 if you think it might help.

EDIT: Tried again with v1.3 but get the same error.

DanielMarchand commented 4 years ago

I just noticed that if I run

$verdi database integrity detect-invalid-links
Critical: 
Database schema version `5d4d844852b6` is incompatible with the required schema version `118349c10896`.
To migrate the database schema version to the current one, run the following command:

    verdi -p aiida_v1 database migrate

Is there a difference between verdi database migrate -f and verdi -p aiida_v1 database migrate

sphuber commented 4 years ago

The -f flag just skips the prompt for confirmation and the -p flag lets you specify a specific profile if you want a profile other than the current default. So if you have just one profile, it should change nothing.

DanielMarchand commented 4 years ago

The file where the error occurs was committed here. 78248e11f05da9f5edd07e0d64b54cd47e7fa43f

The commit log is rather detailed and may contain clues to solve the issue.

sphuber commented 4 years ago

Yeah, I spent quite a lot of time on that commit message, because the fix was rather intricate and it required quite a bit of context to fully grasp what was going on. I have went back and I don't think the problem you see is really caused by the migrations. We have had lots of people migrate beyond this point without any problems.

Looking at the traceback, the message is quite clear: the migration that we are running when it fails needs to update the db_dblog table. However, it is currently locked, because something is currently executing changes on it, or somehow has executed those and they haven't finished yet. This is either some other process, which shouldn't happen since you should have stopped your daemon and all other processes, or it is the previous migration. The former seems unlikely since we asked you to reboot the machine to make sure nothing else was running. The only possibility that I then see is that, somehow, the migration preceding the one that triggers the exception has not finished all its actions before the next one is called. This would seem to indicate a bug in alembic which is the migration manager we use for SqlAlchemy databases. Ultimately there is really nothing I can do from this stack trace though. There is also no setting to change to get more logging. It is simply psycopg2 that is refusing the operation at some point.

Would it be possible for me to get access to the machine somehow? Feel free to contact me over email to organize this. Or do you have access to the AiiDA Slack workspace? That would be even better. I would have some time to look into this tomorrow.

sphuber commented 3 years ago

Note to self: a similar problem cropped up in the database migration with revision 0edcdd5a30f0 which added the extras column for the Group class. The same problem was encountered on SqlA that if group entities exist in the database, the migration would fail if at least two database revisions were performed in one go. If only that revision was applied in isolation, there was no problem. Only if the preceding revision was applied in the same transaction, the problem of pending trigger events was observed. The problem was somewhat fixed in d60491656dd00ad709a17eec76b4f403144c52dc but really this is a workaround where the migration itself was changed.

See issue #4590