Closed DanielMarchand closed 1 year ago
I think you had issues with your computer (and after a reboot, you couldn't even restart it). Maybe it's just this? If this is the case, can you close this issue?
I suggested rebooting the machine as I suspected the same problem, but @DanielMarchand says he did and the problem persists. I really don't know what could be causing it.
Right, after I solved the issues with my computer and rebooted it, I have the same error.
Hi Everyone,
We're currently in the process of submitting for publication and I would really like to provide an up-to-date version of the database. I have a lot of time I can invest but I might need more guidance for debugging this issue.
If I could get a couple of suggestions as to things I could try or look into, that would be of great help.
Best,
Daniel
A couple of thoughts:
1) Is there a way I can increase the verbosity/logging level?
2) Maybe there is a way I can try to isolate the specific node(s) that are responsible for this issue?
Daniel
Something that really strikes me is that the last point of contact from AiiDA prior to the crash is the migration script ea2f50e7f615_dblog_create_uuid_column.py, specifically line 66:
59 def upgrade():
60 """ Add an UUID column an populate it with unique UUIDs """
61 from aiida.common.utils import get_new_uuid
62 connection = op.get_bind()
63
64 # Create the UUID column
65 op.add_column(
66 'db_dblog', sa.Column('uuid', postgresql.UUID(), autoincrement=False, nullable=True, defa
67 )
68
69 # Populate the uuid column
70 set_new_uuid(connection)
I wonder if maybe there are some faulty UUIDs in my database, e.g. duplicates? [I remember about a year ago I caused some odd bugs, and we had to do some dark sorcery to bring it back, so I'm sure any of it is possible...]
Best,
Daniel
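For reference, the two-step pattern in the quoted migration (add a nullable uuid column, then fill every row) can be sketched outside AiiDA. This is only an illustration under stated assumptions: it uses an in-memory SQLite database from the standard library instead of PostgreSQL and alembic, the table is reduced to two columns, and populate_uuids is a hypothetical stand-in for set_new_uuid, whose real implementation is not shown in this thread.

```python
import sqlite3
import uuid


def populate_uuids(connection):
    """Give every db_dblog row that lacks a UUID a freshly generated one.

    Hypothetical stand-in for the migration's set_new_uuid helper.
    Returns the number of rows that were updated.
    """
    rows = connection.execute("SELECT id FROM db_dblog WHERE uuid IS NULL").fetchall()
    for (row_id,) in rows:
        connection.execute(
            "UPDATE db_dblog SET uuid = ? WHERE id = ?",
            (str(uuid.uuid4()), row_id),
        )
    return len(rows)


# Simplified stand-in for the real table (assumption: only two columns).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE db_dblog (id INTEGER PRIMARY KEY, message TEXT)")
connection.executemany(
    "INSERT INTO db_dblog (message) VALUES (?)", [("a",), ("b",), ("c",)]
)

# Step 1: add the column as nullable so that existing rows remain valid.
connection.execute("ALTER TABLE db_dblog ADD COLUMN uuid TEXT")

# Step 2: populate the new column for all existing rows.
updated = populate_uuids(connection)
connection.commit()

uuids = [row[0] for row in connection.execute("SELECT uuid FROM db_dblog")]
print(updated, len(set(uuids)))
```

The point of the sketch is only that the second step touches every existing row of the table, so the table must be writable at that moment.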
To check if there are duplicate UUIDs, you should be able to run verdi database integrity detect-duplicate-uuid.
Can you remind me what version of aiida-core you currently have installed? And it is an sqlalchemy database, correct?
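For a manual cross-check, the idea behind a duplicate-UUID search can be sketched as a single GROUP BY query. This is not how the verdi command is implemented. It is a minimal illustration that assumes a table named db_dbnode with a uuid column, and it uses an in-memory SQLite database with made-up values so that it runs anywhere. The helper find_duplicate_uuids is hypothetical.

```python
import sqlite3

# Assumption: a table db_dbnode with a uuid column.
DUPLICATE_UUID_QUERY = """
    SELECT uuid, COUNT(*) AS occurrences
    FROM db_dbnode
    GROUP BY uuid
    HAVING COUNT(*) > 1
"""


def find_duplicate_uuids(connection):
    """Return a {uuid: count} mapping for UUIDs that occur more than once."""
    return dict(connection.execute(DUPLICATE_UUID_QUERY).fetchall())


# Demo data in an in-memory SQLite database (stand-in for PostgreSQL).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE db_dbnode (id INTEGER PRIMARY KEY, uuid TEXT)")
connection.executemany(
    "INSERT INTO db_dbnode (uuid) VALUES (?)",
    [("aaaa",), ("bbbb",), ("aaaa",)],
)

duplicates = find_duplicate_uuids(connection)
print(duplicates)
```

The query text is plain SQL, so it could equally be run in psql against a copy of the real database.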
Thanks for the speedy response!
$ verdi database integrity detect-duplicate-uuid
Info: no duplicate UUIDs found
Success: dry-run of integrity patch completed
So I think that is OK.
I'm trying to migrate from v1.0.0a4 to v1.1. I can update to v1.3 if you think it might help.
EDIT: Tried again with v1.3 but get the same error.
I just noticed that if I run
$ verdi database integrity detect-invalid-links
Critical:
Database schema version `5d4d844852b6` is incompatible with the required schema version `118349c10896`.
To migrate the database schema version to the current one, run the following command:
verdi -p aiida_v1 database migrate
Is there a difference between verdi database migrate -f and verdi -p aiida_v1 database migrate?
The -f flag just skips the prompt for confirmation and the -p flag lets you specify a specific profile if you want a profile other than the current default. So if you have just one profile, it should change nothing.
The file where the error occurs was committed here: 78248e11f05da9f5edd07e0d64b54cd47e7fa43f
The commit log is rather detailed and may contain clues to solve the issue.
Yeah, I spent quite a lot of time on that commit message, because the fix was rather intricate and it required quite a bit of context to fully grasp what was going on. I have gone back over it and I don't think the problem you see is really caused by the migrations. We have had lots of people migrate beyond this point without any problems.
Looking at the traceback, the message is quite clear: the migration that we are running when it fails needs to update the db_dblog table. However, it is currently locked, because something is currently executing changes on it, or somehow has executed those and they haven't finished yet. This is either some other process, which shouldn't happen since you should have stopped your daemon and all other processes, or it is the previous migration. The former seems unlikely since we asked you to reboot the machine to make sure nothing else was running. The only possibility that I then see is that, somehow, the migration preceding the one that triggers the exception has not finished all its actions before the next one is called. This would seem to indicate a bug in alembic, which is the migration manager we use for SqlAlchemy databases. Ultimately there is really nothing I can do from this stack trace though. There is also no setting to change to get more logging. It is simply psycopg2 that is refusing the operation at some point.
Would it be possible for me to get access to the machine somehow? Feel free to contact me over email to organize this. Or do you have access to the AiiDA Slack workspace? That would be even better. I would have some time to look into this tomorrow.
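One way to test the "some other process holds a lock" hypothesis is to ask PostgreSQL directly, from a second session, which sessions hold or wait for locks on db_dblog. The following is a minimal sketch, not part of AiiDA: list_table_locks is a hypothetical helper, and it assumes you pass in a DB-API connection to the AiiDA database (for example one opened with psycopg2.connect and your own credentials). It relies only on the standard pg_locks, pg_class, pg_database and pg_stat_activity catalogs.

```python
LOCK_QUERY = """
    SELECT l.pid, l.mode, l.granted, a.state, a.query
    FROM pg_locks AS l
    JOIN pg_class AS c ON c.oid = l.relation
    JOIN pg_database AS d ON d.oid = l.database
    LEFT JOIN pg_stat_activity AS a ON a.pid = l.pid
    WHERE c.relname = %s
      AND d.datname = current_database()
"""


def list_table_locks(connection, table_name="db_dblog"):
    """Return one row per session that holds or awaits a lock on the table.

    `connection` is any DB-API connection to the PostgreSQL database,
    e.g. one created with psycopg2.connect(...). Hypothetical helper.
    """
    cursor = connection.cursor()
    try:
        # The table name is passed as a bound parameter, not interpolated.
        cursor.execute(LOCK_QUERY, (table_name,))
        return cursor.fetchall()
    finally:
        cursor.close()
```

If this returns rows for a pid other than the one running the migration, another session is involved. If it only ever shows the migration's own session, the problem is more likely inside the migration transaction itself.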
Note to self: a similar problem cropped up in the database migration with revision 0edcdd5a30f0, which added the extras column for the Group class. The same problem was encountered on SqlA: if group entities exist in the database, the migration would fail if at least two database revisions were performed in one go. If only that revision was applied in isolation, there was no problem. Only if the preceding revision was applied in the same transaction was the problem of pending trigger events observed. The problem was somewhat fixed in d60491656dd00ad709a17eec76b4f403144c52dc, but really this is a workaround where the migration itself was changed.
See issue #4590
I isolated the migration database completely from my current production database, see below. I also cloned the repository and configured config.json appropriately.
I tried to migrate using the following commands.
I eventually get this traceback:
As per @sphuber's suggestion, I reset my computer and tried all the steps again (dropping and re-cloning the aiidadb_v1, as detailed above). Yet the error persists...