Closed Kingdutch closed 3 years ago
I've already seen this with one customer, but unfortunately, we didn't find the cause there yet...
This can probably sometimes happen when multiple updates are running in parallel, and your description seems to match that. I still wonder why we've never seen this on our production servers...
The proper long-term solution is to avoid holding a key lock for updates, but that will only be possible once Django 3.2 is released; see https://github.com/WeblateOrg/weblate/pull/5227
Glad to hear it's a known issue :) I'm not entirely sure what it's locking on. Could it have something to do with components that share a repository (maybe they also share other things)? We have quite a few of those.
No, it's a lock on the units table. Weblate holds it to avoid concurrent updates of the units, but in the current Django implementation it holds not only a row-level lock but also a primary key lock, and that blocks insertion into the table for that time.
Should be addressed in Weblate 4.6 with Django 3.2.
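To make the difference between the two lock strengths concrete, here is a minimal sketch of the PostgreSQL row-locking clauses involved. It assumes the fix relies on the `no_key` argument that Django 3.2 added to `QuerySet.select_for_update()`. The table name `unit` and the helper `locking_select` are hypothetical and only used for illustration; this is not Weblate's actual code, and the query is built as a string, never executed.

```python
def locking_select(table: str, pk: int, no_key: bool = False) -> str:
    """Build an illustrative PostgreSQL row-locking query (not executed here)."""
    # FOR UPDATE is the strongest row-level lock. FOR NO KEY UPDATE is weaker
    # and does not conflict with the FOR KEY SHARE locks that foreign-key
    # checks take on the referenced row.
    clause = "FOR NO KEY UPDATE" if no_key else "FOR UPDATE"
    return f"SELECT * FROM {table} WHERE id = {int(pk)} {clause}"


# Roughly what select_for_update() asks PostgreSQL for:
print(locking_select("unit", 1))
# Roughly what select_for_update(no_key=True) asks for (Django 3.2+):
print(locking_select("unit", 1, no_key=True))
```

With the weaker clause, other transactions that only need a key-share lock on the same row are no longer blocked while the update lock is held.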
Thank you for your report; the issue you have reported has just been fixed.
Describe the issue
We had a 4.3.1 instance that occasionally crashed on PO import, causing the Redis queue to go down with it and be lost. This problem resolved itself after upgrading to 4.4.2. To update all our components we had to re-import PO files for quite a lot of components using `weblate loadpo --all --lang <code>`. This produced about 270 Celery tasks in the queue. We kept an eye on the log to see whether our earlier problem was resolved. In the log we saw the following deadlock error multiple times. The error does not actually cause any problems we can find within Weblate, but since it looks like a fatal error that had not yet been reported on GitHub, I wanted to report it for your information.
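For anyone who wants to repeat the re-import, the loop over languages can be scripted. This is a rough sketch only: the language codes are hypothetical placeholders, the helper names are made up for illustration, and it assumes the `weblate` command is on the PATH (for a Docker installation, that means running it inside the container). It defaults to a dry run that just prints the commands.

```python
import subprocess
from typing import List


def loadpo_command(lang: str) -> List[str]:
    """Return the argv that re-imports PO files of all components for one language."""
    return ["weblate", "loadpo", "--all", "--lang", lang]


def reimport(langs: List[str], dry_run: bool = True) -> List[List[str]]:
    """Run (or, with dry_run, only collect) one loadpo command per language."""
    commands = [loadpo_command(lang) for lang in langs]
    if not dry_run:
        for argv in commands:
            # check=True raises CalledProcessError if the command fails.
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical language codes; replace with the ones used on your instance.
    for argv in reimport(["nl", "de", "fr"]):
        print(" ".join(argv))
```

Passing `dry_run=False` actually runs the commands one after another.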
I already tried
Describe the steps you tried to solve the problem yourself.
To Reproduce the issue
Steps to reproduce the behavior:
Run `weblate loadpo --all --lang <langcode>` for multiple languages
Expected behavior
No errors in the logs
Exception traceback
Other instances of this error showed different process numbers, ForkPoolWorker numbers and tuple (x,y) values but were otherwise identical.
Server configuration and status
Weblate installation: Docker
Weblate deploy checks