Closed taldcroft closed 6 years ago
Ugh. I thought I could reliably lock the database with my guide stats processing; I'll try that again today and see if I can lock a copy.
This PR seems a fine way to at least try again if for some reason the database was locked by reasonable small query or netapp weirdness.
Maybe try the instructions here: https://stackoverflow.com/questions/14272715/how-can-i-lock-a-sqlite-database
I think I looked at that when we started using sqlite but ran into issues with pragma support by sqlite version. Will look again.
On a locked local copy of events.db via
PRAGMA locking_mode = EXCLUSIVE;
BEGIN EXCLUSIVE;
it looks like I'm getting "django.db.utils.OperationalError: database is locked" without seeing any of the
logger.info('Warning: locked database, waiting {} seconds'.format(delay))
I'd expect from this PR.
Grrrr. Will investigate.
Not clear to me if the exclusive lock via pragma really replicates what we've hit upon by accident in processing though (if the exclusive lock is also a read lock and maybe the lock we're getting is only a write lock?).
From what you said it is raising what appears to be the same OperationalError
without it being caught and handled, or else the database is locked
is not matching. If you are working on this, can you try putting in some print statements to see what is happening?
From the traceback, it looks like it is throwing the error at
https://github.com/sot/kadi/blob/3de2e19b0b0ac48c28f88a61a5c354804adc1ea9/kadi/update_events.py#L119
when trying to do that get
, not when trying to save.
From looking through our kadi OperationalError cron task emails, it doesn't look to me like that has happened for us in practice (the errors we see are on save or delete), so I think the exclusive pragma lock might just be a bad test for this.
Ah, so now I understand your comment about read vs write. Agreed that the exclusive lock is no good, though you might wrap that particular get
in a try4times
and see what it does. (Noting that try4imes
does not return the func
result, which should actually be changed).
This is a workaround for problems like:
This PR has been tested by running the
ska_testr/packages/kadi/test_regression_long.sh
test, once via ska_testr in Ska flight and once by hand in the repo for this PR. Outputs were diffed and no diffs were seen.I have been unable to reproduce the databased locked problem. I tried locally starting the process to update the database over a 30 day period, then in a separate window repeatedly query the events database. No problem, no joy. So most of this code is not tested and just needs to be looked at carefully.
Plan B was to make a local copy of the events database, update that, and then copy back. But even then it would be better to not have the whole update process stop cold. My feeling is to try this and see if it works. We'll still see warnings pop up if the database was locked.