jazzband / django-simple-history

Store model history and view/revert changes from admin site.
https://django-simple-history.readthedocs.org
BSD 3-Clause "New" or "Revised" License
2.23k stars 483 forks source link

"database is locked" SQLite error with `clean_duplicate_history` command #973

Open tannercollin opened 2 years ago

tannercollin commented 2 years ago

I noticed every once in a while my app would log a "database is locked" error:

[2022-01-26 03:45:25,580] [22345] [  ERROR] Internal Server Error: /cards/
[2022-02-02 04:45:51,686] [26440] [  ERROR] Internal Server Error: /sessions/
[2022-02-05 18:45:49,874] [31872] [  ERROR] Internal Server Error: /stats/usage/
[2022-02-08 22:45:57,367] [978] [  ERROR] Internal Server Error: /stats/usage/
[2022-02-10 04:45:49,758] [15725] [  ERROR] Internal Server Error: /stats/usage/
[2022-02-19 20:45:46,361] [28267] [  ERROR] Internal Server Error: /rest-auth/login/
[2022-03-09 12:45:45,934] [27711] [  ERROR] IPN route - OperationalError - database is locked
[2022-03-14 10:45:43,600] [27708] [  ERROR] IPN route - OperationalError - database is locked
[2022-03-20 00:45:27,309] [13434] [  ERROR] Internal Server Error: /stats/usage/
[2022-03-23 02:45:46,301] [13435] [  ERROR] Internal Server Error: /stats/usage/
[2022-03-30 02:45:56,876] [13435] [  ERROR] Internal Server Error: /stats/usage/
[2022-03-30 03:45:44,279] [13432] [  ERROR] Internal Server Error: /stats/usage/
[2022-03-31 15:45:36,220] [4628] [  ERROR] Internal Server Error: /stats/usage/
[2022-04-01 02:45:26,084] [4629] [  ERROR] Internal Server Error: /stats/usage/
[2022-04-09 17:45:46,624] [24482] [  ERROR] Internal Server Error: /stats/usage/

When it happened, it was always at the 45th minute. This is when I run the clean_duplicate_history command with cron:

45 * * * * /opt/[snip]/python /opt/[snip]/manage.py clean_duplicate_history -m 60 --auto >> /var/log/[snip]-cron.log 2>&1 

This is a difficult situation because:

I suspect this atomic transaction is what's taking so long. Would it be possible to break it into pieces so that other threads could access the database?

Or is there a different solution you guys recommend? Thanks!

cmaggiulli commented 10 months ago

I realize this is an old issue but here is what I see from your issue:

  1. You have a cron that runs every 45 minutes. It executes a management command that creates a database connection
  2. Your process is failing at minute 45
  3. This is because your first command is still running while the crontab has invoked another execution
  4. Since SQLite can only handle a single connection at a time, and since crontab is spawning another connection, you are seeing a database lock error
tannercollin commented 9 months ago

Thanks for the reply. It's not practical to stop my app to run the clean_duplicate_history command each night because I might miss an important request like a PayPal payment notification. It needs to stay running.

The clean_duplicate_history was locking the database for too long, which is why I filed this issue.

cmaggiulli commented 8 months ago

I believe you're misunderstanding me. I am not recommending you stop your app. I'm pointing out the error is happening because you have two processes writing to a database that does a full lock during write. There is no solution other than migrating off SQLite. This is one of many reasons SQLite is for local development only

tannercollin commented 8 months ago

Yes, I understand that. My issue described one solution:

I suspect this atomic transaction is what's taking so long. Would it be possible to break it into pieces so that other threads could access the database?

SQLite is actually fine for production use: https://www.sqlite.org/whentouse.html

Each writer is supposed to hold a lock for a short period of time, not many seconds like the clean_duplicate_history command. Other writers queue up and wait their turn. The "database is locked" exception is raised when another writer waits for longer than a set timeout (in my case 20 seconds).

Note that the Django docs specifically call this out:

Performance considerations

Open transactions have a performance cost for your database server. To minimize this overhead, keep your transactions as short as possible. This is especially important if you’re using atomic() in long-running processes, outside of Django’s request / response cycle.