jdelic / django-dbconn-retry

Patches Django to reconnect on a failed database connection once before failing. This helps when running the Django ORM through HAProxy, for example.
BSD 3-Clause "New" or "Revised" License

Check if db connection is usable #9

Closed antarcticrainforest closed 1 year ago

antarcticrainforest commented 2 years ago

This adds an additional check if the database connection is still usable.

Background

Our Django application successfully opens a db connection and then does some calculations. These calculations can take a long time, in some cases longer than the wait_timeout of the db server (8 hours by default on MySQL). When that happens, the db server drops the otherwise valid connection. I noticed that this case is not covered by the patch, so I added a check to see whether the database connection is still usable. If it isn't, a new connection is established.
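To make the idea concrete, here is a minimal sketch of the check described above. It is not the actual diff of this PR. `FakeConnection` and `ensure_usable_connection` are hypothetical names used only for illustration; the stand-in class mirrors the `connection`, `connect()`, `close()` and `is_usable()` members that Django's database wrapper exposes, so the snippet runs without a database.

```python
class FakeConnection:
    """Hypothetical stand-in for a Django database wrapper (illustration only)."""

    def __init__(self):
        self.connection = None  # underlying DB-API connection, None while closed
        self.alive = False      # simulates whether the server still holds the session
        self.connect_count = 0  # number of times connect() was called

    def connect(self):
        self.connection = object()
        self.alive = True
        self.connect_count += 1

    def close(self):
        self.connection = None
        self.alive = False

    def is_usable(self):
        # A real backend would ping the server here.
        return self.alive

    def server_timeout(self):
        """Simulate the server dropping the idle session (wait_timeout exceeded)."""
        self.alive = False


def ensure_usable_connection(conn):
    """Open a connection if none exists, or replace one that has gone stale."""
    if conn.connection is None:
        conn.connect()
    elif not conn.is_usable():
        # The connection object exists, but the server has dropped it.
        conn.close()
        conn.connect()
    return conn


conn = FakeConnection()
ensure_usable_connection(conn)  # opens the first connection
conn.server_timeout()           # server drops the idle session
ensure_usable_connection(conn)  # detects the stale connection and reconnects
```

The extra `is_usable()` call costs one round trip to the server whenever a connection already exists, which is the trade-off for catching connections that were dropped while the application was busy.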

Here is a minimal example that demonstrates the problem after the database server's wait_timeout variable has been set to 5 seconds:

import time

import django
from django.conf import settings
django_settings = dict()
django_settings["INSTALLED_APPS"] = (
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django_dbconn_retry",
)
django_settings["DATABASES"] = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "db_name",
        "USER": "user_name",
        "PASSWORD": "user_pw",
        "HOST": "127.0.0.1",
        "PORT": "3306",
    }
}
settings.configure(**django_settings)
django.setup()
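# Models can only be imported after django.setup() has run.
from django.contrib.auth.models import User

# The first query opens the database connection.
print(User.objects.all())
print("Sleeping for 5.5 seconds")
# Sleep past the 5-second wait_timeout so the server drops the idle connection.
time.sleep(5.5)
# The second query reuses the now stale connection.
print(User.objects.all())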

Without the proposed change, I observe the following behaviour:

<QuerySet [<User: username>]>
Sleeping for 5.5 seconds
Traceback (most recent call last):
   _mysql.connection.query(self, query)
MySQLdb.OperationalError: (2013, 'Lost connection to server during query')

With the change, we get the expected behaviour:

<QuerySet [<User: username>]>
Sleeping for 5.5 seconds
<QuerySet [<User: username>]>

I couldn't get the unit tests working. Could someone give me instructions on how to run them? Maybe we could also add instructions for running the tests to the README.

antarcticrainforest commented 2 years ago

@jdelic ready for review, thanks.

antarcticrainforest commented 2 years ago

I have seen that some PRs were merged. I've updated this code accordingly.

jdelic commented 2 years ago

@antarcticrainforest Thank you! Yes, I did some maintenance work on the project to get it to a place where it can be updated :). I'll look into this next. From what I faintly remember, I avoided calling is_usable because if it raises an exception it might recurse into ensure_connection_with_retries or something like that...

I'll probably try this patch in production on my end and see what happens.

jdelic commented 2 years ago

and indeed the tests fail...

I can't look into that right now, but perhaps later this week.

jdelic commented 2 years ago

(if you have time to look into it, that would also be appreciated ;) )

antarcticrainforest commented 2 years ago

I did have a quick look into what's going on but didn't find an obvious solution. If I have time today, I'll keep digging.