mod_wsgi built with Python 3.10 doesn't respond to changes in Django templates

ebirck-css commented 1 year ago

Pardon if this is not the right project to post under, but we recently upgraded our Django server (Apache 2.4.41, mod_wsgi 4.9.4) from Python 3.8 to 3.10, and changes to template (html) files no longer show up without an Apache restart on Python 3.10. I am running mod_wsgi in daemon mode, and both the Python 3.8 and 3.10 builds are mod_wsgi 4.9.4. Are there any known issues or gaps in support between mod_wsgi and Python 3.10? Django and Apache versions are the same as well, so the only difference is which build of mod_wsgi is used (and of course which Python version that in turn uses).

I created a StackOverflow post with more details here:

https://stackoverflow.com/questions/75316978/mod-wsgi-on-python-3-10-does-not-update-django-templates-without-apache-restart

Any help or direction would be appreciated, thanks!

GrahamDumpleton commented 1 year ago

There should be no noticeable difference in mod_wsgi versions.

What is the actual mod_wsgi configuration you are using?

ebirck-css commented 1 year ago

Hi Graham,

Thanks for the quick reply - my configuration within Apache is:

WSGIScriptAlias / /home/myhome/django/mysite/mysite/wsgi.py
WSGIPythonPath /home/myhome/django/mysite
WSGIDaemonProcess mysite.com python-path=/home/myhome/django/mysite
WSGIProcessGroup mysite.com

WSGIApplicationGroup %{GLOBAL}

...then that wsgi.py file is the default wsgi file as provided by Django:

import os

from django.core.wsgi import get_wsgi_application

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')

application = get_wsgi_application()

...detailed here: https://docs.djangoproject.com/en/4.1/howto/deployment/wsgi/

If there's anything else I can provide please let me know, thanks again!

GrahamDumpleton commented 1 year ago

I can only suggest try adding:

https://modwsgi.readthedocs.io/en/master/user-guides/debugging-techniques.html#tracking-request-and-response

so you can capture the request/response at the Python WSGI entry point level to confirm that requests do get there and you aren't seeing a problem where Apache or a proxy is itself caching responses (i.e., not client side, but not in Django either).
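
As a rough illustration of that kind of check, here is a minimal sketch of a WSGI wrapper that logs the daemon process id, request line, status and response size. It is not the code from the linked guide; the class name ResponseSizeLogger and the log format are made up for this example, and it buffers the whole response, so it is only suitable for debugging.

```python
import os
import sys


class ResponseSizeLogger:
    """Hypothetical debugging wrapper: logs pid, request line, status and body size."""

    def __init__(self, application, stream=sys.stderr):
        self.application = application
        self.stream = stream  # under mod_wsgi, stderr ends up in the Apache error log

    def __call__(self, environ, start_response):
        captured = {}

        def logging_start_response(status, headers, exc_info=None):
            # Remember the status so it can be logged once the body is known
            captured["status"] = status
            return start_response(status, headers, exc_info)

        body = self.application(environ, logging_start_response)
        try:
            # Buffer the body so its total size can be measured
            chunks = list(body)
        finally:
            close = getattr(body, "close", None)
            if close is not None:
                close()

        size = sum(len(chunk) for chunk in chunks)
        print(
            f"pid={os.getpid()} {environ.get('REQUEST_METHOD')} "
            f"{environ.get('PATH_INFO')} -> {captured.get('status')} ({size} bytes)",
            file=self.stream,
        )
        return chunks
```

In wsgi.py this would wrap the existing object, e.g. application = ResponseSizeLogger(get_wsgi_application()). Logging the pid makes it visible which daemon process served each request.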

ebirck-css commented 1 year ago

Thanks Graham - I added that debug logging, as far as I can tell the requests are identical between the two python versions except for expected variances (source ephemeral port and timestamp values).

One thing I did stumble across, simultaneously I have been trying to implement parallel processing to improve our performance by adding processes=x threads=y values to the WSGIDaemonProcess parameter, and noticed in Python 3.10 when I have processes=2 threads=1 each process caches its own version of the template. That is to say, I made a request, altered the template, made another request, and the altered value showed up, however if I then made additional requests - regardless of any additional template changes - the result returned would alternate between the two original states of the template. And the RESPONSE debugging messages showed the response sizes kept flip-flopping, but were otherwise identical.

Are sub processes all spawned at server startup? Or only the first time a request gets routed to them? That may help diagnose if something at the Django layer is caching on first load.

GrahamDumpleton commented 1 year ago

With your setup Django would be loaded the first time a request hits a daemon process.

How do you have Django's caching framework setup?

https://docs.djangoproject.com/en/4.1/topics/cache/

ebirck-css commented 1 year ago

I'm probably getting confused on processes vs subinterpreters, I'm new to the multiprocessing / multithreading features of mod_wsgi ... but since the first change between request 1 and 2 gets rendered in request 2, with requests 3 and 4 re-serving the seemingly cached result of 1 and 2, respectively, doesn't that imply if caching is happening by Django on first load that we must not be loading the second instance of Django until request 2?

I didn't have any caching framework set up within Django; we handle any cache-able items ourselves within our views.py logic using global variables and/or local database caching. I did just try implementing their dummy caching detailed here:

https://docs.djangoproject.com/en/4.1/topics/cache/#dummy-caching-for-development

...but that didn't make a difference, same behavior.

ebirck-css commented 1 year ago

Okay a little more testing, seems the caching is on a per-file basis, the first time it is served within each process. I added a second test file, and got the following behavior:

Initial setup: both files contain "Initial State"

Request File 1: "Initial State"
Alter File 1 to "Second State"
Request File 1: "Second State"
Alter File 1 to "Third State"
Request File 1: "Initial State"
Request File 1: "Second State"
... File 1 no longer responds to changes ...

Alter File 2 to "Second State"
Request File 2: "Second State"
Alter File 2 to "Third State"
Request File 2: "Third State"
Alter File 2 to "Fourth State"
Request File 2: "Second State"
Request File 2: "Third State"
... similarly no longer responds to changes ...

So despite confirming that both processes exist and have already set their cached values for File 1, File 2 was not yet cached and still allowed for a change between the first two requests of specifically File 2.
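
The File 1 sequence can be reproduced with a toy model. This is only a sketch under two assumptions: each daemon process keeps its own in-memory cache that is filled the first time a file is served, and requests alternate strictly between the two processes (mod_wsgi does not promise that). The Worker class is hypothetical and has nothing to do with mod_wsgi or Django internals.

```python
from itertools import cycle


class Worker:
    """Stand-in for one daemon process with its own in-memory template cache."""

    def __init__(self):
        self.cache = {}

    def render(self, name, disk):
        # Read the file on first use only; later changes on disk are never seen
        if name not in self.cache:
            self.cache[name] = disk[name]
        return self.cache[name]


disk = {"file1": "Initial State"}
workers = cycle([Worker(), Worker()])  # assumed strict alternation between processes

responses = []
responses.append(next(workers).render("file1", disk))  # first worker caches "Initial State"
disk["file1"] = "Second State"
responses.append(next(workers).render("file1", disk))  # second worker caches "Second State"
disk["file1"] = "Third State"
responses.append(next(workers).render("file1", disk))  # first worker, served from cache
responses.append(next(workers).render("file1", disk))  # second worker, served from cache

print(responses)
# ['Initial State', 'Second State', 'Initial State', 'Second State']
```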

GrahamDumpleton commented 1 year ago

Since you are using:

WSGIApplicationGroup %{GLOBAL}

This forces the use of the main Python interpreter context of any process thus making it like running command line Python. Each daemon process is therefore the same as if you had created separate Python command line processes. You can ignore anything about Python sub interpreters.

Ensure you have:

LogLevel info

and not just err or warn. That way you will see in the Apache logs when daemon processes are initialised and when the WSGI application is first loaded by a daemon process.

As already noted, the WSGI application will with your config be lazily loaded the first time a request is directed to a daemon process, even though all daemon processes are created when Apache starts.

If you wanted to change it so that the WSGI application is loaded as soon as the process is started, before even any request is sent to it, use config of:

WSGIDaemonProcess mysite.com python-path=/home/myhome/django/mysite
WSGIScriptAlias / /home/myhome/django/mysite/mysite/wsgi.py \
    process-group=mysite.com application-group=%{GLOBAL}

BTW, do you not use a Python virtual environment?

ebirck-css commented 1 year ago

Updated the LogLevel to info (was previously on warn)

I get the following for mod_wsgi on server startup:

2 x "Starting process 'mysite.com' with uid=33, gid=33 and threads=1"
4 x "Initializing Python."
4 x "Attach Interpreter ''."
4 x "Adding '/home/myhome/django/mysite' to path."
4 x "Imported 'mod_wsgi'."

...and once I make a request I get:

Create interpreter 'mysite.com:443|'.
Adding '/home/myhome/django/mysite' to path.
mod_wsgi (pid=895184, process='mysite.com', application='mysite.com:443|'): Loading Python script file '/home/myhome/django/mysite/mysite/wsgi.py'.

... on the first two requests, so when the first request is directed to each, as you explained.

I think given the behavior above with two files, it's safe to assume the load itself doesn't initiate any caching, since files are not cached until they are served - regardless of how long ago the process / interpreter was started.

Regarding virtual environment, no, we do not have a python virtual environment on this server.

I'll keep looking into Django caching options, I get the feeling we're hitting a wall on this ... though I'm certainly open to other ideas and greatly appreciate all the help thus far!

GrahamDumpleton commented 1 year ago

To get rid of some of the noise due to Python unnecessarily being initialised in Apache child processes (different to daemon processes), add the following outside of all VirtualHost definitions:

WSGIRestrictEmbedded On

You don't need Python in the Apache child processes since WSGI requests are handled by the mod_wsgi daemon processes.

ebirck-css commented 1 year ago

Okay I think I finally found the culprit; since the mod_wsgi REQUEST log seemed to be as expected, I just followed the call chain deep into the Django source code and found the following in the template Engine class in /django/template/engine.py:

if loaders is None:
    loaders = ["django.template.loaders.filesystem.Loader"]
    if app_dirs:
        loaders += ["django.template.loaders.app_directories.Loader"]
    loaders = [("django.template.loaders.cached.Loader", loaders)]
else:
    if app_dirs:
        raise ImproperlyConfigured(
            "app_dirs must not be set when loaders is defined."
        )

...comparing to the equivalent in Python 3.8, it used to be:

if loaders is None:
    loaders = ['django.template.loaders.filesystem.Loader']
    if app_dirs:
        loaders += ['django.template.loaders.app_directories.Loader']
    if not debug:
        loaders = [('django.template.loaders.cached.Loader', loaders)]
else:
    if app_dirs:
        raise ImproperlyConfigured(
            "app_dirs must not be set when loaders is defined.")

Both installs were updated to the latest Django version as of this post:

$ python3.8 -m pip freeze | grep Django
Django==4.1.6
$ python3.10 -m pip freeze | grep Django
Django==4.1.6

... I couldn't find any documentation of this change, other than the original request to implement it as it was. Regardless, I currently have debug=True on this server, so under Python 3.10 template loading defaults to django.template.loaders.cached.Loader, whereas under Python 3.8 it was respecting my debug value and disabling caching.
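
To make the difference concrete, here is a standalone sketch of the two selection rules side by side. It is not Django's code: default_loaders and its cache_in_debug flag are hypothetical names used only to switch between the older rule (cache only when debug is off) and the newer one (always cache).

```python
FILESYSTEM = "django.template.loaders.filesystem.Loader"
APP_DIRECTORIES = "django.template.loaders.app_directories.Loader"
CACHED = "django.template.loaders.cached.Loader"


def default_loaders(app_dirs, debug, cache_in_debug):
    """Return the loader list chosen when no 'loaders' option is configured.

    cache_in_debug=False mimics the older snippet, True mimics the newer one.
    """
    loaders = [FILESYSTEM]
    if app_dirs:
        loaders.append(APP_DIRECTORIES)
    if cache_in_debug or not debug:
        # Wrap the plain loaders in the cached loader
        loaders = [(CACHED, loaders)]
    return loaders


# With debug=True the older rule leaves templates uncached...
old_rule = default_loaders(app_dirs=True, debug=True, cache_in_debug=False)
# ...while the newer rule wraps them in the cached loader regardless of debug.
new_rule = default_loaders(app_dirs=True, debug=True, cache_in_debug=True)
```

With debug off, both rules give the same cached result, which is why the change only shows up on servers running with debug enabled.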

If you want caching disabled with Python 3.10, you can override the loaders in Django settings.py:

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [],
        'OPTIONS': {
            'context_processors': [
                ...
            ],
            'loaders': ['django.template.loaders.filesystem.Loader', 'django.template.loaders.app_directories.Loader']
        },
    },
]

Note: APP_DIRS must also be disabled if it is set to True, because it cannot be combined with an explicit loaders list. Its effect is preserved by listing the app_directories Loader explicitly.

Thanks again Graham for all the assistance, and apologies for spinning you up on something that boiled down to a Django issue. Cheers!

ebirck-css commented 1 year ago

Ah, here's the django commit that changed that.

Not sure why my Python 3.8 install doesn't reflect the change, maybe a fresh install wouldn't have that discrepancy issue. 🤷

GrahamDumpleton / mod_wsgi

mod_wsgi built with Python 3.10 doesn't respond to changes in Django templates #821