Closed daltonmatos closed 7 years ago
Just a question first: does taking out the preload option help? I think I remember another similar issue being reported. If changing that option helps, then we must have changed something about how Django applications are loaded and it will be easier for us to track it down.
I think we import the django wsgi application in the master process, which at least on django 1.7, initializes all of the django models on import.
Hello @tilgovi, thank you for your time. I cant confirm that using preload_app = False
I'm able to use gunicorn-19.1.1
and django-1.7.1
.
Do you have any idea what gunicorn cloud be doing differently about this in versions 19.x
?
I have now a second problem. I use Raven[1] in my django app. And accorging to the Raven docs, when using a WSGU middleware[1] it is possible to use Raven as a WSGI object (wrapping the original django WSGI app), and when using gunicorn (as an external command, not the run_gunicorn
django command) we must add a hook to gunicorn[2]. When this hook is enabled, gunicorn only boots if preload_app = True
, but then gevent does not work. When I use preload_app = True
with the hook enabled, gunicorn does not even boot, giving me and ImportError
saying it cannot import my "DJANGO_SETTINGS_MODULE".
Raven is 4.2.3
For now I will get back to gunicorn 18.x
, since it works with both configurations (gevent and raven as a WSGI middleware)
Should I open another Issue about this?
[1] http://raven.readthedocs.org/en/latest/config/django.html#wsgi-middleware [2] http://raven.readthedocs.org/en/latest/config/django.html#gunicorn
@daltonmatos the way databases are setup in django 1.7.1 has probably changed are apparently not threadsafe. I am not sure what to do at that point.
One more thing, just found out today, session handling in django 1.7 with gunicorn 19.3.0 expires within seconds.
Had to switch back to 18 to make it work.
FYI.
This seems to be the same issue as https://github.com/benoitc/gunicorn/issues/879, which was closed but not fixed. It makes gunicorn 19.x. unusable with Django. Perhaps only for the gevent worker though? I haven't tested for other worker types.
@tilgovi i wonder if it's not due to the change done in the environ. in 19.x we set wsgi.multithread
to true
where it was false
in 18.x :
https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/ggevent.py#L99
This change may be handledd differently in django. And a greenlet is not really a thread. Thoughts? At least maybe we could try to revert it and check if it solves the issue. Do we have a test or a reproducible way for this issue?
Oh, I bet that is the issue. Good memory, @benoitc.
I still don't know why yet. As I said at the time to justify the change, I can't imagine that it would make anything break because it would mean that the framework would only take more precautions about sharing data. Evidently, I may be wrong.
I pushed the branch fix/927
to check if the fix above works. Please test!
I'm still able to reproduce this error using fix/927 with django 1.8.1 (and 1.7.1) and gevent 1.0.1
That's reassuring to me, even if we still don't know the cause.
So I tried using the example application and didn't reproduce the issue running the following command line:
gunicorn --env DJANGO_SETTINGS_MODULE=testing.settings -k gevent -w3 wsgi:application
Can anyone provides me a reproducible example?
I can still reproduce this issue invoking gunicorn like so:
gunicorn --env DJANGO_SETTINGS_MODULE=settings -w 3 --max-requests=1000 -b 0.0.0.0:8000 -k gevent --config ogs/gunicorn.py --pythonpath ogs wsgi:application
Is there any relevant details you need?
@brockhaywood should be project.settings
, but yes i need to have a way to reproduce it, do you have any minimal code to share.
Looking at the trace above, it could also be the monkey patching. Maybe threads shouldn't be monkey patched. Can you try the following patch?
--- a/gunicorn/workers/ggevent.py
+++ b/gunicorn/workers/ggevent.py
@@ -60,9 +60,9 @@ class GeventWorker(AsyncWorker):
# if the new version is used make sure to patch subprocess
if gevent.version_info[0] == 0:
- monkey.patch_all()
+ monkey.patch_all(thread=False)
else:
- monkey.patch_all(subprocess=True)
+ monkey.patch_all(subprocess=True, thread=False)
# monkey patch sendfile to make it none blocking
patch_sendfile()
Using the above patch with the following wsgi no longer produces the specified error:
import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "settings")
from gevent import monkey
monkey.patch_all(thread=False)
from django.core.wsgi import get_wsgi_application
application = get_wsgi_application()
from django.core.cache.backends.memcached import BaseMemcachedCache
BaseMemcachedCache.close = lambda self, **kwargs: None
I'm now seeing a different and new error but have no idea if it's related:
ProgrammingError: close cannot be used while an asynchronous query is underway
This is occurring, 1in 10/15 requests is doing a count on a model. I suspect this is an error in my application which is now uncovered.
@brockhaywood cool! btw ahat's your version of gevent? For your latest error it dies not seem related to gunicorn indeed.
Yah, good news! On version 1.0.1 of gevent.
@tilgovi so I wonder if we shouldn't patch gunicorn removing the thread monkey patching. Maybe we should also set wsgi.multithread
to False
? Thoughts?
@benoitc I'm able to reproduce the new error that I referenced above with a very simple Django application that has just one view and does a simple model query and count if I run multiple concurrent requests.
Do you have any suggestions as to which component is the most likely culprit? Is this likely a gevent problem or psycogreen?
More likely psycogreen. Are you using such hook to setup it?
def def_post_fork(server, worker):
from psycogreen.gevent import psyco_gevent
psyco_gevent.make_psycopg_green()
worker.log.info("Made Psycopg Green")
Oh, I'm actually using
def post_fork(server, worker):
from psycogreen.gevent import patch_psycopg
patch_psycopg()
I'll try your suggestion.
Your code is more accurate if I read the code of psycogreen:
Ok, I'll try opening a ticket with psycogreen
can you try to use the hook post_worker_init
instead of post_fork
?
Still occurs for me when patching psycogreen in post_worker_init.
I've submitted a ticket to psycogreen as well: https://bitbucket.org/dvarrazzo/psycogreen/issue/2/databaseerror-used-with-asynchronous-query
@brockhaywood OK :/ Also found this link: https://bitbucket.org/dvarrazzo/psycogreen-hg/issue/1/databaseerror-execute-used-with . I am not sure if it could help you though.
I did take a look at that and it seems like the same problem but I didn't see a resolution, unfortunately.
@brockhaywood can you share any simple example of code that could help me to reproduce the issue? I will have a look on it tomorrow.
Here the simple project that I'm using to reproduce: https://github.com/brockhaywood/gunicorn_gevent_issue
thanks!
On Sun, May 17, 2015 at 10:34 PM brockhaywood notifications@github.com wrote:
Here the simple project that I'm using to reproduce: https://github.com/brockhaywood/gunicorn_gevent_issue
— Reply to this email directly or view it on GitHub https://github.com/benoitc/gunicorn/issues/927#issuecomment-102852497.
@benoitc any thoughts on this?
@brockhaywood sorry i didn't have time to test it. Since I'm locked in a flight tomorrow, I will anyway :)
Thanks On 26 May 2015 1:17 pm, "Benoit Chesneau" notifications@github.com wrote:
@brockhaywood https://github.com/brockhaywood sorry i didn't have time to test it. Since I'm locked in a flight tomorrow, I will anyway :)
— Reply to this email directly or view it on GitHub https://github.com/benoitc/gunicorn/issues/927#issuecomment-105652947.
Was just experiencing the same DatabaseWrapper issue when I saw this thread.
I can confirm that removing the preload_app
option from my gunicorn config file fixes the problem. I'm using:
(jumping in since we've been seeing the same issues as described in this thread)
It looks like these issues are being caused by the fix for issue #478 (don't monkey patch the master).
By not monkey patching the master, app preload fundamentally creates a mismatch if you're using gevent workers. Any app code which runs as part of the preload runs in an environment that is not monkey patched. Any app code which runs in the worker runs in a monkey patched environment.
For us, just restoring the setup
method to the gevent
worker class as it's defined in 18.x, so that the master is monkey patched and it is monkey patched prior to app preload, fixed all our issues. This includes the DatabaseWrapper
bug, as well as another issue we were seeing where gevent workers would hang and timeout. No other changes were necessary (we do not have to disable monkey patching threads
, for example).
I would still be very hesitant to monkey-patch the master process, but if you want to do it you may be able to use the server hooks instead of patching gunicorn itself. If you can do that, I would welcome any way we can document that somewhere that's better than here.
I think the short term plan for us is to create an (internal) gunicorn fork with this patch applied.
I'll investigate moving a monkey patch step into our wsgi code, but this still seems like something which should be done inside the gunicorn code (even as a configurable option). Patching in preloaded app code will affect the master one way or another, and the whole point would be to keep the monkey patched state of the workers in sync with the master. Applying the master monkey patch in app code means that app developers who depend on this behavior will be always playing catchup to whatever changes gunicorn makes to when/how it monkey patches the worker.
If it works to patch early, maybe that should be considered as a change.
I can't remember if that was ever the case and if that caused problems, but it's always tricky with monkey patching.
Our own production gunicorn application requires preloading and so also requires monkey-patching in the master (we use Pyramid's/paste's pserve
script to launch the app and load gunicorn from an ini file; we have a wrapper around it to be sure that monkey-patching happens ASAP).
I'm a current gevent maintainer. I'm happy to look at any issues in gevent required to make this scenario work better. We're expecting to drop 1.1 soon-ish and it would be great if this was a supported option.
I tried applying the same monkey patch logic at the top of our wsgi implementation, and we started seeing the DatabaseWrapper
issue. Not entirely sure why though - looking at the wsgi invocation in arbiter.py, that happens a few lines after the point where GeventWorker.setup
will be called (as a side effect of the worker_class assignment. If I'm reading the code right.
I'll followup if I have a eureka or get lucky with regards to the app code, but so far for us, patching earlier in the gunicorn master code is the only fix that solves all our issues.
@jamadden thanks so much.
@jzupko it's usually important that the patching happens before any imports that might be affected. That's why I suggested to do it in a server hook, like by putting a post_fork
function in a gunicorn.conf.py
file. However, I just realized that the application preloading happens before any of the server hooks and it's not clear where one would do that.
@benoitc do we have any mechanism for worker-specific config?
I'm adding this to the release 20 milestone and suggesting that we consider monkey patching the master, maybe, if it can be done without re-introducing problems like #478.
@tilgovi you have the setup
method of the worker that can be used for that I think.
Does anyone have a temporary fix for this until 20.0 is out? Or is the only solution to choose between disabling preload_app
or downgrading to 18.0?
@fletom temporary fix for? Did you try the latest master? Did it works for you?
@benoitc I mean a fix for the "DatabaseWrapper objects" error.
Current gunicorn master doesn't work at all for me actually. On regular 19.6.0 I get this (normal):
[2017-01-05 15:46:32 -0500] [3222] [INFO] Starting gunicorn 19.6.0
[2017-01-05 15:46:32 -0500] [3222] [INFO] Listening at: http://127.0.0.1:8000 (3222)
[2017-01-05 15:46:32 -0500] [3222] [INFO] Using worker: gevent
[2017-01-05 15:46:32 -0500] [3226] [INFO] Booting worker with pid: 3226
[2017-01-05 15:46:32 -0500] [3227] [INFO] Booting worker with pid: 3227
[2017-01-05 15:46:32 -0500] [3228] [INFO] Booting worker with pid: 3228
[2017-01-05 15:46:32 -0500] [3229] [INFO] Booting worker with pid: 3229
And on latest master with the 100% exact same configuration I get this, and nothing listening on any port as far as I can tell:
[2017-01-05 15:47:19 -0500] [3308] [INFO] Starting gunicorn 19.6.0
[2017-01-05 15:47:19 -0500] [3308] [INFO] Listening at: (3308)
[2017-01-05 15:47:19 -0500] [3308] [INFO] Using worker: gevent
[2017-01-05 15:47:19 -0500] [3312] [INFO] Booting worker with pid: 3312
[2017-01-05 15:47:19 -0500] [3313] [INFO] Booting worker with pid: 3313
[2017-01-05 15:47:19 -0500] [3314] [INFO] Booting worker with pid: 3314
[2017-01-05 15:47:19 -0500] [3315] [INFO] Booting worker with pid: 3315
Seems bugged?
@fletom how do you launch gunicorn?
@benoitc In this case, it's gunicorn -c gunicorn_conf.py <appname>.wsgi
.
My gunicorn_conf.py
is:
workers = 4
worker_class = 'gevent'
preload_app = False
Hello,
I'm seeing this error with gunicorn
19.x
(tested with19.0
and19.1.1
) using thegevent
worker. I'm usingdjango-1.7.1
.this is the exception:
Here is my
gunicorn.conf.py
:The error occurs every time, just accessing any URL of the django app (that depends on any model, of course) triggers this error. Any ideas? Since changing to
gunicorn-18.0
solves the problem, I'm assuming thatgunicorn-19.x
is doing something different.If you need any additional information, let me know and I will post it here.
For now I'm using
18.0
.Thanks a lot.