Temporary deadlock on daemon socket connection.

GoogleCodeExporter commented 8 years ago

When using mod_wsgi daemon mode, a deadlock can occur if all of the following
hold: a POST request is received whose content is larger than the UNIX socket
buffer size for the host being used, the application doesn't consume the
request content before sending a response, and the response headers plus
response content are themselves larger than the UNIX socket buffer size.

This is because the Apache child process side is still trying to write the
request content and is therefore not in a position to read the response headers
and response content and unblock the daemon side trying to write the response.
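
To make the mechanism concrete, here is a minimal sketch that reproduces the same stall with a plain socket pair. It is not mod_wsgi code: the names `apache_side` and `daemon_side` are only labels for the two ends, and non-blocking sockets are used so the script reports "would block" instead of actually hanging.

```python
import socket

CHUNK = b"x" * 4096


def fill_until_blocked(sock):
    """Send on a non-blocking socket until the kernel buffer is full.

    Returns the number of bytes accepted before a send would have blocked.
    """
    sent = 0
    while True:
        try:
            sent += sock.send(CHUNK)
        except BlockingIOError:
            return sent


def would_block(sock):
    """Return True if another send of CHUNK would block right now."""
    try:
        sock.send(CHUNK)
    except BlockingIOError:
        return True
    return False


def drain(sock):
    """Read everything currently queued for this socket; return byte count."""
    total = 0
    while True:
        try:
            data = sock.recv(65536)
        except BlockingIOError:
            return total
        if not data:
            return total
        total += len(data)


# Hypothetical stand-ins for the Apache child and the daemon process.
apache_side, daemon_side = socket.socketpair()
apache_side.setblocking(False)
daemon_side.setblocking(False)

# Apache child keeps writing request content that nobody reads.
request_bytes_queued = fill_until_blocked(apache_side)
# Daemon writes a response without having consumed the request.
response_bytes_queued = fill_until_blocked(daemon_side)

# Both ends now want to write and neither is reading: with blocking
# sockets this is the point where both processes would hang.
both_stuck = would_block(apache_side) and would_block(daemon_side)

# As soon as one end reads, the other end can make progress again.
drained = drain(apache_side)
daemon_unblocked = not would_block(daemon_side)

print("request bytes queued:", request_bytes_queued)
print("response bytes queued:", response_bytes_queued)
print("both stuck:", both_stuck, "| daemon unblocked after read:", daemon_unblocked)

apache_side.close()
daemon_side.close()
```

The byte counts printed depend on the socket buffer sizes of the host, which is why the problem shows up at very different request and response sizes on different systems.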

For the typical well behaved application this doesn't present a problem. The
sort of use case where it would occur is one that streams the request content
back as the response content, possibly with modifications. The only situation
where a problem has actually been seen in practice is where SPAM bots perform
POST requests with large amounts of content against arbitrary URLs. Thus it is
the SPAM bots that are inadvertently triggering the problem, not normal
application usage.

The actual deadlock is not permanent and will end when the timeout specified by
the Apache Timeout directive has elapsed, typically 300 seconds.

Note that the severity of the problem is dictated by the size of the UNIX
socket buffers. On some UNIX systems this is as low as 8KB (MacOS X). On Linux
systems it is a lot higher. In mod_wsgi 2.0c5 an option will exist for
WSGIDaemonProcess to allow the send and recv buffer sizes for the UNIX socket
to be increased, so where a system has a low default value, it can be raised to
lessen the risk of the problem occurring.
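
To check what the defaults are on a given host, a rough sketch like the following can be used. It only inspects and adjusts the standard `SO_SNDBUF` and `SO_RCVBUF` socket options on a throwaway socket pair; that the WSGIDaemonProcess option ends up setting these same socket options is an assumption here, and the doubled size requested below is an arbitrary illustrative value.

```python
import socket

# Throwaway UNIX socket pair, used only to inspect kernel defaults.
a, b = socket.socketpair()

default_send = a.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
default_recv = a.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

# Ask for larger buffers (illustrative value). The kernel may round,
# double, or cap the request, so read the result back rather than
# assuming the requested size was applied.
requested = default_send * 2
a.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
a.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)

raised_send = a.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
raised_recv = a.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

print("default send/recv:", default_send, default_recv)
print("after request for", requested, "->", raised_send, raised_recv)

a.close()
b.close()
```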

Totally eliminating this issue means reengineering the protocol used between
the Apache child process and the daemon process as a packet based mechanism
with flow control, so that the daemon can indicate when it is willing to accept
more request content. How this may be done has been discussed extensively on
the mod_wsgi mailing list.

BTW, this deadlock issue also exists with mod_cgi, mod_cgid and mod_scgi, as
these all use a similar system based around a UNIX socket or interprocess pipe.
Conceptually a deadlock situation may be triggered with mod_proxy as well, but
in that case INET sockets are used and the buffer sizes on these are much
larger. Still need to determine what the critical sizes are for mod_proxy to
deadlock, if it can suffer this problem at all. It is possible that for
mod_proxy some other mechanism in Apache prevents it, but I haven't been able
to determine what that is as yet, if it does exist.

Original issue reported on code.google.com by Graham.Dumpleton@gmail.com on 18 Feb 2008 at 9:58

GoogleCodeExporter commented 8 years ago

Original comment by Graham.Dumpleton@gmail.com on 18 Feb 2008 at 9:58

Added labels: Milestone-Release3.0

GoogleCodeExporter commented 8 years ago

The request doesn't have to be a POST; any method (including GET) will cause
problems if there is a (large) request body.

Also, this problem isn't limited to applications that are not "well behaved."
In fact, the more well-behaved the application is (checking request headers for
validity), the more likely it is to run into the problem. Any request that
would benefit from the "100-continue" optimization would run into this problem
if the request body is big enough. My application runs into this problem
already, and it will only get worse as I add more features that require me to
validate requests based on the headers.

- Brian

Original comment by brian@briansmith.org on 21 Feb 2008 at 12:05

GoogleCodeExporter commented 8 years ago

Now not likely to be addressed in version 3.0.

Original comment by Graham.Dumpleton@gmail.com on 11 Apr 2008 at 5:26

Removed labels: Milestone-Release3.0

GoogleCodeExporter commented 8 years ago

Note that this is not specific to mod_wsgi. WSGI applications running under
IIS+CGI face the same issue. (See 'buffer bug' at
http://www.doxdesk.com/updates/2006.html#u20060416-cgi .)

WSGI apps should always make sure to read the input stream (via
cgi.FieldStorage() or whatever other method their framework supplies), even if
they are only intended to be called through GET methods.
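
As a rough sketch of what "always read the input stream" could look like in a bare WSGI application, assuming the server supplies a usable `CONTENT_LENGTH` (the helper name `drain_request_body` and the 405 response are made up for illustration):

```python
import io


def drain_request_body(environ, chunk_size=8192):
    """Read and discard any unread request body; return bytes discarded."""
    try:
        remaining = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        remaining = 0
    stream = environ["wsgi.input"]
    discarded = 0
    while remaining > 0:
        data = stream.read(min(chunk_size, remaining))
        if not data:
            # Client sent less than it declared; nothing more to read.
            break
        remaining -= len(data)
        discarded += len(data)
    return discarded


def application(environ, start_response):
    # Consume the request content before producing any response, even
    # though this handler has no use for it.
    drain_request_body(environ)
    body = b"Method not allowed\n"
    start_response(
        "405 Method Not Allowed",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


# Usage: simulate a 20000-byte POST against the handler.
fake_environ = {
    "REQUEST_METHOD": "POST",
    "CONTENT_LENGTH": "20000",
    "wsgi.input": io.BytesIO(b"x" * 20000),
}
statuses = []
result = application(fake_environ, lambda status, headers: statuses.append(status))
leftover = fake_environ["wsgi.input"].read()
```

Note the trade-off: draining means the application spends time reading content it will throw away, which matters if the senders are bots posting very large bodies.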

Original comment by bobi...@gmail.com on 5 Nov 2008 at 2:08

GoogleCodeExporter commented 8 years ago

Interesting link about IIS+CGI. Thanks.

As to WSGI applications ensuring they read all input, they nearly always don't
in the case where they weren't expecting to do anything with the input. I can't
see that changing.

Original comment by Graham.Dumpleton@gmail.com on 5 Nov 2008 at 9:49

GoogleCodeExporter commented 8 years ago

Isn't this actually a denial of service issue, since you'll have deadlocked
workers for as long as your Timeout lasts?

Original comment by ionel...@gmail.com on 27 Nov 2008 at 6:07

GoogleCodeExporter commented 8 years ago

It can be, if the WSGI application doesn't consume the request content and
generates a large response at the same time. Most UNIX systems have large UNIX
socket buffer sizes though, so it would have to be a reasonably big response.
If WSGI applications are correctly rejecting POST requests against URLs which
aren't expecting them and returning error response pages as they should, then
it wouldn't generally be an issue.

This same issue crops up in mod_cgi, mod_cgid and mod_scgi. From analysis of
the code, it technically looks like it may also occur for some mod_proxy and
mod_fastcgi configurations, but since INET sockets are used and buffers are
usually somewhat larger, it would take even larger amounts of data in each
direction.

Recent discussion at:

  http://groups.google.com/group/python-web-sig/browse_frm/thread/fdd318a722383792

talked a bit about this.

Original comment by Graham.Dumpleton@gmail.com on 27 Nov 2008 at 6:27

GoogleCodeExporter commented 8 years ago

Hello,
I know this thread is a bit dated, but I am having a similar issue.
It seems that Apache is locking up on POST requests, not every time but usually.

I believe my configuration is typical:
Apache v2.2 Worker 64bit threaded
Python v2.6.5
Mod_WSGI v3.2 and v3.3 (They both have locking issues)

I have installed Mod_WSGI 2.8, for now, and it seems to be working fine.
However, I fear that some issues may arise when using some newer Python
packages in the future, due to the deprecation of Mod_WSGI v2.8.

Is this issue being looked at for future releases of Mod_WSGI?

Thanks!

Original comment by hayesjor...@gmail.com on 12 Aug 2010 at 5:50

GoogleCodeExporter commented 8 years ago

Oh, I forgot to mention...
Centos 5.4

Original comment by hayesjor...@gmail.com on 12 Aug 2010 at 5:53

GoogleCodeExporter commented 8 years ago

@hayesjordan: If this is a POST request against a handler which is expecting
POST data, then it is unlikely to be this issue. This issue would also be quite
hard to trigger on Linux and is generally only an issue on MacOS X. Even then
it can be avoided by using mod_wsgi configuration directives to increase buffer
sizes.

I suggest you use the mod_wsgi mailing list and describe your problem properly,
indicating what WSGI application stack you are using and what third party
Python packages. Also post what your mod_wsgi configuration is, and set
LogLevel in the Apache configuration to at least 'info' if not 'debug' so that
more messages are output to the Apache error log. Post all the messages logged
around the time the issue occurs.

In general, also read:

http://code.google.com/p/modwsgi/wiki/WhereToGetHelp?tm=6

Original comment by Graham.Dumpleton@gmail.com on 13 Aug 2010 at 3:01

Temporary deadlock on daemon socket connection. #56