Open GoogleCodeExporter opened 8 years ago
FWIW, mod_wsgi doesn't do anything about '100-continue'. Whatever behaviour one
sees is a result of how the lower levels of Apache implement it, and possibly at
handler level one doesn't have much control over it.
My understanding up to now was that Apache doesn't send the '100-continue'
until the first attempt by the handler to read data. Problem is that no browser
implements '100-continue' how it is meant to and just sends any content
immediately after the headers anyway.
So, as far as I know Apache does it correctly now.
Original comment by Graham.Dumpleton@gmail.com
on 15 Jan 2008 at 11:07
As followup, mod_wsgi only really honours 100-continue properly when run in
embedded mode. When in daemon mode the content is always sent across to the
daemon even before the WSGI application gets a chance to process the request.
As such, the 100 status response is sent by Apache back to the client before
wsgi.input in the WSGI application running in daemon mode has been used.
Will bring this whole issue up on mailing list, as my knowledge on 100-continue
details isn't excellent and nothing I have found yet on the net explains some
things
I want to know. :-(
Original comment by Graham.Dumpleton@gmail.com
on 29 Jan 2008 at 4:44
Actually, in terms of what the WSGI specification says:
"""
Servers and gateways that implement HTTP 1.1 must provide transparent support
for
HTTP 1.1's "expect/continue" mechanism. This may be done in any of several ways:
1. Respond to requests containing an Expect: 100-continue request with an
immediate "100 Continue" response, and proceed normally.
2. Proceed with the request normally, but provide the application with a
wsgi.input stream that will send the "100 Continue" response if/when the
application
first attempts to read from the input stream. The read request must then remain
blocked until the client responds.
3. Wait until the client decides that the server does not support expect/continue,
and sends the request body on its own. (This is suboptimal, and is not
recommended.)
"""
When using embedded mode it does 2. When using daemon mode it effectively does
1.
To my mind doing 2 is the best thing as can avoid the need to send content at
all.
Thus need to get daemon mode doing 2 as well.
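As a rough sketch of what option 2 amounts to (this is not mod_wsgi's code,
which is C; the class name and the send_continue callable are hypothetical
stand-ins for whatever the server uses to write the interim response):

```python
import io


class ContinueOnReadInput:
    """Hypothetical wsgi.input wrapper that defers "100 Continue" until first read."""

    def __init__(self, stream, send_continue, expecting_100):
        self._stream = stream                # underlying request body stream
        self._send_continue = send_continue  # assumed callable that writes the interim response
        self._pending = expecting_100        # True while a "100 Continue" is still owed

    def _ensure_continue(self):
        # Send the interim response once, on the first read attempt only.
        if self._pending:
            self._pending = False
            self._send_continue()

    def read(self, size=-1):
        self._ensure_continue()
        return self._stream.read(size)

    def readline(self, size=-1):
        self._ensure_continue()
        return self._stream.readline(size)


sent = []
body = ContinueOnReadInput(io.BytesIO(b"A=B"), lambda: sent.append("100 Continue"), True)
assert sent == []   # nothing is sent until the application reads
data = body.read()
```

If the application never reads, nothing is sent, which is the case where the
client can avoid sending content at all.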
Original comment by Graham.Dumpleton@gmail.com
on 29 Jan 2008 at 4:58
Hmmm, more digging. There are problems with 100-continue with how mod_wsgi
avoids buffering output to enforce the WSGI requirement to flush after each
value yielded by the iterable.
With the example:
def application(environ, start_response):
    length = int(environ.get('CONTENT_LENGTH', '0'))
    prefix = str(environ) + '\n'
    status = '200 OK'
    response_headers = [('Content-Type', 'text/plain'),
                        ('Content-Length', str(length+len(prefix)))]
    start_response(status, response_headers)
    yield prefix
    block = min(128, length)
    output = environ['wsgi.input'].read(block)
    length -= block
    while output:
        yield output
        output = environ['wsgi.input'].read(block)
        length -= block
In this example it deliberately yields a value before making first attempt to
use
wsgi.input. The point of this was to test whether the remote client would still
send
content where final response headers were received before 100 status was
returned and
actual response status was 200.
What happens is that the 100 status response which is generated by Apache output
filters (and not mod_wsgi), gets inserted into output stream. Ie., for:
grahamd$ curl -vF A=B http://localhost:8224/wsgi/scripts/stream.py
one sees:
* About to connect() to localhost port 8224
* Trying ::1... * connected
* Connected to localhost (::1) port 8224
> POST /wsgi/scripts/stream.py HTTP/1.1
User-Agent: curl/7.13.1 (powerpc-apple-darwin8.0) libcurl/7.13.1 OpenSSL/0.9.7i
zlib/1.2.3
Host: localhost:8224
Pragma: no-cache
Accept: */*
Content-Length: 137
Expect: 100-continue
Content-Type: multipart/form-data;
boundary=----------------------------607fea63a6ca
< HTTP/1.1 200 OK
< Date: Tue, 29 Jan 2008 05:32:16 GMT
< Server: Apache/2.2.4 (Unix) mod_wsgi/2.0c4 Python/2.3.5
< Content-Length: 1799
< Content-Type: text/plain
{'mod_wsgi.reload_mechanism': '0', 'mod_wsgi.listener_port': '8224',
'SERVER_SOFTWARE': 'Apache/2.2.4 (Unix) mod_wsgi/2.0c4 Python/2.3.5',
'SCRIPT_NAME':
'/wsgi/scripts/stream.py', 'mod_wsgi.handler_script': '', 'SERVER_SIGNATURE':
'<address>Apache/2.2.4 (Unix) mod_wsgi/2.0c4 Python/2.3.5 Server at localhost
Port
8224</address>\n', 'REQUEST_METHOD': 'POST', 'PATH_INFO': '', 'SERVER_PROTOCOL':
'HTTP/1.1', 'QUERY_STRING': '', 'PATH':
'/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/ose/bin:/usr/local/bin:/Users/grahamd/
bin',
'CONTENT_LENGTH': '137', 'HTTP_USER_AGENT': 'curl/7.13.1
(powerpc-apple-darwin8.0)
libcurl/7.13.1 OpenSSL/0.9.7i zlib/1.2.3', 'SERVER_NAME': 'localhost',
'REMOTE_ADDR':
'::1', 'wsgi.url_scheme': 'http', 'mod_wsgi.output_buffering': '0',
'mod_wsgi.callable_object': 'application', 'SERVER_PORT': '8224',
'wsgi.multiprocess': True, 'SERVER_ADDR': '::1', 'DOCUMENT_ROOT':
'/usr/local/apache-2.2.4/htdocs', 'mod_wsgi.process_group': '', 'HTTP_PRAGMA':
'no-cache', 'SCRIPT_FILENAME': '/usr/local/wsgi/scripts/stream.py',
'SERVER_ADMIN':
'you@example.com', 'wsgi.input': <mod_wsgi.Input object at 0x4de480>,
'HTTP_HOST':
'localhost:8224', 'wsgi.multithread': True, 'HTTP_EXPECT': '100-continue',
'REQUEST_URI': '/wsgi/scripts/stream.py', 'HTTP_ACCEPT': '*/*', 'wsgi.version':
(1,
0), 'GATEWAY_INTERFACE': 'CGI/1.1', 'wsgi.run_once': False, 'wsgi.errors':
<mod_wsgi.Log object at 0x489170>, 'REMOTE_PORT': '56725',
'mod_wsgi.listener_host':
'', 'CONTENT_TYPE': 'multipart/form-data;
boundary=----------------------------607fea63a6ca',
'mod_wsgi.application_group':
'kundalini.local:8224|/wsgi/scripts/stream.py', 'mod_wsgi.script_reloading':
'1'}
HTTP/1.1 100 Continue
------------------------------607fea63a6ca
Content-Disposition: form-data; name="A"
B
* Connection #0 to host localhost left intact
* Closing connection #0
Note the presence of 'HTTP/1.1 100 Continue' intermixed in response. Apache
should be
realising that headers have been sent and not generating this. Not sure if
Apache is
wrong or whether how mod_wsgi uses it is wrong. If one turns on
WSGIOutputBuffering
in mod_wsgi the problem doesn't exist.
Original comment by Graham.Dumpleton@gmail.com
on 29 Jan 2008 at 5:35
mod_wsgi has to send the response headers once the application yields its first
non-empty string. When you send the headers before calling
ap_should_client_block, the meaning is "do not continue", that is, "do not send
a request body". If the client sends a request body anyway, then the server
should ignore it, according to
[http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.2.3 HTTP 1.1
section 8.2.3].
Effectively, the way WSGI is defined, an application must call
environ["wsgi.input"].read() or .readline() at least once before yielding an
iterable, if it wants to read the input at all when an Expect: 100-continue is
provided. If there is a "100-continue" in the "Expect" header, and the
application
yields a non-empty string before reading from wsgi.input, mod_wsgi should
(seemingly
must) disable wsgi.input, preferably by raising an exception whenever the user
tries
to read from it.
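A minimal sketch of that idea, assuming a hypothetical gateway that calls
headers_flushed() when the first non-empty string is yielded (the class and
method names are made up for illustration and are not part of mod_wsgi):

```python
import io


class GuardedInput:
    """Hypothetical wsgi.input wrapper; not mod_wsgi's actual implementation."""

    def __init__(self, stream, expecting_100):
        self._stream = stream
        self._expecting_100 = expecting_100
        self._read_attempted = False
        self._disabled = False

    def headers_flushed(self):
        # Assumed hook: the gateway calls this when response headers go out.
        # If the client is still waiting for "100 Continue", input is closed off.
        if self._expecting_100 and not self._read_attempted:
            self._disabled = True

    def read(self, size=-1):
        if self._disabled:
            raise IOError("response started before request input was read")
        self._read_attempted = True
        return self._stream.read(size)


guarded = GuardedInput(io.BytesIO(b"A=B"), expecting_100=True)
guarded.headers_flushed()
try:
    guarded.read()
    outcome = "read succeeded"
except IOError:
    outcome = "read refused"
```

An application that reads before yielding is unaffected, since the flag is only
set when no read was attempted first.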
I will bring it up on Web-SIG.
Original comment by brian@briansmith.org
on 29 Jan 2008 at 6:17
Issue in comment 4 about 100 Continue being returned in response was fixed for
2.0, but still nothing done about comment 5 and whether to force generate 100
Continue before headers if no attempt made to read input before response is
generated, this being what was discussed on Web-SIG.
Interestingly, if using Apache 2.2.6, Apache seems to do this automatically.
Ie.,
curl -v -F xxx=yyy http://127.0.0.1/~grahamd/echo.wsgi
* About to connect() to 127.0.0.1 port 80 (#0)
* Trying 127.0.0.1... connected
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> POST /~grahamd/echo.wsgi HTTP/1.1
> User-Agent: curl/7.16.3 (powerpc-apple-darwin9.0) libcurl/7.16.3
OpenSSL/0.9.7l zlib/1.2.3
> Host: 127.0.0.1
> Accept: */*
> Content-Length: 141
> Expect: 100-continue
> Content-Type: multipart/form-data;
boundary=----------------------------6ff45c500399
>
< HTTP/1.1 100 Continue
< HTTP/1.1 200 OK
< Date: Mon, 18 Feb 2008 09:22:29 GMT
< Server: Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.7l DAV/2
mod_wsgi/2.0c5-TRUNK Python/2.5.1
< Content-Length: 1695
< Content-Type: text/plain
<
< .....
Where echo.wsgi was:
import StringIO

def application(environ, start_response):
    headers = []
    headers.append(('Content-type', 'text/plain'))
    print >> environ['wsgi.errors'], environ
    #environ['wsgi.input'].read(0)
    start_response('200 OK', headers)
    input = environ['wsgi.input']
    output = StringIO.StringIO()
    keys = environ.keys()
    keys.sort()
    for key in keys:
        print >> output, '%s: %s' % (key, repr(environ[key]))
    print >> output
    length = int(environ.get('CONTENT_LENGTH', '0'))
    output.write(input.read(length))
    return [output.getvalue()]
Haven't found in the code yet for Apache 2.2.6 where it is doing this and how
code is different to Apache 2.2.4 where it doesn't do it. Other possibility is
my
Apache 2.2.4 configuration is somehow different. Doesn't do it for Apache 1.3
either though, so maybe Apache was changed.
Original comment by Graham.Dumpleton@gmail.com
on 18 Feb 2008 at 9:28
Original comment by Graham.Dumpleton@gmail.com
on 18 Feb 2008 at 9:33
Graham, your test application doesn't do what you think it does; it needs to
yield a
non-empty string to flush the headers before attempting to read from
wsgi.input. If
Apache had already sent the headers before you read from wsgi.input then it
would not
be able to set the Content-Length header in the response.
Original comment by brianlsm...@gmail.com
on 19 Feb 2008 at 4:49
See also http://issues.apache.org/bugzilla/show_bug.cgi?id=38014. The handling
of
"100 Continue" has improved in very recent versions; in particular, the fixed
versions of httpd will no longer incorrectly send a "100 Continue" if headers
have
already been sent.
Original comment by brianlsm...@gmail.com
on 19 Feb 2008 at 4:52
Okay, that would help. I thought I was going mad. Don't understand then why I
was seeing Apache 2.2.4 do something different. Only thing I can think of is
that curl on the older MacOS X version where I am running Apache 2.2.4 does
something different. That or my test program was different, given they were on
different boxes. Anyway, know now to go back and do the test from scratch.
Are we still more or less agreed though, as per Web-SIG discussions as I
understood
it, that for 2xx and 3xx responses should force the 100 continue response if no
input read before start_response(). Presume that it would be forced only at
time that
headers are flushed. Ie., first call to write() or first non empty value from
iterable?
Original comment by Graham.Dumpleton@gmail.com
on 19 Feb 2008 at 4:56
As to:
Index: modules/http/http_filters.c
===================================================================
--- modules/http/http_filters.c (revision 512953)
+++ modules/http/http_filters.c (working copy)
@@ -185,7 +185,8 @@
* Only valid on chunked and C-L bodies where the C-L is > 0. */
if ((ctx->state == BODY_CHUNK ||
(ctx->state == BODY_LENGTH && ctx->remaining > 0)) &&
- f->r->expecting_100 && f->r->proto_num >= HTTP_VERSION(1,1)) {
+ f->r->expecting_100 && f->r->proto_num >= HTTP_VERSION(1,1) &&
+ !(f->r->eos_sent || f->r->bytes_sent)) {
char *tmp;
apr_bucket_brigade *bb;
That would indeed fix Apache.
The question is, will my simply setting expecting_100 to false as a way of
making it
work properly for older versions of Apache as well cause issues with output
filters.
Frankly can't think of any reason why any other output filter would at that
point be
interested in expecting_100.
Original comment by Graham.Dumpleton@gmail.com
on 19 Feb 2008 at 5:00
I would send a 100 continue for any status code that isn't 4xx or 5xx, except I
would
send a 500 Internal Server error if the status code is 100. That is basically
what
you said, except that it allows for codes > 599 and less than 200.
Original comment by brianlsm...@gmail.com
on 19 Feb 2008 at 5:01
I think it would be better to find out which released version of Apache
contains the
fix above, and then only do the workaround for versions before that.
Original comment by brianlsm...@gmail.com
on 19 Feb 2008 at 5:05
Send a 500 error just for 100, or for any 1xx values? Can a WSGI application
validly generate any 1xx status values?
Original comment by Graham.Dumpleton@gmail.com
on 19 Feb 2008 at 5:14
Other 1xx codes are reasonable (WebDAV's "102 Processing" might even be useful
for
quite a few WSGI applications), but there is no way for a WSGI application to
generate the next status code (no way to send "102 Processing" and then "200
OK"), so
I guess it is reasonable to prevent the WSGI application from sending them.
This is
an issue that should be brought up on Web-SIG.
Original comment by brianlsm...@gmail.com
on 19 Feb 2008 at 5:22
The question in my mind at this point though is whether it really should be
generating this 100 continue if no content read before headers sent.
The problem as I see it is that even if mod_wsgi does this, other WSGI hosting
mechanisms don't, so no one would be able to rely on it. To make their
application portable they would have to try a zero length read of wsgi.input
anyway, and hope the WSGI hosting solution doesn't optimise away the zero length
read and thus not pass it down to the lower layers.
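For illustration, the portable zero length read would look something like the
following. RecordingInput is a made-up stand-in for wsgi.input, used here only
to show the order of calls; whether a real server acts on read(0) is exactly
the open question:

```python
class RecordingInput:
    """Made-up stand-in for wsgi.input that logs each read size."""

    def __init__(self, data, events):
        self._data = data
        self._events = events

    def read(self, size):
        self._events.append(('read', size))
        chunk, self._data = self._data[:size], self._data[size:]
        return chunk


def application(environ, start_response):
    # Zero length read before the response starts. On a server that passes
    # it down to the lower layers, this is what would trigger "100 Continue".
    environ['wsgi.input'].read(0)
    start_response('200 OK', [('Content-Type', 'text/plain')])
    length = int(environ.get('CONTENT_LENGTH') or 0)
    return [environ['wsgi.input'].read(length)]


events = []
environ = {'CONTENT_LENGTH': '3', 'wsgi.input': RecordingInput(b'A=B', events)}
result = application(environ, lambda status, headers: events.append('start_response'))
```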
Is there any evidence of any other web framework system, be it Python or some
other
language, taking this stance and automatically generating a 100 continue even
if no
content read before data sent in response?
Original comment by Graham.Dumpleton@gmail.com
on 20 Feb 2008 at 11:42
I would be surprised if you could find a WSGI gateway that *doesn't* send "100
continue" when working behind Apache. mod_cgi, mod_fastcgi, mod_fcgi,
mod_proxy_*,
and even most versions of mod_wsgi will always send a "100 continue" because
they
read the request body unconditionally before the application even has a chance
to
send any headers.
What you would actually be doing is providing a useful optimization where you
*don't*
send an unnecessary "100 continue" for 4xx and 5xx responses, instead of
sending it
every time. Keep in mind that, technically, you can *always* send a
100-continue,
even without a "Expect: 100-continue" in the request. 1xx responses are always
allowed.
Original comment by brian@briansmith.org
on 20 Feb 2008 at 11:58
Paste server appears to try to implement:
"""2. Proceed with the request normally, but provide the application with a
wsgi.input stream that will send the "100 Continue" response if/when the
application
first attempts to read from the input stream. The read request must then remain
blocked until the client responds."""
I agree that all the others that I have seen do:
""" 1. Respond to requests containing an Expect: 100-continue request with an
immediate "100 Continue" response, and proceed normally."""
Original comment by Graham.Dumpleton@gmail.com
on 21 Feb 2008 at 12:06
If you proxy Paste Server behind Apache, the 100 continue gets sent
immediately, as
far as I remember.
Original comment by brian@briansmith.org
on 21 Feb 2008 at 12:37
Yes, if mod_proxy is used that is always true no matter what the back end does.
And since Pylons people would most likely say that running Pylons with mod_wsgi,
rather than behind mod_proxy, is evil and therefore forbidden, then one could
say
that that is the default behaviour for Pylons. ;-)
Original comment by Graham.Dumpleton@gmail.com
on 21 Feb 2008 at 12:44
Am thinking that, to make this easier to implement, I remove support for
WSGIOutputBuffering. If people want buffering they should do it themselves in
the WSGI application anyway.
Original comment by Graham.Dumpleton@gmail.com
on 21 Feb 2008 at 1:01
I think that is a good idea. I already had to disable support for
WSGIOutputBuffering
a long time ago in my modified version, in order to support the
file-descriptor-passing file_wrapper.
Original comment by brian@briansmith.org
on 21 Feb 2008 at 1:05
I didn't strictly need to get rid of output buffering as in the end wouldn't
have
caused a problem, but have got rid of it anyway. Have updated code to flush '100
Continue' response before headers if required, but just check for 2xx and 3xx
rather
than doing inverse of 1xx, 4xx and 5xx. If HTTP changes in the future then will
worry
about additional status response ranges then, but probably highly unlikely that
will
happen.
Remaining point on this issue is improving on how data is sent across to daemon
process so that can implement 100-continue across that gap and skip sending
content
to daemon when not required. Although this is an enhancement rather than bug,
have
left this flagged as bug report for time being. Have flagged the issue now as
being
for mod_wsgi version 3.0.
Original comment by Graham.Dumpleton@gmail.com
on 22 Feb 2008 at 3:20
That doesn't work if an application is using its own status codes >=600, but I
guess
anybody that does will be getting what they deserve.
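To make the difference concrete, the two checks discussed in this thread can be
sketched as below. The function names are invented for illustration, and the
special case of an application returning status 100 is left out:

```python
def force_for_2xx_3xx(status):
    """Force "100 Continue" only for 2xx and 3xx final responses."""
    code = int(status.split()[0])
    return 200 <= code <= 399


def force_unless_4xx_5xx(status):
    """Force "100 Continue" for anything that is not a 4xx or 5xx response."""
    code = int(status.split()[0])
    return not (400 <= code <= 599)


# The checks agree on ordinary responses and differ for codes >= 600.
agree_ok = force_for_2xx_3xx('200 OK') and force_unless_4xx_5xx('200 OK')
agree_error = not force_for_2xx_3xx('404 Not Found') and not force_unless_4xx_5xx('404 Not Found')
differ = force_for_2xx_3xx('600 Custom') != force_unless_4xx_5xx('600 Custom')
```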
Original comment by brian@briansmith.org
on 22 Feb 2008 at 3:59
Now not likely to be fully addressed in version 3.0.
Original comment by Graham.Dumpleton@gmail.com
on 11 Apr 2008 at 5:26
As discussed in:
http://groups.google.com/group/modwsgi/browse_frm/thread/815fd4da49951e72
the workaround to trigger a zero length read before generating response headers
to force the '100 Continue' header to be sent,
causes an assertion failure when Apache is compiled in maintainer mode.
To limit where this can occur and because '100 Continue' bug fixed in Apache
2.2.7 (2.2.8 really as 2.2.7 never an official release),
now only force the zero length read for buggy versions of Apache.
This change was done at revision 1090 in trunk for mod_wsgi 3.0.
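As a sketch only of the version gating idea: mod_wsgi does this in C, and
parsing a SERVER_SOFTWARE style string as below is an assumption made purely
for illustration, as are the function names.

```python
import re

# Per the comment above, the '100 Continue' bug was fixed in Apache 2.2.7.
FIXED_IN = (2, 2, 7)


def apache_version(server_software):
    """Extract (major, minor, patch) from a string like 'Apache/2.2.4 (Unix)'."""
    match = re.search(r'Apache/(\d+)\.(\d+)\.(\d+)', server_software)
    return tuple(int(part) for part in match.groups()) if match else None


def needs_forced_read(server_software):
    """True only when the version is known and predates the fix."""
    version = apache_version(server_software)
    return version is not None and version < FIXED_IN


old = needs_forced_read('Apache/2.2.4 (Unix) mod_wsgi/2.0c4 Python/2.3.5')
new = needs_forced_read('Apache/2.2.8 (Unix)')
```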
Note that a WSGI application could itself still perform a zero length read and
cause the assertion failure as don't yet ignore a zero
length read. Still looking at whether should ignore a zero length read and
whether it is reasonable that a WSGI application could
expect to do a zero length read and see a '100 Continue' response be generated
when no non zero length read already done. Could
perhaps limit this and ignore zero length read if a non zero length read
already done, as certainly no point in that case.
Original comment by Graham.Dumpleton@gmail.com
on 14 Oct 2008 at 11:36
Now also only allow zero length read to propagate down to input filter stack if
that zero length read is first read
from input. Change made in revision 1093.
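The rule can be sketched roughly as follows, with lower_read as a hypothetical
callable standing in for the Apache input filter stack (the real change is in
mod_wsgi's C code):

```python
class InputAdapter:
    """Sketch: pass a zero length read down only if it is the first read."""

    def __init__(self, lower_read):
        self._lower_read = lower_read  # hypothetical stand-in for the input filter stack
        self._read_done = False

    def read(self, size):
        if size == 0 and self._read_done:
            # A read has already happened, so there is nothing to gain.
            return b""
        self._read_done = True
        return self._lower_read(size)


calls = []
adapter = InputAdapter(lambda size: calls.append(size) or b"x" * size)
adapter.read(0)   # first read, propagates
adapter.read(0)   # ignored
adapter.read(2)   # normal read, propagates
adapter.read(0)   # ignored
```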
Original comment by Graham.Dumpleton@gmail.com
on 20 Oct 2008 at 7:05
Original issue reported on code.google.com by
brianlsm...@gmail.com
on 15 Jan 2008 at 8:24