Open GoogleCodeExporter opened 8 years ago
I did quite some tests with your Win32 version, and indeed all post issues seem
to be solved also for WinXP/32/IE8. Thank you very much for that.
However, it seems that it does not work at all for Linux. The server starts and
opens a port (according to netstat), but it does not seem to handle any request?
There is also no error log. Did you use any special build options for Linux?
Original comment by bel2...@gmail.com
on 26 Jun 2012 at 4:13
Issue confirmed. Is related to the IPv6 code somehow; investigating further
while getting rid of all those pesky GCC warnings too.
<rant>Why can't google code pick up basic email reply to issue messages like
github can - you /have/ to go to the website? Me stupid or g.c. dumb?</rant>
Will message again when fix is available.
Original comment by ger.hobbelt
on 28 Jun 2012 at 1:15
Okay, located the culprit. (It was me.) It wasn't the IPv6 code, but the select()
calls not getting a correct first argument (an off-by-one error).
Updated the github repo (master branch) and the issue349 hg repo. Tested with
mingw and ubuntu 10.
N.B. there are a few more hairy issues with mongoose connections but those are
rather fringe-or-MSIE-specific and I'd like to close this issue, i.e. see if
this code is acceptable. It's a big improvement on the current state of affairs
anyhow.
Backporting isn't always fun. ;-) Besides, the fix for those is another chunk
of edits and I _guess_ Sergey might want to see them separately.
Original comment by ger.hobbelt
on 28 Jun 2012 at 4:35
Successfully tested on WinXP (VC2010), i568 Linux and ARM Linux. Without any
doubt, now it is certainly a huge improvement.
If you run it on WinXP it will stop with “InitializeSRWLock not found in
KERNEL32.dll”; a possible fix is attached.
Original comment by bel2...@gmail.com
on 2 Jul 2012 at 9:01
If anything, this proves the worth of multi-platform testing.
Thanks for the feedback; driving off towards applying the fix like the dark
rider in Ronal Barbaren.
Original comment by ger.hobbelt
on 2 Jul 2012 at 9:27
Grepping through the code and comments, but still can't figure out what is
wrong with the current implementation. Something is definitely broken.
If somebody gives me a summary of what is broken, I'd appreciate that, thanks!
Original comment by valenok
on 16 Aug 2012 at 10:31
It is a compounded issue; comments #33, #30, #46, #48 would about cover it.
To see and understand what's wrong with the current TCP code, there are two
inroads:
1) run the tests to observe the failures: bel2125's browser Ajax/POST tests and
augmented testclient are included in
https://github.com/GerHobbelt/mongoose/tree/master/test/ajax and
https://github.com/GerHobbelt/mongoose/tree/master/testclient
(easiest is to grab the repo, build and run against your own mongoose)
(WARNING: current GerHobbelt HEAD rev on github does not work; I'm working on
fixing it since the latest merges + RFC2616 sec 4.2 conformance and will give a
holler when it's done; meanwhile, the AJAX tests in /test/ajax do not depend on
any mongoose C code, so should be useful in at least viewing part of the
problem set until then)
2) there are several issues which have been addressed in the mentioned repo's
349 fix branch: first the ones which aren't about 'graceful close' directly:
a) mongoose doesn't handle situations where 'HTTP keep-alive'
connections receive requests which include content data, but where the server
side, either via callback or CGI, doesn't consume all received content, i.e.
where a response is generated and finalized before all received *content* has
been read. Current mongoose does a bit of fixup when the internal header-fetch
buffer still has some data, but that's flaky in two senses: (1) it doesn't
fetch any later data that's part of the same received content - only calling
mg_read() until it returns 0 would do that, and (2) there's the anomaly when the
network traffic and server 'speed' are such that multiple requests are received
at once from the POV of recv(), i.e. where said header-fetch buffer also contains
part or whole of a *subsequent* request, which the current code will nuke, thus
corrupting the entire keep-alive req/resp chain from there on out.
b) MSIE doesn't like it *at all* when you, as a server, take the initiative to
close a connection which you just declared 'keep-alive' in your own server
response you sent last. I didn't go and create the 'push back onto the listen
queue' for fun; it's mandatory to prevent mongoose from being a very easy DoS
target while the server code must permit the browser to 'time out' on such
connections: the 'push back onto queue' code ensures that the mongoose threads
can do work on *any* pending and *active* request, i.e. for those requests for
which we know data has been received by our server TCP stack, while very slow
keep-alive connections and MSIE-like browsers don't 'occupy' the server threads
which would have to wait for quite a long time (multiple seconds). The way MSIE
acts, this behaviour is mandatory, anything less has been shown to fail. (Run
as many MSIE clients as you have mongoose worker threads and it b0rks very
quickly.)
Just set the default mongoose config option 'keep-alive' to 'yes' and the
errors will come flying.
c) 'graceful close' isn't just a 'SO_LINGER' and be done with it.
Theoretically, yes, it would, but again, different browsers, different minds,
and there are those which do *not* appreciate it (MSIE again, for one) when you
just SO_LINGER and close: some browsers only recognize a graceful close as one
when *all* *their* *transmitted* data has been 'received': SO_LINGER doesn't do
this, so you need to split it up: you'll have to ensure that you recv()'d all
incoming data for the connection while the 'graceful' timeout tick-tocks down
to zero, and *only* once you've concluded that no more data will be incoming
(or your own 'graceful' timeout has expired) do you proceed to a
SO_LINGER-based close. Of course when your own timeout has expired, you don't
go and SO_LINGER some more but forcibly close the connection anyway as it's
taken too long: that procedure is completely accounted for in the
connection_close logic in https://github.com/GerHobbelt/mongoose :: mongoose.c
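The close sequence described in (c) can be sketched roughly as follows. This is a minimal POSIX illustration, not the connection_close code from the repo: graceful_close() and its budget_sec parameter are hypothetical names, and the one-second linger value is an arbitrary assumption.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: drain incoming data until the peer is done or the
 * 'graceful' budget runs out, then close. Returns 1 when the peer finished
 * in time, 0 when the close had to be forced.
 * Assumes sock is below FD_SETSIZE. */
int graceful_close(int sock, int budget_sec)
{
    char scratch[4096];
    time_t deadline = time(NULL) + budget_sec;
    int peer_done = 0;
    struct linger lg;

    /* We are done sending, but keep the receive side open. */
    shutdown(sock, SHUT_WR);

    while (!peer_done) {
        fd_set readable;
        struct timeval tv;
        ssize_t n;
        time_t now = time(NULL);

        if (now >= deadline)
            break;                      /* our own timeout expired */
        FD_ZERO(&readable);
        FD_SET(sock, &readable);
        tv.tv_sec = deadline - now;
        tv.tv_usec = 0;
        if (select(sock + 1, &readable, NULL, NULL, &tv) <= 0)
            break;                      /* timeout or error: give up */
        n = recv(sock, scratch, sizeof scratch, 0);
        if (n < 0)
            break;                      /* receive error: give up */
        if (n == 0)
            peer_done = 1;              /* peer has nothing more to send */
        /* n > 0: leftover data is read and discarded, then we loop */
    }

    /* Peer finished: short lingering close. Otherwise l_linger = 0
     * requests an abortive close instead of lingering some more. */
    lg.l_onoff = 1;
    lg.l_linger = peer_done ? 1 : 0;
    setsockopt(sock, SOL_SOCKET, SO_LINGER, &lg, sizeof lg);
    close(sock);
    return peer_done;
}
```

A server would call something like this instead of a bare close() once the response has been written; the budget bounds how long a worker can be tied up by a slow or silent client.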
These are the big ones that I recall off the top of my head; the mentioned
tests (ajax and testclient) are the reference material for any failure, as both
should pass on any decent web server (which has a similar /echo URI handler for
testclient).
When analyzing, you'll need to test with various networks and clients as some
failure modes only trigger in particular circumstances (e.g. where the mongoose
server is slow enough to have the keep-alive client submit 2 more requests
while mongoose answers the first, in order to trigger the 'cleanup' issue in
(a)). This can be very hard to do (happened only randomly for me) so a code
review ~ code flow analysis path might be faster to recognize the error and
validate the fix (which is to mg_read() until it returns 0).
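To make the 'read until 0' cleanup from (a) concrete, here is a small self-contained sketch. It does not use the mongoose API directly: content_reader is a hypothetical callback type standing in for mg_read(), assumed to return 0 once the current request's content is exhausted, and fake_conn / fake_read are toy stand-ins so the idea can be run without a server.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for mg_read(): returns bytes read, 0 when the
 * current request's content is exhausted, negative on error. */
typedef int (*content_reader)(void *conn, void *buf, size_t len);

/* Discard whatever request content the handler left unread.
 * Returns the number of bytes discarded, or -1 on read error. */
long discard_unread_content(content_reader read_content, void *conn)
{
    char scratch[1024];
    long discarded = 0;
    int n;

    while ((n = read_content(conn, scratch, sizeof scratch)) > 0)
        discarded += n;
    return n < 0 ? -1 : discarded;
}

/* Toy connection, for illustration only: 'pending' holds bytes already
 * received, which may include the start of the NEXT request. */
struct fake_conn {
    const char *pending;
    size_t content_left;   /* unread bytes of the current request body */
};

/* Toy reader: never hands out more than the current request's content. */
int fake_read(void *conn, void *buf, size_t len)
{
    struct fake_conn *c = conn;
    size_t n = len < c->content_left ? len : c->content_left;

    memcpy(buf, c->pending, n);
    c->pending += n;
    c->content_left -= n;
    return (int)n;
}
```

The point of draining by content length rather than wiping the buffer: bytes that belong to a subsequent request stay where they are, so the keep-alive chain is not corrupted.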
---
I know my code has quite a few edits compared to the current code, but it would
be good to see them merged into the mainline; particularly because additional,
non-trivial work such as full HTTP/1.1 chunked transfer support is built on top.
Aside: as bel2125's testclient is a very good test client precisely because it
doesn't play nice all the time, there's also the mutual lockup due to buffers being
filled and not flushed: this happens in the scenario when mongoose does not
first collect the entire response (content data) before starting to send the
corresponding response, which happened in the custom '/echo/' handler. Not a
'mongoose per se' issue, but definitely something to keep in mind while working
with the test code. My version of the custom '/echo' handler interleaves
mg_read and mg_write to prevent such a buffer-based lockup from happening.
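As a rough illustration of such an interleaved echo loop (not the actual handler from the repo): each chunk is written back before the next one is read, so no unbounded amount of data piles up on either side. chunk_reader and chunk_writer are hypothetical callback types standing in for mg_read() and mg_write(); the loopback struct and its two functions are toy stand-ins that deliberately read and write in small pieces.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for mg_read() / mg_write(). */
typedef int (*chunk_reader)(void *conn, void *buf, size_t len);
typedef int (*chunk_writer)(void *conn, const void *buf, size_t len);

/* Echo request content back chunk by chunk.
 * Returns total bytes echoed, or -1 on a read or write failure. */
long echo_interleaved(chunk_reader rd, chunk_writer wr, void *conn)
{
    char chunk[256];
    long total = 0;
    int n;

    while ((n = rd(conn, chunk, sizeof chunk)) > 0) {
        int sent = 0;
        while (sent < n) {              /* cope with partial writes */
            int w = wr(conn, chunk + sent, (size_t)(n - sent));
            if (w <= 0)
                return -1;
            sent += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}

/* Toy in-memory connection, for illustration only. */
struct loopback {
    const char *in;
    size_t in_left;
    char out[64];
    size_t out_len;
};

int loopback_read(void *conn, void *buf, size_t len)
{
    struct loopback *c = conn;
    size_t n = len < c->in_left ? len : c->in_left;

    if (n > 3)
        n = 3;                          /* deliver input in small pieces */
    memcpy(buf, c->in, n);
    c->in += n;
    c->in_left -= n;
    return (int)n;
}

int loopback_write(void *conn, const void *buf, size_t len)
{
    struct loopback *c = conn;
    size_t room = sizeof c->out - c->out_len;
    size_t n = len < room ? len : room;

    if (n > 2)
        n = 2;                          /* force partial writes */
    memcpy(c->out + c->out_len, buf, n);
    c->out_len += n;
    return (int)n;
}
```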
Original comment by ger.hobbelt
on 16 Aug 2012 at 3:42
Original issue reported on code.google.com by
nullable...@gmail.com
on 26 Apr 2012 at 12:28