As a note, when running ab, I don't see any signs of failures:
Server Software: Jetty(9.2.6.v20141205)
Server Hostname: localhost
Server Port: 8080
Document Path: /auth/users
Document Length: 1271 bytes
Concurrency Level: 20
Time taken for tests: 10.003 seconds
Complete requests: 157
Failed requests: 0
Keep-Alive requests: 157
Total transferred: 239268 bytes
HTML transferred: 199547 bytes
Requests per second: 15.70 [#/sec] (mean)
Time per request: 1274.274 [ms] (mean)
Time per request: 63.714 [ms] (mean, across all concurrent requests)
Transfer rate: 23.36 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.2 0 1
Processing: 891 1208 213.2 1150 2089
Waiting: 891 1208 213.2 1150 2089
Total: 891 1208 213.3 1150 2089
Percentage of the requests served within a certain time (ms)
50% 1149
66% 1239
75% 1278
80% 1316
90% 1428
95% 1666
98% 1952
99% 2039
100% 2089 (longest request)
The output would be different if the server dropped any connections, or returned any non-2xx responses.
To clarify, the "burst of exceptions" keeps occurring, seemingly forever.
Hi,
Thanks for the report, I'll have a look as soon as I can (probably later today).
Could you provide me with a minimal test case? I tried using a simple handler, with and without a timeout in the go block, using ab, curl, and Chrome, and I couldn't reproduce the issue. Also, please try with master.
Thanks for looking at this. I'm at a loss. Yesterday this happened consistently, every time I tested my server using ab. Today everything works smooth as silk. If I can find a way to reproduce it, I'll create a tiny test app.
The error has resurfaced in my application.
I'm working on an example.
https://github.com/hlship/jet-issue-12-example
So far, I can get it to time out under heavy load.
So far, I haven't seen the EofException using my test code, but I do see a point where a heavily loaded server simply stops operating after about 16000 requests. See the project in the link above.
Thanks! I'll have a look.
What kind of -c -n options are you using to trigger the timeouts?
I just tried various -c and -n values, up to 100k requests, plus another test with concurrency 100, and I can't see the error :(
I'm on Mac OS X, JDK 1.7. Perhaps that makes a difference?
Maybe I should reboot my machine :-)
Seriously, I am going to reboot my Mac and see if that affects my results.
I am on Linux, so I guess there could be a difference, yes.
$ uname -a
Linux sputbox 3.12.6-031206-generic #201312201218 SMP Fri Dec 20 17:20:06 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
$ java -version
java version "1.7.0_65"
Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
This does sound like an OS-level issue; I'll try to investigate more on my side as well.
On a heavily loaded machine with a lot of open connections, maybe the connection limit is the bottleneck rather than Jet or Jetty...
Yeah, ulimit shows unlimited on my machine; I was thinking about this as well (and some other settings of the same kind).
ulimit shows unlimited on my machine as well
ulimit on its own is not a good indicator of this; ulimit -n is a better one (the number of file descriptors allowed), but not the only thing to check. I have an article about it: http://www.niwi.be/2013/03/13/optimizing-server-for-heavy-load-django-application/ (it is for Django, but almost all of it is completely independent of the application/framework).
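As a related check, the JVM itself can report the process's file-descriptor usage. The following is a minimal REPL sketch, assuming a HotSpot/OpenJDK JVM on a Unix-like system (where the operating-system MXBean implements com.sun.management.UnixOperatingSystemMXBean):

(import '(java.lang.management ManagementFactory)
        '(com.sun.management UnixOperatingSystemMXBean))

;; Report how many file descriptors the JVM process has open versus its limit.
(let [os (ManagementFactory/getOperatingSystemMXBean)]
  (if (instance? UnixOperatingSystemMXBean os)
    {:open-fds (.getOpenFileDescriptorCount ^UnixOperatingSystemMXBean os)
     :max-fds  (.getMaxFileDescriptorCount ^UnixOperatingSystemMXBean os)}
    :not-available))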
Thanks Andrey, it's very informative. I also had in mind revisiting this post http://www.http-kit.org/600k-concurrent-connection-http-kit.html , but yours seems more to the point.
Ok, I was able to reduce it to a very small and easily reproducible case. Please see the example project and follow the notes.
It looks similar to this: https://jira.codehaus.org/browse/JETTY-747
Ok, from a quick inspection, the output stream is closed when we try writing to it, which throws. I am still undecided on how I would report this error in async mode (the same can happen with the body channel in chunked transfers). I will probably add a :control channel to the request map that would receive such errors; that would later be replaced by a deferred channel, probably (if that ever comes out of the c.c.async JIRA). Something similar to what we have in the websocket client/server.
I have a couple of branches to merge/complete first though (backpressure, flow-control improvements); this will probably come next.
As for the "retrying", we don't do anything like that in Jet, at least; I will see if I can spot when/where this happens.
Thanks! We can probably continue with the current state of the code, but we may have to bounce our servers occasionally to clear out these "repeating requests".
Any progress or workaround on this? This is creeping into our production app (fortunately, still in evaluation mode), forcing us to do occasional restarts. If I can't get a resolution to this soon, we will have to scrap the async support provided by Jet.
Not at all, I'm afraid; I am currently traveling and haven't had time to look into it. I had to do a release today (because of this: https://github.com/eclipse/jetty.project/blob/cfca172dd68846b2c7f30a9a6b855f08da7e7946/advisories/2015-02-24-httpparser-error-buffer-bleed.md) and bumped Jetty's dependency in the process; it was lagging a few versions behind. It might be worth trying 0.5.5 to see if the problem is still there. If it is, I'll get into it ASAP.
This is fixed in 0.5.7.
You can now reach the :ctrl key in the request map; it's a core.async channel that receives such errors so you can handle them. I tested against your small project, and this seems to fix the problem.
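A minimal sketch of one way a handler could consume that channel; only the :ctrl key itself comes from the comment above, while the handler shape, the logging, and the assumption that errors arrive as values taken from the channel are illustrative:

(require '[clojure.core.async :refer [go <!]])

(defn handler [request]
  ;; watch the per-request control channel for delivery errors
  (when-let [ctrl (:ctrl request)]
    (go
      (when-let [err (<! ctrl)]
        ;; e.g. the client hung up before the response could be written
        (println "could not deliver response for" (:uri request) "-" err))))
  ;; async handler as in the original report: return a channel that yields a Ring response map
  (go
    {:status 200
     :headers {"Content-Type" "text/plain"}
     :body "hello"}))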
Actually, I'll do another release shortly; I introduced some reflection here and there.
done
My handler returns a core.async channel containing a Ring response map.
I've found that if my client terminates early, I start seeing exceptions in my console:
I discovered this using Apache Bench (the ab command) with the -t option. I would guess that after it hits its limit, it just closes the HTTP connection(s).
The UUID (f80d055a-7dc1-4823-899e-b2eee7ae8e73) is assigned to each new request.
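For context, a minimal, illustrative sketch of a handler shaped like the one described here (not the actual application code; the headers and body are made up): it returns a core.async channel that yields a Ring response map and tags each incoming request with a fresh UUID.

(require '[clojure.core.async :refer [go]])

(defn users-handler [request]
  ;; assign a fresh UUID to every incoming request
  (let [request-id (str (java.util.UUID/randomUUID))]
    ;; return a core.async channel that yields a Ring response map
    (go
      {:status 200
       :headers {"Content-Type" "text/html"
                 "X-Request-Id" request-id}
       :body (str "<html><body>request " request-id "</body></html>")})))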
In fact, using ab -c 20, I see a burst of 19 of these exceptions in my log, a pause of about 20 seconds, then another burst. The UUIDs (assigned by my code as the request is received by my Ring handler) do not repeat, which means that something inside Jet and/or Jetty is retrying the requests.
I'm still tracking down exactly how this happens, but it would be nice if this exception could be caught and either ignored, or perhaps written to a channel associated with the original request (i.e., so that I could write asynchronous code that knows how to recover from the response being non-deliverable).