Open mfelsche opened 7 years ago
To make it clear: when I talk about the problematic results here, I mean that nearly all connections (requests) were closed with socket read errors on the client side. This issue is not about performance.
When testing with wrk2 from another machine, the number of successful requests reported by the tool equals the number of connections configured via the `limit` parameter of `HTTPServer`. All further requests seem to run into a socket timeout.
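For context, a wrk2 run of this kind looks roughly like the following; the thread, connection, rate, and address values are placeholders, not the ones used in the original benchmark:

```sh
# wrk2 is invoked as `wrk`; -R (constant request rate) is mandatory in wrk2.
# -t threads, -c open connections, -d duration. All values here are placeholders.
./wrk -t2 -c100 -d30s -R2000 http://<server-host>:<port>/
```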
Looking at the wireshark packet capture data, I can see that:
An example wireshark session screenshot is attached:
It also seems this could be related to `wrk`/`wrk2` misbehaving:
aaaaaaaaaand .....
it really is a wrk2 problem. Running it from https://github.com/giltene/wrk2/pull/33 fixes the memory and socket read error issue.
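For anyone who wants to verify with the fixed wrk2, one way to get a build that includes that PR (assuming it has not been merged yet) is to fetch the PR head ref directly from GitHub:

```sh
# Build wrk2 including the fix from PR #33 by checking out the PR head ref.
git clone https://github.com/giltene/wrk2.git
cd wrk2
git fetch origin pull/33/head:pr-33
git checkout pr-33
make
```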
I actually don't know whether this issue is worth keeping open to safeguard against malicious clients making the httpserver eat up memory, so I'll keep it open and leave the decision to you.
Great work with your investigation, it's much appreciated!
I'm going to change the ticket title and add a "needs discussion" tag to this to discuss whether it is an important security issue to keep open (for TCP in general, not just HTTP), and how it might be mitigated.
EDIT by @jemc: This turns out to be due to a bug in `wrk`/`wrk2`, but may have security implications for TCP in general, when dealing with a malicious/misbehaving client. Read on to the rest of the comments for more info.

I wanted to reproduce the benchmark discussed on the pony user mailing list: https://pony.groups.io/g/user/topic/pony_did_poorly_in_a_recent/5411536
A just slightly modified version of the example HTTPServer is used: it returns only the string `"Hello world."` and runs with `DiscardLog`.

A run of wrk with the `Connection: close` header set results in > 99% of the requests ending in socket read errors, and the RAM of the httpserver process grows rapidly (a 30 s benchmark run resulted in 2.8 GB VIRT), in contrast to a fresh httpserver process benchmarked without `Connection: close`, where RAM remains at 621 MB VIRT during and after the whole benchmark.

Example output:
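(For reference, the `Connection: close` run above corresponds to a wrk invocation along these lines; this is a sketch, since the exact command line isn't recorded in this issue.)

```sh
# Plain wrk, forcing every request onto a fresh connection via the Connection: close header.
# Thread/connection/duration values are placeholders.
./wrk -t2 -c100 -d30s -H "Connection: close" http://<server-host>:<port>/
```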
Example output without `Connection: close`:
I dug through the `net` and `net/http` source code to understand the architecture and possible problems of the tcp/http code in pony. I also tried to profile the server, but I have not been very successful. Most of the time seems to be spent in `epoll`, which is fine afaik. But there is also a lot of time spent in socket close syscalls and, most importantly, in gc and alloc activity around the `_sessions` set where connections are kept in order to close them. I am not 100% sure about this, though.

My best guess was that the `_sessions` code is the bottleneck, as it keeps references to the tcp-connection actors, which can only be fully gc'ed after their reference has been removed from the server actor. But removing the code that stores a reference to them did not change the results.

I guess there is a bug in how tcp connection close is handled, but I was not able to find the root cause. That is why I'm filing this issue here.
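To make the `_sessions` pattern described above concrete, here is a minimal Pony sketch of a server actor keeping tag references to its connections. This is only an illustration of the pattern, not the actual `net/http` code:

```pony
use "collections"
use "net"

// Sketch of the pattern: the server actor holds a tag reference to every live
// connection in a set, so a connection actor can only be fully collected once
// the server has dropped its reference again (e.g. when the session closes).
actor MyServer
  let _sessions: SetIs[TCPConnection tag] = SetIs[TCPConnection tag]

  be session_opened(conn: TCPConnection tag) =>
    _sessions.set(conn)

  be session_closed(conn: TCPConnection tag) =>
    _sessions.unset(conn)
```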
Operating system: Ubuntu 16.04 on Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
Pony: 0.14.0-0d920b8ff [debug] compiled with: llvm 3.9.1 -- cc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
Further observations

- Using `CommonLog` instead of `DiscardLog` also didn't change the results.
- Removing the `_sessions` related code (not storing actor references in a set, so they can be gc'ed before the server actor gets rid of its reference) did not change the results.
- `wrk` side did not change the results.