Open eyberg opened 5 years ago
As far as I can see, this is not node-specific: if I run a Go web server I see the same behavior. When I run ab with a concurrency level above 2, I see "Connection reset by peer" errors. When these happen, enabling logging at the lwIP level in the kernel shows that TCP connection request (SYN) packets are missing, so the VM never even receives those connection requests from the ab client. I believe this is a limitation of QEMU user mode networking, which QEMU implements with libslirp: the tcpx_listen() function at https://gitlab.freedesktop.org/slirp/libslirp/-/blob/master/src/socket.c?ref_type=heads#L848 calls listen() with the backlog argument set to 1, so concurrent TCP connection requests are not guaranteed to succeed. As to why this behavior differs on macOS compared to Linux, I believe it's due to a different implementation of the listen() syscall: the listen(2) man page on Linux says "The behavior of the backlog argument on TCP sockets changed with Linux 2.2. Now it specifies the queue length for completely established sockets waiting to be accepted, instead of the number of incomplete connection requests", while the man page on macOS doesn't say anything in this regard.
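To compare how the host OS treats a small backlog, here is a minimal sketch that can be run directly on Linux and macOS. It is not libslirp's code: the probe_backlog helper and its parameters are made up for illustration. It opens a listener with backlog 1 that never calls accept(), fires several concurrent connects at it, and records what each client saw. The outcomes depend on the OS, so treat the results as something to observe rather than a fixed expectation.

```python
import socket
import threading


def probe_backlog(n_clients=8, backlog=1, timeout=0.5):
    """Hypothetical helper: count outcomes of concurrent connects against
    a listener that uses a small backlog and never calls accept()."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(backlog)         # same backlog value as in tcpx_listen()
    addr = server.getsockname()

    results = []
    clients = []
    lock = threading.Lock()

    def connect_once():
        c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        c.settimeout(timeout)
        try:
            c.connect(addr)
            outcome = "connected"
        except OSError as exc:
            # Covers timeouts, resets and refusals; which one occurs
            # (if any) is OS-dependent.
            outcome = type(exc).__name__
        with lock:
            results.append(outcome)
            clients.append(c)  # keep sockets open so the queue stays full

    threads = [threading.Thread(target=connect_once) for _ in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    for c in clients:
        c.close()
    server.close()
    return results


if __name__ == "__main__":
    outcomes = probe_backlog()
    for name in sorted(set(outcomes)):
        print(name, outcomes.count(name))
```

Running this on both hosts with the same n_clients should show whether the two kernels handle the overflowing connection requests differently, independently of QEMU.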
I didn't test under kvm/linux, so I could easily be running into a usermode- or OSX-specific issue
it's sporadic at concurrency level 4 but always crashes at level 5
just for comparison, on localhost we get around 7k reqs/sec without crashing
going to just tag this performance for now - we can come back to this later on once we have more stuff fleshed out