Open · NullHypothesis opened this issue 1 year ago
What was the baton command line? The concurrency level made a significant difference to throughput in my tests. We should make sure we're measuring the tunnel and proxy's capacity, and not just the latency.
Yesterday, I measured requests per second (for a simple "hello world" Web server) for an increasing number of baton threads:
All setups can sustain more reqs/sec as the number of sender threads increases, except when we use nitriding's reverse proxy, which instead sees a reduction in reqs/sec. Sometime this week, I'll take a closer look at Go's reverse proxy implementation to see what easy improvements we can make.
Elaborating on the above: The "Enclave" setup constitutes the approximate upper limit of what we can achieve with nitriding. This setup contains no nitriding at all: it consists of a Web server that binds directly to the VSOCK interface, and a custom baton that sends requests directly to the VSOCK interface.
At this point, there are two significant bottlenecks:
I stumbled upon an issue that describes the problem we're seeing: https://github.com/golang/go/issues/6785. Increasing `MaxIdleConnsPerHost` makes a significant difference. In a preliminary test, I set it to 1000, which makes the reverse proxy perform almost identically to the "no reverse proxy" setup:
For posterity, a few other things I've tried:

- A `BufferPool` for the reverse proxy. Go's reverse proxy implementation allocates a 32 KB buffer for each incoming request. A buffer pool allows buffers to be reused and also reduces the garbage collector's work. Unfortunately, this made no difference in the numbers. Regardless, it's probably a good idea to add it.

For the record, we just merged PR https://github.com/brave/nitriding/pull/61, which improves the status quo.
I've been working on some tooling that can help us measure nitriding's networking performance. So far, I have a minimal Go Web server that implements a simple "hello world" handler. I tested the Web server in three scenarios:
All three scenarios use HTTP only, to eliminate the computational overhead of TLS. I then used baton to measure the requests per second that the Web service can sustain. The results are:
The numbers aren't great. Let's use this issue to do some debugging, identify bottlenecks, and improve the networking code.