bloomberg / amqpprox

An AMQP 0.9.1 proxy server, designed for use in front of an AMQP 0.9.1 compliant message queue broker such as RabbitMQ.
Apache License 2.0
78 stars 16 forks source link

Workaround asio::ssl async_read_some busy-loop #69

Closed adamncasey closed 2 years ago

adamncasey commented 2 years ago

asio::ssl has a bug where calling async_read_some with null_buffers isn't correctly handled, and more or less immediately invokes the provided handler with no data. This causes amqpprox to busy-loop whenever a TLS socket has been accepted or opened.

There were two options for how to address this in our code:

1) Always read with a fixed size buffer, such as 32kb. This would simplify the code slightly, at the expense of needing multiple reads to handle larger than 32kb frames in non-TLS mode, even when the full frame is available to amqpprox in one go. 2) Ask asio::ssl to read with a very small buffer, then ask the openssl library how many bytes are available. This technique aligns with how amqpprox's read loop works today. That is what is implemented here.

In theory something similar could be upstreamed into asio::ssl. It's a little tricky though and this exact code couldn't handle the generic MutableBufferSequence interface - we can take some shortcuts in our code.

I've done some benchmarking to check this change isn't going to regress performance noticeably. Data throughput tests indicate that this fix improves performance for TLS connections over the existing code.

Performance Tests

Testing by setting up amqpprox as per the performance tests README.md, and running the perf tester like:

$ ./amqpprox_perf_tester --address amqp://localhost:30671 --listen-address 0.0.0.0:5671 --clients 10 --max-threads 10 --message-size 10000000 --num-messages 100 --listen-cert amqpprox/cert.pem --listen-key amqpprox/key.pem

Results in MiB/s over a ~20second run

Test # no TLS no TLS no TLS TLS TLS TLS
Code Current Fixed Buffer 32K This PR Current Fixed Buffer 32K This PR
1 898 1038 824 426 515 505
2 965 902 1007 470 535 514
3 1050 888 918 475 553 555

This shows there isn't really any chance to non-TLS performance, and there is a slight increase in throughput with this code change (or the fixed-buffer size change).

Still TODO

adamncasey commented 2 years ago

Results for connection throughput testing - as before, I'm really just trying to make sure we don't regress performance here, and comparing against the not-shown fix using a fixed buffer size. The memory usage is fairly adhoc - I just watched top while the tests ran.

Values in connections completed per second over the test run:

Test # TLS TLS TLS
Code Current Fixed Buffer 32K Tiny Buffer +ask
1 928 876 963
2 896 923 963
3 909 860 887
memory 1.3% 2.0% 1.1%

The machine used for testing had 16GB of ram - so assuming top isn't doing something crazy 1.1% is about 180MiB and 2% is about 320MiB.

$ target/release/amqpprox_perf_tester --address 'amqp://127.0.0.1:16001' --listen-address '0.0.0.0:5671' --message-size 1 --num-messages 1 --listen-cert ../../build/Linux/cert.pem --listen-key ../../build/Linux/key.pem --clients 10000 --max-threads 150

Each test run was only about 10 seconds. I hit ephemeral port issues on this host running longer tests - probably can work around this if we want to, but these numbers seemed like a reasonable comparison to me.