honeycombio / libhoney-rb

Ruby library for sending data to Honeycomb
Apache License 2.0

Use a persistent connection per thread instead of connecting anew for each event #27

Closed samstokes closed 6 years ago

samstokes commented 6 years ago

Prior to this PR, libhoney-rb makes a brand new HTTPS connection to api.honeycomb.io before sending each event. This is extremely inefficient, incurring both a TCP connection setup and a TLS handshake for each event. Each outbound connection also consumes kernel resources (and continues to do so for around a minute after it is closed, while the connection sits in TCP TIME_WAIT). Finally, this may be causing a problem observed by a potential customer, who found that libhoney-rb was exhausting their port address translations.

The "http" gem we are using supports persistent connections using the HTTP/1.1 header Connection: Keep-Alive, so let's use them. Each thread in the sending thread pool (default 10 threads) gets its own persistent connection (*) and sends all events it processes on that same connection.
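As a rough illustration of the one-connection-per-sender-thread idea (not the actual diff in this PR), here is a minimal sketch. `SenderPool`, `send_event` and the `build_client` block are hypothetical names; with the http gem, the block would return something built from `HTTP.persistent("https://api.honeycomb.io")`.

```ruby
# Hypothetical sketch: a fixed pool of sender threads. Each thread builds
# one client when it starts and reuses it for every event it dequeues.
class SenderPool
  def initialize(size: 10, &build_client)
    @queue = Queue.new
    @threads = Array.new(size) do
      Thread.new do
        client = build_client.call # created once per thread, then reused
        # Queue#pop returns nil once the queue is closed and drained,
        # which ends the loop (so events themselves must be truthy).
        while (event = @queue.pop)
          client.send_event(event)
        end
      end
    end
  end

  def enqueue(event)
    @queue << event
  end

  # Stop accepting events, let the threads drain the queue, then wait.
  def shutdown
    @queue.close
    @threads.each(&:join)
  end
end
```

However many events pass through the pool, the number of clients (and so of open connections) stays equal to the pool size.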

It's hard to obtain a representative benchmark, but my initial results suggest that for a moderately-loaded, multi-threaded Rails app that is not CPU-limited (i.e. CPU is not pegged at 100% during the test), this reduces the CPU overhead of libhoney-rb by a factor of 2-3, as well as reducing the number of open outbound connections from hundreds to 10.

The http gem takes care of managing the persistent connection. We just receive, and reuse, a persistent Client object; under the hood it sets up a Connection, detects whether it has timed out or been interrupted, and reconnects if needed. I've verified this behaviour through local testing and inspection.

(*) Actually, its own set of persistent connections. Because we support specifying the API host per-event, we actually keep a persistent connection per API host. This is a nuance that probably matters to zero people, but it also moves us toward the code structure we'll need to support batching (#1).
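
The per-host bookkeeping described in this footnote could look roughly like the sketch below. This is an illustration under assumptions, not the code in this PR: `ClientCache` and `client_for` are hypothetical names, and the block stands in for whatever creates a persistent client (with the http gem, something like `HTTP.persistent(api_host)`).

```ruby
# Hypothetical sketch: each thread keeps its own Hash of api_host => client,
# so a thread reuses one persistent client per API host it has sent to.
class ClientCache
  def initialize(&build_client)
    @build_client = build_client
    # Unique key so separate caches don't share thread-local state.
    @key = :"client_cache_#{object_id}"
  end

  # Returns the calling thread's client for api_host, building it on first use.
  def client_for(api_host)
    clients = Thread.current.thread_variable_get(@key)
    unless clients
      clients = {}
      Thread.current.thread_variable_set(@key, clients)
    end
    clients[api_host] ||= @build_client.call(api_host)
  end
end
```

Because the Hash lives in thread-local storage, no locking is needed: a thread only ever reads and writes its own set of clients.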