Netflix-Skunkworks / spectatord

A high performance metrics daemon
Apache License 2.0

Re-Use Connections When Publishing to the Aggregator #51

Open copperlight opened 2 years ago

copperlight commented 2 years ago

This should reduce the overhead per request.

copperlight commented 11 months ago

The Registry:

The Publisher:

https://curl.se/libcurl/c/libcurl-tutorial.html

After each single curl_easy_perform operation, libcurl will keep the connection alive and open. A subsequent request using the same easy handle to the same host might just be able to use the already open connection! This reduces network impact a lot.

https://everything.curl.dev/libcurl/connectionreuse

Easy API pool

When you are using the easy API, or, more specifically, curl_easy_perform(), libcurl will keep the pool associated with the specific easy handle. Then reusing the same easy handle will ensure it can reuse its connection.

Multi API pool

When you are using the multi API, the connection pool is instead kept associated with the multi handle. This allows you to cleanup and re-create easy handles freely without risking losing the connection pool, and it allows the connection used by one easy handle to get reused by a separate one in a later transfer.

https://curl.se/libcurl/c/threadsafe.html

You must never share the same handle in multiple threads. You can pass the handles around among threads, but you must never use a single handle from more than one thread at any given time.

https://stackoverflow.com/a/44740820

The libcurl multi interface is a single-core single-thread way to do a large amount of parallel transfers in the same thread. It allows for easy reuse of caches, connections and more. That has its clear advantages but will make it CPU-bound in that single CPU.

Doing multi-threaded transfers will give each thread/handle its own cache and connection pool, etc., which changes when they are useful, but it makes the transfers less likely to be CPU-bound, since you can spread them out over a larger set of cores/CPUs.

https://curl.se/libcurl/c/libcurl-multi.html

multi interface overview

https://www.onlineaspect.com/2009/01/26/how-to-use-curl_multi-without-blocking/

Rolling curl?

Given this behavior, it is not clear how CLOSE_WAIT connections could stack up as described in #68, unless the curl_easy_cleanup calls shut down janitor or cleanup work that would otherwise recover sockets that get into a bad state.

Possible solutions:

copperlight commented 10 months ago

Using the thread_local approach to the CurlHandle in the HttpClient::perform() method, we get the following results.

On the macOS platform (apple-clang-15), we see two different ASAN errors:

On the Ubuntu Jammy platform:

Marking some of the HttpClients in the tests as thread_local clears some of the ASAN errors. This points to the need to keep a set of independent HttpClients for the thread pool.

Since the Publisher is responsible for calling the HttpClient::GlobalInit and HttpClient::GlobalShutdown methods on Start() and Shutdown(), we should not need any additional handling for curl setup.

Since the number of batches may exceed the number of threads, and we cannot know how fast each thread will progress, controlling publisher thread access to the HttpClient pool may require something like an object pool:

https://codereview.stackexchange.com/questions/273467/c-thread-safe-object-pool