wavefrontHQ / wavefront-sdk-go

Wavefront Core Go SDK
Apache License 2.0
4 stars 21 forks source link

Feature Request: support parallel http requests #97

Open LukeWinikates opened 2 years ago

LukeWinikates commented 2 years ago

Hi Friends,

My team uses wavefront-sdk-go via the telegraf output. We found that using the telegraf output to send metrics over http, we are limited to a single http request at a time. We knew the Wavefront Proxy could handle much more throughput than this, so we found a way to parallelize http requests by creating multiple instances of the wavefront output in our telegraf config. This works fine, but the telegraf configuration is complicated and has a modest overhead that we would like to avoid.

I think it would be interesting to support a worker pool model for wavefront-sdk-go directly. We can default the pool size to 1, but allow parallelization up to a reasonable threshold.

I imagine that we would add something like a MaxParallelHttpRequests field to the configuration struct to allow users to set this value, and then add some reasonable implementation of channels and goroutines to reporter.go in order to make the parallelization happen.

I wanted to gauge what other contributors to the sdk would think about this feature. Thinking of @keep94 and @oppegard, and I wanted to get @jbau's take as well if he has one.

Thanks in advance for your time and attention 🙇‍♂️🙌💻

keep94 commented 2 years ago

Luke, your idea of parallel http requests sounds interesting. I just wonder how that will work with the Flush() function of a sender object. When you create a sender, you can effectively turn off autoflushing, send a bunch of metrics and spans which get buffered and flush manually. Flush() returns an error indicating whether or not everything buffered up was sent successfully. This sort of use of the API implies just one goroutine using the sender, so I am wondering how a pool of http connections would work. Are you possibly thinking of rewriting flush to use multiple http connections to do the flushing?

Regarding releasing a v1, I feel that there are several artifacts in the public API that shouldn't be exposed. For instance, I think the functions that convert metrics to strings to be sent over the wire could be made private.

Back to the parallel http requests, if Flush() were rewritten to use multiple http connections, it would still have to block until that last http connection finished sending its portion of metrics/traces because by contract Flush() has to return an error indicating whether or not everything was successfully sent.

LukeWinikates commented 2 years ago

I'll see if I can put together a proof of concept for this soon. I'd like to make sure it doesn't introduce any breaking changes, so that it can be an added feature for a v1.x release.

oppegard commented 1 year ago

@LukeWinikates: an HTTP thread/goroutine pool to allow parallel requests would be great. Please work with @keep94 on the specifics.

LukeWinikates commented 9 months ago

I've been looking at the mutex and atomic integer usage in the SDK. As written now, I don't believe that parallel requests can happen, but the Flush method already uses a mutex - I think this is partly because parallel request support was a goal at one point, and partly proactive defense to protect SDK users from getting into a weird state if they call flush from multiple goroutines.

I'm still interested in attempting a WaitGroup implementation, and I wonder if we'd be able to reduce or eliminate the use of sync/atomic or sync.Mutex in favor of using channels and goroutines for synchronizing state.