honeycombio / beeline-go

Legacy instrumentation for golang apps with Honeycomb
https://honeycomb.io
Apache License 2.0
74 stars 48 forks source link

feat: Improve speed and reduce allocations when adding fields #402

Closed MikeGoldsmith closed 10 months ago

MikeGoldsmith commented 11 months ago

Which problem is this PR solving?

The cost of adding the app. prefix to all field names is expensive and can account for a significant amount of allocations as observed by perf tools. This is down to always doing a string manipulation for every field that's being added to a span or trace.

Short description of the changes

New benchmarks:

BenchmarkBeelineAddField_PrefixedKey-10         25759432    46.06 ns/op 0 B/op      0 allocs/op
BenchmarkBeelineAddField_ConsistentKey-10       21309750    55.63 ns/op 0 B/op      0 allocs/op
BenchmarkBeelineAddField_InconsistentKey-10     2997980     433.1 ns/op 300 B/op    2 allocs/op

Previous benchmarks (equivalent to consistent key above):

BenchmarkBeelineAddField-10     17576912            70.22 ns/op 8 B/op  1 allocs/op
MikeGoldsmith commented 11 months ago

When testing, a panic happened if AddField was called before Init (or never called). This isn't an expected under normal usage but happened in tests so it's possible and shouldn't cause a panic. I've moved the cache creation to a check within the func that does the lookup, creating the cache if nil.

lizthegrey commented 11 months ago

Unfortunately, this appears to contend the RWMutex inside the LRU cache, making this operation much slower than just doing the RAM allocation :(

cartermp commented 11 months ago

Another approach could be to only add the prefix if it doesn't already exist, and then go through various places in shepherd and update the keys we pass in to include app.. Maybe we can avoid several instances of string allocations that way? I'd also be curious which callers are the ones showing up as contributing the most to memory use, and focusing just on those.

MikeGoldsmith commented 11 months ago

Ah, the lock contention being an issue for high volume make sense.

I agree with @cartermp - likely the best solution is to check if the key already has the app. prefix and only add if not. Then update docs to say a perf. improvement would be to prefix your names to save the allocation.

MikeGoldsmith commented 11 months ago

Also, I found that we already had benchmarks for AddField so I've added the following benchmarks:

Added results to PR description.

kentquirk commented 11 months ago

I did some experimentation here.

kentquirk commented 11 months ago

After further experimentation, it does not seem to be reasonable to have a shared cache that does not have a significant performance bottleneck when trying to limit the size of the cache. (It's probably possible, but not without a lot of tricky code.) And not limiting the cache size might cause a major memory hit in the case of a customer who accidentally uses a high-cardinality value for a column name.

I no longer think this approach is worthwhile; a small number of short-duration allocations seems preferable to the locking problems from a shared cache. I'd suggest that we close this issue.

MikeGoldsmith commented 10 months ago

Introducing a cache is showing to have too much lock contention, closing this in favour of #406.