rcrowley / go-metrics

Go port of Coda Hale's Metrics library
Other
3.43k stars 493 forks source link

Reduces lock-contention in high-performance usage of go-metrics. #232

Closed deckarep closed 6 years ago

deckarep commented 6 years ago

Remove lock contention in go-metrics in a few key places. For all of these changes there is expectation around CPU performance gain but the real benefit is reduced lock contention which shows up as significant bottlenecks in both Go 1.9 and Go 1.10 for high performance services.

  1. gauge_float64.go
    • Removes the mutex lock around the internal value field in favor of atomic ops.
    • Changes the value field to be an uint64 (no precision is lost) and does conversions
    • Utilizes atomic.Load/atomic.Store to reduce lock contention
    • Adds a parallel benchmark
BEFORE:
BenchmarkGuageFloat64Parallel-8     20000000            88.1 ns/op         0 B/op          0 allocs/op
AFTER:
BenchmarkGuageFloat64Parallel-8     50000000            28.8 ns/op         0 B/op          0 allocs/op
  1. ewma.go
    • Removes lock contention around a few fields in favor of atomic ops.
    • For the mutex lock in place this lock is only taken on the first Tick() operation only
    • Adds a parallel benchmark
BEFORE:
BenchmarkEWMAParallel-8     20000000            99.4 ns/op         0 B/op          0 allocs/op
AFTER:
BenchmarkEWMAParallel-8     20000000            74.5 ns/op         0 B/op          0 allocs/op
  1. meter.go
    • Removes lock contention around multiple fields in favor of atomic ops.
    • Mutex locking is left in places where there is not high concurrency such as tickMeters and Stop.
    • Adds a parallel benchmark
BEFORE:
BenchmarkMeterParallel-8     3000000           467 ns/op           0 B/op          0 allocs/op
AFTER:
BenchmarkMeterParallel-8    10000000           169 ns/op           0 B/op          0 allocs/op
mihasya commented 6 years ago

Howdy, thanks for the PR!

I don't have a lot of visibility into golang version adoption. Do you know of any data out there about how many folks might still be on 1.2? I'm reticent to drop support for a version seemingly in order to enable a parallel benchmark without knowing the numbers.

deckarep commented 6 years ago

@mihasya - I don't have numbers...but the unfortunate thing is that 1.2 doesn't allow us to leverage the Parallel Benchmark features...alternatively we can simply remove it...or perhaps there is a build flag to not build that part of the code if we're on 1.2.

I would say though, most if not all large projects with many contributors with Go don't support anything further then 1.5 these days...so 1.2 is really, really behind.

Additionally, this code change doesn't actually prevent anyone from running on 1.2...it just prevents the parallel benchmarks from being compiled.

deckarep commented 6 years ago

One extra note: I'm adding a few more changes right now..so feel free to not review this just yet.

mihasya commented 6 years ago

Oh great point, it's in a _test file, so we'd just not be testing on it. That makes me feel better.

(to be clear, intuitively it seems fine to drop a version of Go from 2012, but I'm just not plugged into the scene enough to know how "up to date" folks are)

deckarep commented 6 years ago

@mihasya - i totally understand. Yes, 1.2 is long considered dead and highly not recommended for anyone still on it due to security updates.

deckarep commented 6 years ago

@mihasya - ok, ready for your review. We tried to be very thorough here since some of our services started to show bottlenecks and were not able to keep up on their processing due to the extensive locking in this library in part. We also discovered other issues unrelated to this library that we also found ways to work around.

Your considering is much appreciated.

deckarep commented 6 years ago

@mihasya - let me know if you're fine with the PR as it is or if you want me to revert the .travis.yaml change to put Go 1.2 back? I think losing the Parallel Bench unit-tests is fine for us so long as we can move forward with merging this change.

mihasya commented 6 years ago

@deckarep no no, I'm totally fine with dropping Go 1.2, just haven't had a chance to look at the more recent changes.

The biggest thing I want to double check is that we have good tests for validating concurrent behavior. It's been a little while since I've looked at the tests we do have in earnest, and I haven't had a chance to carve out some time to review the coverage. Anything that changes how things are calculated makes me a little nervous - messing up metrics output is a super quick way to pitchforks :)

I would say this has a very high chance of getting merged as is because it "looks good" to me, so it would be safe to proceed using this fork if you're comfortable with it and want to move faster internally than I can merge the PR.

mihasya commented 6 years ago

@deckarep due to an unexpected sick day (BOOOO!), I've spent some time exercising my Go muscles and adding some very basic concurrency tests to all the things that have changed. It looks like there's a race in the Meter implementation.

I've cherry-picked the same test onto master, and it doesn't appear to fail. I'm probably going to take a break and nap, but may take a look at fixing the race later if you don't beat me to it.

deckarep commented 6 years ago

@mihasya - ok, I totally understand...I will take a look at it as well. Definitely motivated to get this merged but of course want to make sure it is solid. I really appreciate your thorough review!

deckarep commented 6 years ago

@mihasya - ok, this race was my bad...i forgot to port over one method that has updated behavior on the stop functionality.

I pushed up a branch just now and handled the Mark method appropriately.

Great, great catch...I do not want to cause any regressions in this library so 🙏!

On a side note, the fix actually squeezed out more performance since it removed yet another lock.

mihasya commented 6 years ago

This may be the most high-functioning code review I've ever been part of 😆

Thanks for the quick turn around on the patch. I now feel great about merging it.

deckarep commented 6 years ago

Thank you so much, I totally agree...nailing down that race was most excellent. Hope you feel better and thanks for being an awesome open-source maintainer.

Cheers @mihasya!

dtjm commented 5 years ago

Thank you for this, kind sir! @deckarep

deckarep commented 5 years ago

@dtjm - sure thing! =)