HdrHistogram / hdrhistogram-go

A pure Go implementation of Gil Tene's HDR Histogram.
MIT License
429 stars 64 forks source link

`ValueAtPercentile()` 4.5X on-cpu time optimization: remove expensive condition checks and re-use computation on hotpaths #48

Closed filipecosta90 closed 2 years ago

filipecosta90 commented 2 years ago

Summary of optimizations

Benchmark Baseline ns/op Comparison ns/op % Improvement Improvement factor
BenchmarkHistogramValueAtPercentile-8 31896 7569 76.27% 4.2
BenchmarkHistogramValueAtPercentileGivenPercentileSlice-8 159894 33900 78.80% 4.7
image

Detail of analysis/changes

Looking at the baseline CPU time by function in the following manner:

go test -bench=BenchmarkHistogramValueAtPercentile -test.cpuprofile=BenchmarkHistogramValueAtPercentile-Baseline.txt -test.benchtime=10s
pprof -web BenchmarkHistogramValueAtPercentile-Baseline.txt

We can observe that the iterator nextCountAtIdx() is the responsible for the majority of the CPU time. ( Even after improving the percentile calculation on the latest release as showcased in #46 ).

image

Doing the same analysis by line of code as follow:

go test -bench=BenchmarkHistogramValueAtPercentile -test.cpuprofile=BenchmarkHistogramValueAtPercentile-Baseline.txt -test.benchtime=10s
pprof -web -lines BenchmarkHistogramValueAtPercentile-Baseline.txt

We can observe that the top consuming LOC are:

image

Looking further at hotspots we can also check that getCountAtIndex is also a good candidate for optimization (on hdr.go#L640 takes 13% CPU time).

image

Even though we can't remove this call, we can reduce the amount of duplicate computation within it -- specifically on the inner calls to the calculation of bucketBaseIdx that don't change during the time we iterate on each bucket sub-buckets. With that in mind, we've introduced getCountAtIndexGivenBucketBaseIdxand only calculate the bucketBaseIdx on the iteration that change bucket ( meaning no wasted computation on sub-bucket flows ).

Impact of the above optimizations

Following up on all the we've moved from a baseline of:

(base) fco@fcos-Air hdrhistogram-go % go test -bench=BenchmarkHistogramValueAtPercentile -test.benchtime=10s
goos: darwin
goarch: arm64
pkg: github.com/HdrHistogram/hdrhistogram-go
BenchmarkHistogramValueAtPercentile-8                             352322             31896 ns/op               0 B/op          0 allocs/op
BenchmarkHistogramValueAtPercentileGivenPercentileSlice-8          83769            159894 ns/op               0 B/op          0 allocs/op
BenchmarkHistogramValueAtPercentilesGivenPercentileSlice-8        244653             53997 ns/op             248 B/op          4 allocs/op
PASS
ok      github.com/HdrHistogram/hdrhistogram-go 40.829s

to the new optimized ValueAtPercentile / ValueAtPercentileGivenPercentileSlice:

(base) fco@fcos-Air hdrhistogram-go % go test -bench=BenchmarkHistogramValueAtPercentile  -test.benchtime=10s 
goos: darwin
goarch: arm64
pkg: github.com/HdrHistogram/hdrhistogram-go
BenchmarkHistogramValueAtPercentile-8                            1671085              7569 ns/op               0 B/op          0 allocs/op
BenchmarkHistogramValueAtPercentileGivenPercentileSlice-8         335668             33900 ns/op               0 B/op          0 allocs/op
BenchmarkHistogramValueAtPercentilesGivenPercentileSlice-8        229405             51147 ns/op             248 B/op          4 allocs/op
PASS
ok      github.com/HdrHistogram/hdrhistogram-go 45.063s
codecov-commenter commented 2 years ago

Codecov Report

Merging #48 (8dc0092) into master (7a2c58a) will increase coverage by 0.50%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #48      +/-   ##
==========================================
+ Coverage   76.91%   77.41%   +0.50%     
==========================================
  Files           6        6              
  Lines         693      704      +11     
==========================================
+ Hits          533      545      +12     
+ Misses         95       94       -1     
  Partials       65       65              
Impacted Files Coverage Δ
hdr.go 90.63% <100.00%> (+0.70%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 7a2c58a...8dc0092. Read the comment docs.