felixge / fgprof

🚀 fgprof is a sampling Go profiler that allows you to analyze On-CPU as well as Off-CPU (e.g. I/O) time together.
MIT License

Reduce allocations while creating a slice of keys from map and fix a … #8

Closed sabandi closed 3 years ago

sabandi commented 3 years ago

…typo

Benchmark script: https://play.golang.org/p/1Qv2z4W0EMO

Results:

➜  dev-vm go test -v -run=^$ -bench=BenchmarkMapSlice a_test.go --count=20 >> {old,new}.txt
➜  dev-vm benchstat old.txt new.txt                                                                  
name        old time/op    new time/op    delta
MapSlice-8    1.29µs ± 4%    0.68µs ± 6%  -47.57%  (p=0.000 n=20+20)

name        old alloc/op   new alloc/op   delta
MapSlice-8    1.01kB ± 0%    0.51kB ± 0%  -49.21%  (p=0.000 n=20+20)

name        old allocs/op  new allocs/op  delta
MapSlice-8      6.00 ± 0%      1.00 ± 0%  -83.33%  (p=0.000 n=20+20)

sabandi commented 3 years ago

Thanks @felixge for the encouragement 🙂. I will keep reading the hot path code to see if I can find any optimizations there. So far it looks like you didn't leave any room for them. 🙂

felixge commented 3 years ago

@sabandi yeah, I think my recent optimizations picked all the low-hanging fruit there. Making it even faster will probably require hacking on the Go runtime itself 🙈.

Anyway, I suppose there are still functional things left to do, e.g. exploring whether the raw sample data could be streamed out as it is collected, rather than only emitting the aggregated result at the end. This might allow integrating with tools such as FlameScope.

sabandi commented 3 years ago

Sure @felixge, I will try to understand more about FlameScope and how it can be integrated.

felixge commented 3 years ago

@sabandi cool : ). Happy to review & discuss stuff if you find the time for it. But no worries if not, I know life can be busy : )