The primary change is using a fixed-size buffer allocation, which doubles the speed on OSX-x64 and gives a 50% speedup on Linux-x64.
I ran some benchmarks and CPU profiles looking for where time was being spent. More time than I expected was going to encodeVarint, and the profiler showed that the large majority of that was spent allocating a byte slice on each call.
I wrote a benchmark and then changed the code to use a byte array; the speedups looked like:
// OSX-x64 before buffer change
BenchmarkEncodeVarint 20000000 62.6 ns/op
// OSX-x64 after buffer change
BenchmarkEncodeVarint 50000000 33.9 ns/op
// Linux-x64 before buffer change
BenchmarkEncodeVarint 20000000 99.2 ns/op
// Linux-x64 after buffer change
BenchmarkEncodeVarint 20000000 63.7 ns/op
Not bad: a 2x speedup on the host machine and 1.5x on the Linux machine, so at least for x86-64 this seems like a win. Out of curiosity, I also made a 32-bit variant to see if it would be any faster. It was only about 5% faster on my OSX host, so I was about to throw it away, but when I went back to my Linux VM, sure enough, Linux saw a noticeable speedup.
Given that 32-bit ints used to go through the same mechanism, encoding a 32-bit int is now almost 3x as fast as it used to be on Linux, so I'm keeping the 32-bit variant as well.