yarpc / yarpc-go

A message passing platform for Go
MIT License
401 stars 101 forks source link

use pooled buffer to read response bytes #2232

Closed saurabhagrawal-86 closed 6 months ago

saurabhagrawal-86 commented 8 months ago

Why? This PR aims to reduce the Bytes allocated by the yarpc client for each outbound tchannel request.

How? The existing tchannel transport implementation reads the response byte stream into a new bytes.Buffer object. This results in a new []byte being allocated on each invocation. We can reduce the bytes allocated per operation by using a buffer pool. The buffer is returned to the pool when the tchannel response body is closed. In the case of the thrift encoding, the response body is closed after it has been deserialized.

I've added a new benchmark to measure the impact of this change. The results are attached below. We can see a 45% reduction in B/op and a small (~7%) gain in CPU.

The results with the latest version (v1.31) of thrfitrw

pkg: go.uber.org/yarpc/encoding/thrift
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkThriftClientCallNormalDist/with_buffer_pool-12                     8912            131850 ns/op           18810 B/op        170 allocs/op
BenchmarkThriftClientCallNormalDist/without_buffer_pool-12                  9590            122995 ns/op           34986 B/op        172 allocs/op
PASS

With older version (v1.29.2) of thriftrw, even though the gains appear relatively smaller, notice that the absolute change in B/op remains the same ~16K.

pkg: go.uber.org/yarpc/encoding/thrift
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkThriftClientCallNormalDist/with_buffer_pool-12                 7771        145647 ns/op       30784 B/op        173 allocs/op
BenchmarkThriftClientCallNormalDist/without_buffer_pool-12              7011        148983 ns/op       46625 B/op        176 allocs/op
PASS
CLAassistant commented 8 months ago

CLA assistant check
All committers have signed the CLA.

jronak commented 8 months ago

Is this pull request ready for review? I see commented lines in the code. Can you update the description of what's being changed and why? If there are any benchmarks