hyperium / tonic

A native gRPC client & server implementation with async/await support.
https://docs.rs/tonic
MIT License

use internal buffer's Buf impl for codec buffers #1695

Closed ClementTsang closed 1 month ago

ClementTsang commented 2 months ago

Motivation

The codec's EncodeBuf and DecodeBuf each wrap an internal BytesMut, whose BufMut/Buf implementations look more optimized than the default trait methods the wrappers currently fall back to. We can take advantage of those implementations by delegating to them.

Solution

For DecodeBuf, we can reuse BytesMut's copy_to_bytes implementation; we just need to update DecodeBuf's own length as well.
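
To illustrate the delegation pattern, here is a minimal std-only sketch. Everything in it is a hypothetical stand-in rather than the real API: the local `Buf` trait only mimics the shape of `bytes::Buf`, `copy_to_vec` stands in for `copy_to_bytes`, `InnerBuf` plays the role of `BytesMut`, and `LimitedView` plays the role of `DecodeBuf`. It is not the code from this PR.

```rust
/// Simplified stand-in for `bytes::Buf` (assumption: NOT the real trait).
trait Buf {
    fn remaining(&self) -> usize;
    fn chunk(&self) -> &[u8];
    fn advance(&mut self, cnt: usize);

    /// Generic fallback (stand-in for the default `copy_to_bytes`):
    /// copies bytes out through `chunk` and `advance`.
    fn copy_to_vec(&mut self, len: usize) -> Vec<u8> {
        assert!(len <= self.remaining());
        let mut out = Vec::with_capacity(len);
        while out.len() < len {
            let n = {
                let chunk = self.chunk();
                let n = chunk.len().min(len - out.len());
                out.extend_from_slice(&chunk[..n]);
                n
            };
            self.advance(n);
        }
        out
    }
}

/// Hypothetical inner buffer playing the role of `BytesMut`.
struct InnerBuf {
    data: Vec<u8>,
}

impl Buf for InnerBuf {
    fn remaining(&self) -> usize {
        self.data.len()
    }
    fn chunk(&self) -> &[u8] {
        &self.data
    }
    fn advance(&mut self, cnt: usize) {
        self.data.drain(..cnt);
    }
    /// Specialized override: split the front off in a single step.
    fn copy_to_vec(&mut self, len: usize) -> Vec<u8> {
        let tail = self.data.split_off(len);
        std::mem::replace(&mut self.data, tail)
    }
}

/// Hypothetical wrapper playing the role of `DecodeBuf`:
/// a view of the first `len` bytes of the inner buffer.
struct LimitedView<'a> {
    buf: &'a mut InnerBuf,
    len: usize,
}

impl Buf for LimitedView<'_> {
    fn remaining(&self) -> usize {
        self.len
    }
    fn chunk(&self) -> &[u8] {
        let c = self.buf.chunk();
        &c[..c.len().min(self.len)]
    }
    fn advance(&mut self, cnt: usize) {
        assert!(cnt <= self.len);
        self.buf.advance(cnt);
        self.len -= cnt;
    }
    /// Delegate to the inner buffer's specialized method and keep the
    /// wrapper's own length in sync.
    fn copy_to_vec(&mut self, len: usize) -> Vec<u8> {
        assert!(len <= self.len);
        self.len -= len;
        self.buf.copy_to_vec(len)
    }
}
```

The detail that matters is the `self.len -= len` line: the wrapper tracks its own limit separately from the inner buffer, so forwarding the call without adjusting the limit would leave `remaining()` overstating how many bytes the view still covers.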

For EncodeBuf, we should be able to just reuse all of BytesMut's implementation of BufMut, which looks a bit simpler than the default implementation.
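
The forwarding idea on the write side can be sketched the same way. This is again a std-only model under stated assumptions: the local `BufMut` trait is a simplified stand-in for `bytes::BufMut` (the real trait has more methods and an unsafe contract), `GrowableBuf` is a hypothetical stand-in for `BytesMut`, and `ForwardingBuf` is a hypothetical stand-in for `EncodeBuf`.

```rust
/// Simplified stand-in for `bytes::BufMut` (assumption: NOT the real trait).
trait BufMut {
    fn put_u8(&mut self, byte: u8);

    /// Generic fallback: write the slice one byte at a time.
    fn put_slice(&mut self, src: &[u8]) {
        for &b in src {
            self.put_u8(b);
        }
    }
}

/// Hypothetical inner buffer playing the role of `BytesMut`.
struct GrowableBuf {
    data: Vec<u8>,
}

impl BufMut for GrowableBuf {
    fn put_u8(&mut self, byte: u8) {
        self.data.push(byte);
    }
    /// Specialized override: append the whole slice in one call.
    fn put_slice(&mut self, src: &[u8]) {
        self.data.extend_from_slice(src);
    }
}

/// Hypothetical wrapper playing the role of `EncodeBuf`.
struct ForwardingBuf<'a> {
    buf: &'a mut GrowableBuf,
}

impl BufMut for ForwardingBuf<'_> {
    fn put_u8(&mut self, byte: u8) {
        self.buf.put_u8(byte)
    }
    /// Without this override the wrapper would use the trait's
    /// byte-at-a-time default instead of the inner buffer's override.
    fn put_slice(&mut self, src: &[u8]) {
        self.buf.put_slice(src)
    }
}
```

Unlike the read side, this wrapper carries no extra state in the sketch, so every method can be forwarded unchanged.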

Out of curiosity, I tried running the benchmarks:

Before:

test chunk_size_100   ... bench:         514 ns/iter (+/- 34) = 1955 MB/s
test chunk_size_1005  ... bench:         340 ns/iter (+/- 29) = 2955 MB/s
test chunk_size_500   ... bench:         378 ns/iter (+/- 22) = 2658 MB/s
test message_count_1  ... bench:         317 ns/iter (+/- 27) = 1593 MB/s
test message_count_10 ... bench:       1,130 ns/iter (+/- 125) = 4469 MB/s
test message_count_20 ... bench:       2,076 ns/iter (+/- 94) = 4865 MB/s
test message_size_10k ... bench:       1,558 ns/iter (+/- 242) = 12843 MB/s
test message_size_1k  ... bench:         454 ns/iter (+/- 24) = 4427 MB/s
test message_size_5k  ... bench:         834 ns/iter (+/- 119) = 12002 MB/s

After:

test chunk_size_100   ... bench:         501 ns/iter (+/- 42) = 2005 MB/s
test chunk_size_1005  ... bench:         328 ns/iter (+/- 38) = 3064 MB/s
test chunk_size_500   ... bench:         363 ns/iter (+/- 24) = 2768 MB/s
test message_count_1  ... bench:         308 ns/iter (+/- 22) = 1639 MB/s
test message_count_10 ... bench:       1,133 ns/iter (+/- 89) = 4457 MB/s
test message_count_20 ... bench:       2,013 ns/iter (+/- 139) = 5017 MB/s
test message_size_10k ... bench:       1,493 ns/iter (+/- 215) = 13402 MB/s
test message_size_1k  ... bench:         438 ns/iter (+/- 31) = 4589 MB/s
test message_size_5k  ... bench:         790 ns/iter (+/- 132) = 12670 MB/s
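
For a rough read of the two runs, the helpers below compute the relative change in ns/iter and compare the gap against the reported `+/-` values. Both function names are hypothetical, and treating the sum of the two `+/-` values as a noise threshold is an assumption made for illustration, not a statistical test.

```rust
/// Relative change in ns/iter from `before` to `after`, in percent.
/// Negative values mean the "after" run was faster.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// Hypothetical heuristic: treat two runs as indistinguishable when the gap
/// between them is no larger than the sum of their reported `+/-` values.
fn within_noise(before: f64, before_spread: f64, after: f64, after_spread: f64) -> bool {
    (after - before).abs() <= before_spread + after_spread
}
```

For chunk_size_100, `percent_change(514.0, 501.0)` is about -2.5, and `within_noise(514.0, 34.0, 501.0, 42.0)` returns true because the 13 ns gap is smaller than 34 + 42. Under this heuristic, that row on its own does not show a clear difference.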