This repository is an example of pursuing better deserialization performance for the well-known Protobuf-encoded Prometheus `WriteRequest`.
As a baseline, the deserialization implementation generated by `prost` takes about 7.3ms to decode a `WriteRequest` with 10k timeseries in our test environment.
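For context on what the decoder spends its time on, here is a minimal sketch of the two Protobuf wire-format primitives that dominate a message like this: varint decoding and reading a length-delimited field. It is not the code `prost` generates; the function names `decode_varint` and `read_len_delimited` are made up for illustration, and the field number in the example is arbitrary. Note that the payload is returned as a borrowed slice, so nothing is copied at this level.

```rust
/// Decode a base-128 varint from the front of `buf`.
/// Returns the value and the number of bytes consumed,
/// or `None` if the input is truncated or longer than 10 bytes.
fn decode_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        // The low 7 bits carry data; the high bit says "more bytes follow".
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Read one length-delimited field (wire type 2) and borrow its payload.
/// Returns (field number, payload, remaining input).
fn read_len_delimited(buf: &[u8]) -> Option<(u32, &[u8], &[u8])> {
    let (key, key_len) = decode_varint(buf)?;
    if key & 0x7 != 2 {
        return None; // not a length-delimited field
    }
    let field = (key >> 3) as u32;
    let rest = &buf[key_len..];
    let (len, len_len) = decode_varint(rest)?;
    let len = usize::try_from(len).ok()?;
    let rest = &rest[len_len..];
    if rest.len() < len {
        return None; // truncated payload
    }
    let (payload, remaining) = rest.split_at(len);
    Some((field, payload, remaining))
}

fn main() {
    // Key 0x0a = field 1, wire type 2; length 3; payload "abc".
    let input = [0x0a, 0x03, b'a', b'b', b'c'];
    if let Some((field, payload, remaining)) = read_len_delimited(&input) {
        println!("field {}: {:?}, {} bytes left", field, payload, remaining.len());
    }
}
```

Strings and nested messages are all encoded as length-delimited fields, so a decoder that turns each payload into an owned `String` or `Vec` pays for an allocation and a copy per field.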
This repository contains several branches, each applying one optimization step:
- `step1/reproduce`: reproduce the baseline
- `step2/repeated_field`: use `RepeatedField` to replace `Vec` for `repeated` fields to enable pooling
- `step3/bytes`: use `bytes` to facilitate zero-copy deserialization
- `step4/bytes-eliminate-one-copy`: eliminate one `copy_to_bytes` invocation
- `step5/bench-bytes-slice`: reproduce the overhead introduced by `Bytes::slice`
- `step6/optimize-slice`: hack `Bytes::slice` while still upholding the lifetime guarantee

In the end, we cut the deserialization cost from 7.3ms to 1.6ms.
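To illustrate the pooling idea behind the `RepeatedField` step, here is a minimal sketch using only the standard library. `PooledVec` is a hypothetical stand-in written for this example, not the type used in the branches: its `clear` only resets a logical length, so elements allocated by a previous decode (and their heap buffers) are reused by the next one instead of being dropped and reallocated.

```rust
/// Hypothetical pooled container: `clear` keeps the elements alive
/// so that a later decode can overwrite them in place.
#[derive(Default)]
struct PooledVec<T> {
    items: Vec<T>, // every element ever allocated
    len: usize,    // number of logically live elements
}

impl<T: Default> PooledVec<T> {
    /// Forget the live elements without dropping them.
    fn clear(&mut self) {
        self.len = 0;
    }

    /// Hand out the next slot, reusing an old element when one exists.
    /// A reused slot still holds stale data; the caller must overwrite it.
    fn push_default(&mut self) -> &mut T {
        if self.len == self.items.len() {
            self.items.push(T::default());
        }
        self.len += 1;
        &mut self.items[self.len - 1]
    }

    /// View only the logically live elements.
    fn as_slice(&self) -> &[T] {
        &self.items[..self.len]
    }
}

fn main() {
    let mut labels: PooledVec<String> = PooledVec::default();
    // Simulate decoding two requests into the same container.
    for round in 0..2 {
        labels.clear();
        for name in ["job", "instance"] {
            let slot = labels.push_default();
            slot.clear(); // keeps the String's capacity
            slot.push_str(name);
        }
        println!("round {}: {:?}", round, labels.as_slice());
    }
}
```

In the second round no `String` is allocated: each slot is cleared and refilled within its existing capacity. The trade-off is that memory stays at its high-water mark until the container itself is dropped.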