jhump / protoreflect

Reflection (Rich Descriptors) for Go Protocol Buffers
Apache License 2.0

feat(desc): use concatenation instead of fmt.Sprintf to append strings #611

Closed tchung1118 closed 6 months ago

tchung1118 commented 6 months ago

MessageDescriptor's FindFieldByName function was identified as one of the hot functions at Uber. It builds a fully qualified name from the given field name and looks that name up in the symbol map the descriptor keeps. The fully qualified name is currently built with fmt.Sprintf.

However, using fmt.Sprintf for simple string concatenation turned out to perform much worse than the '+' operator, because fmt.Sprintf has to parse the format verbs and generally makes more allocations.

This PR updates MessageDescriptor.FindFieldByName to build the fully qualified name used for the field lookup with string concatenation instead of fmt.Sprintf.
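As a rough illustration of the change, here is a minimal, self-contained sketch. The messageDescriptor type, its fqn and symbols fields, and the function names are hypothetical stand-ins and not the library's actual internals; only the two ways of building the lookup key mirror what this PR describes.

```go
package main

import "fmt"

// messageDescriptor is a hypothetical stand-in for the real descriptor type.
type messageDescriptor struct {
	fqn     string            // fully qualified message name, e.g. "example.Msg"
	symbols map[string]string // fully qualified symbol name -> field (simplified to a string)
}

// findFieldSprintf builds the lookup key with fmt.Sprintf (the "before" approach).
func (md *messageDescriptor) findFieldSprintf(name string) (string, bool) {
	f, ok := md.symbols[fmt.Sprintf("%s.%s", md.fqn, name)]
	return f, ok
}

// findFieldConcat builds the same key with plain concatenation (the "after" approach).
func (md *messageDescriptor) findFieldConcat(name string) (string, bool) {
	f, ok := md.symbols[md.fqn+"."+name]
	return f, ok
}

// newExampleDescriptor returns a descriptor with a single registered field.
func newExampleDescriptor() *messageDescriptor {
	return &messageDescriptor{
		fqn:     "example.Msg",
		symbols: map[string]string{"example.Msg.id": "field:id"},
	}
}

func main() {
	md := newExampleDescriptor()
	a, okA := md.findFieldSprintf("id")
	b, okB := md.findFieldConcat("id")
	fmt.Println(a, okA, b, okB)
}
```

Both functions produce the same key and therefore the same lookup result; the only difference is how the key string is built.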

Below are the results of the benchmarks added in this PR:

goos: darwin
goarch: arm64
pkg: github.com/jhump/protoreflect/desc
                                 │  before.txt  │              after.txt               │
                                 │    sec/op    │    sec/op     vs base                │
MessageDescriptorFindField20-10    119.25n ± 2%   45.11n ±  5%  -62.18% (p=0.000 n=10)
MessageDescriptorFindField100-10   141.50n ± 2%   58.28n ± 11%  -58.81% (p=0.000 n=10)
MessageDescriptorFindField500-10    235.3n ± 2%   129.5n ±  6%  -44.94% (p=0.000 n=10)
geomean                             158.3n        69.83n        -55.90%

                                 │ before.txt │             after.txt              │
                                 │    B/op    │    B/op     vs base                │
MessageDescriptorFindField20-10    80.00 ± 0%   48.00 ± 0%  -40.00% (p=0.000 n=10)
MessageDescriptorFindField100-10   160.0 ± 0%   128.0 ± 0%  -20.00% (p=0.000 n=10)
MessageDescriptorFindField500-10   608.0 ± 0%   576.0 ± 0%   -5.26% (p=0.000 n=10)
geomean                            198.2        152.4       -23.10%

                                 │ before.txt │             after.txt              │
                                 │ allocs/op  │ allocs/op   vs base                │
MessageDescriptorFindField20-10    3.000 ± 0%   1.000 ± 0%  -66.67% (p=0.000 n=10)
MessageDescriptorFindField100-10   3.000 ± 0%   1.000 ± 0%  -66.67% (p=0.000 n=10)
MessageDescriptorFindField500-10   3.000 ± 0%   1.000 ± 0%  -66.67% (p=0.000 n=10)
geomean                            3.000        1.000       -66.67%

The benchmarks call MessageDescriptor's FindFieldByName function with field names of varying lengths (the 20, 100, and 500 in the benchmark names). Assuming a realistic field name is shorter than 100 characters, this PR should cut the time per call of this function by more than 50% on average.
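For readers who want to try a comparison of this kind outside the repository, the following is a hedged sketch and not the benchmark code added in this PR. The lookupSprintf, lookupConcat, and fieldName helpers and the "example.pkg.Message" name are assumptions made for illustration. It uses testing.Benchmark so that it runs as a plain program; the numbers it prints depend on the machine and Go version and will not match the tables above.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the compiler from discarding the lookup result.
var sink bool

// lookupSprintf builds the key with fmt.Sprintf (hypothetical "before" variant).
func lookupSprintf(symbols map[string]struct{}, fqn, name string) bool {
	_, ok := symbols[fmt.Sprintf("%s.%s", fqn, name)]
	return ok
}

// lookupConcat builds the key with concatenation (hypothetical "after" variant).
func lookupConcat(symbols map[string]struct{}, fqn, name string) bool {
	_, ok := symbols[fqn+"."+name]
	return ok
}

// fieldName returns a synthetic field name of exactly n characters.
func fieldName(n int) string { return strings.Repeat("f", n) }

func main() {
	const fqn = "example.pkg.Message" // assumed message name
	for _, n := range []int{20, 100, 500} {
		name := fieldName(n)
		symbols := map[string]struct{}{fqn + "." + name: {}}

		before := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				sink = lookupSprintf(symbols, fqn, name)
			}
		})
		after := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				sink = lookupConcat(symbols, fqn, name)
			}
		})

		fmt.Printf("len=%d  Sprintf: %d ns/op, %d allocs/op  concat: %d ns/op, %d allocs/op\n",
			n, before.NsPerOp(), before.AllocsPerOp(), after.NsPerOp(), after.AllocsPerOp())
	}
}
```

A single run like this is noisy. For a statistically meaningful comparison, regular Benchmark functions run several times with go test -bench and compared with benchstat are the better choice.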