go-hep / hep

hep is the mono repository holding all of go-hep.org/x/hep packages and tools
https://go-hep.org
BSD 3-Clause "New" or "Revised" License
231 stars 37 forks source link

groot/internal/rcompress: reuse buffers when possible #1023

Open bburghgr opened 1 week ago

bburghgr commented 1 week ago

Fixes https://github.com/go-hep/hep/issues/712, but the current implementation is a bit delicate (from pre-generics sync use), so I'm not entirely sure it's worth it for the marginal performance improvements.

This reuses buffers for lz4, zlib and zstd. The other compression algs did not appear to expose an API allowing for buffer reuse. All of this should be threadsafe, but hit hasn't been tested very thoroughly. Nothing is flagged if I build groot-bench with -race and generate files or run benchmarks, for whatever that's worth.

Any performance gains would come from reducing allocations, seen in e.g. go tool pprof -alloc-space on a memprofile, along with a reduction in gc frequency from the reduced allocation rate. There's very little change to overall throughput or memory use (small improvements in some cases):

$ benchstat default.txt pooled.txt 
goos: linux
goarch: amd64
pkg: github.com/go-hep/groot-bench
cpu: Intel(R) Core(TM) i5-10300H CPU @ 2.50GHz
                                       │ default.txt │             pooled.txt             │
                                       │   sec/op    │   sec/op     vs base               │
ReadScalar/GoHEP/None-8                  515.0m ± 1%   512.8m ± 1%       ~ (p=0.221 n=20)
ReadScalar/GoHEP/LZ4-8                   516.6m ± 1%   517.1m ± 0%       ~ (p=0.495 n=20)
ReadScalar/GoHEP/Zlib-Lvl1-8             474.5m ± 1%   473.2m ± 1%       ~ (p=0.904 n=20)
ReadScalar/GoHEP/Zlib-Lvl6-8             482.3m ± 1%   479.6m ± 1%       ~ (p=0.355 n=20)
ReadScalar/GoHEP/Zlib-Lvl9-8              1.250 ± 1%    1.213 ± 5%  -2.94% (p=0.000 n=20)
ReadScalar/ROOT-TreeBranch/None-8        941.4m ± 1%   933.2m ± 1%       ~ (p=0.056 n=20)
ReadScalar/ROOT-TreeBranch/LZ4-8         949.4m ± 1%   946.4m ± 1%       ~ (p=0.620 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl1-8   946.8m ± 1%   937.1m ± 1%       ~ (p=0.121 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl6-8   953.2m ± 1%   940.0m ± 1%  -1.38% (p=0.013 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl9-8    1.982 ± 1%    1.972 ± 1%       ~ (p=0.091 n=20)
ReadScalar/ROOT-TreeReader/None-8         1.125 ± 2%    1.105 ± 2%       ~ (p=0.091 n=20)
ReadScalar/ROOT-TreeReader/LZ4-8          1.145 ± 2%    1.137 ± 2%       ~ (p=0.165 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl1-8    1.120 ± 1%    1.115 ± 1%       ~ (p=0.583 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl6-8    1.138 ± 2%    1.136 ± 2%       ~ (p=0.989 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl9-8    2.161 ± 0%    2.153 ± 1%       ~ (p=0.091 n=20)
geomean                                  945.5m        938.2m       -0.78%

                                       │ default.txt │             pooled.txt              │
                                       │   rss/op    │   rss/op     vs base                │
ReadScalar/GoHEP/None-8                  9.984M ± 1%   9.984M ± 0%        ~ (p=0.966 n=20)
ReadScalar/GoHEP/LZ4-8                   9.984M ± 1%   9.984M ± 1%        ~ (p=0.795 n=20)
ReadScalar/GoHEP/Zlib-Lvl1-8             9.984M ± 1%   9.856M ± 1%   -1.28% (p=0.016 n=20)
ReadScalar/GoHEP/Zlib-Lvl6-8             11.26M ± 1%   10.11M ± 1%  -10.23% (p=0.000 n=20)
ReadScalar/GoHEP/Zlib-Lvl9-8             12.88M ± 1%   12.68M ± 1%   -1.58% (p=0.004 n=20)
ReadScalar/ROOT-TreeBranch/None-8        320.3M ± 0%   319.9M ± 0%   -0.10% (p=0.000 n=20)
ReadScalar/ROOT-TreeBranch/LZ4-8         320.5M ± 0%   320.2M ± 0%   -0.08% (p=0.000 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl1-8   320.3M ± 0%   320.1M ± 0%   -0.07% (p=0.001 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl6-8   320.4M ± 0%   320.1M ± 0%   -0.08% (p=0.000 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl9-8   320.2M ± 0%   320.0M ± 0%   -0.08% (p=0.000 n=20)
ReadScalar/ROOT-TreeReader/None-8        330.8M ± 0%   330.9M ± 0%        ~ (p=0.825 n=20)
ReadScalar/ROOT-TreeReader/LZ4-8         331.2M ± 0%   331.3M ± 0%        ~ (p=0.899 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl1-8   331.2M ± 0%   331.2M ± 0%        ~ (p=0.733 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl6-8   331.2M ± 0%   331.2M ± 0%        ~ (p=1.000 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl9-8   331.2M ± 0%   331.2M ± 0%        ~ (p=0.560 n=20)
geomean                                  104.5M        103.5M        -0.93%

                                       │ default.txt │             pooled.txt             │
                                       │   vmem/op   │   vmem/op    vs base               │
ReadScalar/GoHEP/None-8                  1.232G ± 0%   1.232G ± 0%       ~ (p=0.841 n=20)
ReadScalar/GoHEP/LZ4-8                   1.232G ± 0%   1.232G ± 0%  +0.01% (p=0.044 n=20)
ReadScalar/GoHEP/Zlib-Lvl1-8             1.232G ± 0%   1.232G ± 0%  +0.00% (p=0.021 n=20)
ReadScalar/GoHEP/Zlib-Lvl6-8             1.233G ± 0%   1.232G ± 0%  -0.05% (p=0.000 n=20)
ReadScalar/GoHEP/Zlib-Lvl9-8             1.233G ± 0%   1.233G ± 0%  +0.02% (p=0.000 n=20)
ReadScalar/ROOT-TreeBranch/None-8        538.0M ± 0%   538.0M ± 0%       ~ (p=0.175 n=20)
ReadScalar/ROOT-TreeBranch/LZ4-8         538.0M ± 0%   538.0M ± 0%       ~ (p=0.987 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl1-8   538.0M ± 0%   538.0M ± 0%       ~ (p=0.471 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl6-8   538.1M ± 0%   538.1M ± 0%       ~ (p=0.386 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl9-8   538.0M ± 0%   538.0M ± 0%       ~ (p=0.997 n=20)
ReadScalar/ROOT-TreeReader/None-8        555.6M ± 0%   555.6M ± 0%       ~ (p=0.552 n=20)
ReadScalar/ROOT-TreeReader/LZ4-8         555.6M ± 0%   555.6M ± 0%       ~ (p=0.136 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl1-8   555.7M ± 0%   555.7M ± 0%       ~ (p=0.374 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl6-8   555.7M ± 0%   555.7M ± 0%       ~ (p=0.393 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl9-8   555.6M ± 0%   555.6M ± 0%       ~ (p=0.473 n=20)
geomean                                  716.9M        716.9M       -0.00%

                                       │ default.txt │             pooled.txt             │
                                       │   wall/op   │   wall/op    vs base               │
ReadScalar/GoHEP/None-8                  513.8m ± 1%   511.8m ± 1%       ~ (p=0.280 n=20)
ReadScalar/GoHEP/LZ4-8                   514.3m ± 1%   515.3m ± 1%       ~ (p=0.888 n=20)
ReadScalar/GoHEP/Zlib-Lvl1-8             475.8m ± 1%   474.1m ± 1%       ~ (p=0.602 n=20)
ReadScalar/GoHEP/Zlib-Lvl6-8             482.4m ± 1%   480.0m ± 0%       ~ (p=0.433 n=20)
ReadScalar/GoHEP/Zlib-Lvl9-8              1.250 ± 1%    1.212 ± 4%  -2.96% (p=0.000 n=20)
ReadScalar/ROOT-TreeBranch/None-8        935.6m ± 2%   935.1m ± 1%       ~ (p=0.718 n=20)
ReadScalar/ROOT-TreeBranch/LZ4-8         947.1m ± 2%   950.1m ± 2%       ~ (p=0.947 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl1-8   952.2m ± 2%   932.4m ± 1%  -2.09% (p=0.013 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl6-8   945.6m ± 2%   933.4m ± 0%  -1.28% (p=0.006 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl9-8    1.982 ± 1%    1.972 ± 1%       ~ (p=0.082 n=20)
ReadScalar/ROOT-TreeReader/None-8         1.124 ± 2%    1.106 ± 2%       ~ (p=0.090 n=20)
ReadScalar/ROOT-TreeReader/LZ4-8          1.145 ± 2%    1.136 ± 2%       ~ (p=0.151 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl1-8    1.121 ± 1%    1.115 ± 1%       ~ (p=0.597 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl6-8    1.137 ± 2%    1.135 ± 2%       ~ (p=0.984 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl9-8    2.160 ± 0%    2.153 ± 1%       ~ (p=0.087 n=20)
geomean                                  944.5m        937.5m       -0.74%
sbinet commented 1 week ago

yeah, the results are a bit of a let down.

perhaps we could revisit this when/if a generic sync.Map is introduced, as described in: