golang / snappy

The Snappy compression format in the Go programming language.
BSD 3-Clause "New" or "Revised" License

amd64 decoder: faster length checks #47

klauspost closed 11 months ago

klauspost commented 5 years ago

R13 contains the pointer to the end of the source buffer, so there is no need to calculate it on every check when reading.
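
As a rough illustration of the idea, here is a Go model of the two styles of bounds check. It is not the assembly from this change. It assumes the earlier checks derived an offset from the current read pointer and the buffer base before comparing against the length; the names `oldCheck`, `newCheck`, `base`, `length` and `cursor` are hypothetical and addresses are modelled as plain integers.

```go
package main

import "fmt"

// oldCheck models a bounds test that first derives the offset from the
// cursor and the base address, then compares it with the buffer length.
// It assumes cursor >= base. True means "past the end of the source".
func oldCheck(base, length, cursor uintptr) bool {
	s := cursor - base // extra subtraction on every check
	return s > length
}

// newCheck models the cheaper test: compare the cursor directly against a
// precomputed end address (the role R13 plays in the description above).
func newCheck(end, cursor uintptr) bool {
	return cursor > end
}

func main() {
	const base, length = uintptr(0x1000), uintptr(64)
	end := base + length // computed once, before the decode loop

	// Both checks should agree for every cursor at or after base.
	for _, cursor := range []uintptr{base, base + 10, end, end + 1} {
		fmt.Println(cursor-base, oldCheck(base, length, cursor), newCheck(end, cursor))
	}
}
```

The two functions give the same answer for any cursor at or beyond the base address; the second simply avoids recomputing a value that is already available.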

The performance gain depends on how often this code path is hit, but it is often around 8% added throughput.

>benchstat old.txt new.txt
name        old time/op    new time/op    delta
_UFlat0-8     54.5µs ± 5%    50.0µs ± 3%  -8.34%  (p=0.000 n=10+10)
_UFlat1-8      577µs ± 6%     541µs ± 2%  -6.19%  (p=0.000 n=10+10)
_UFlat2-8     8.59µs ± 2%    8.74µs ± 5%    ~      (p=0.287 n=9+10)
_UFlat3-8      129ns ± 4%     125ns ± 3%  -3.45%    (p=0.001 n=9+9)
_UFlat4-8     8.36µs ± 3%    7.68µs ± 4%  -8.15%   (p=0.000 n=9+10)
_UFlat5-8      239µs ± 4%     219µs ± 1%  -8.14%   (p=0.000 n=10+9)
_UFlat6-8      212µs ± 3%     206µs ± 2%  -2.99%  (p=0.001 n=10+10)
_UFlat7-8      178µs ± 1%     175µs ± 1%  -1.90%    (p=0.000 n=9+9)
_UFlat8-8      565µs ± 5%     552µs ± 2%    ~     (p=0.052 n=10+10)
_UFlat9-8      752µs ± 2%     739µs ± 2%  -1.79%  (p=0.007 n=10+10)
_UFlat10-8    48.1µs ± 2%    43.9µs ± 2%  -8.74%  (p=0.000 n=10+10)
_UFlat11-8     199µs ± 4%     197µs ± 2%    ~     (p=0.436 n=10+10)

name        old speed      new speed      delta
_UFlat0-8   1.88GB/s ± 5%  2.05GB/s ± 3%  +9.05%  (p=0.000 n=10+10)
_UFlat1-8   1.22GB/s ± 6%  1.30GB/s ± 2%  +6.49%  (p=0.000 n=10+10)
_UFlat2-8   14.3GB/s ± 2%  14.1GB/s ± 5%    ~      (p=0.315 n=9+10)
_UFlat3-8   1.53GB/s ± 7%  1.59GB/s ± 5%  +3.85%  (p=0.005 n=10+10)
_UFlat4-8   12.2GB/s ± 5%  13.3GB/s ± 4%  +9.50%  (p=0.000 n=10+10)
_UFlat5-8   1.72GB/s ± 4%  1.87GB/s ± 1%  +8.80%   (p=0.000 n=10+9)
_UFlat6-8    716MB/s ± 3%   738MB/s ± 2%  +3.06%  (p=0.001 n=10+10)
_UFlat7-8    702MB/s ± 1%   715MB/s ± 1%  +1.93%    (p=0.000 n=9+9)
_UFlat8-8    756MB/s ± 5%   773MB/s ± 2%  +2.28%  (p=0.050 n=10+10)
_UFlat9-8    641MB/s ± 2%   652MB/s ± 2%  +1.81%  (p=0.006 n=10+10)
_UFlat10-8  2.47GB/s ± 2%  2.70GB/s ± 2%  +9.56%  (p=0.000 n=10+10)
_UFlat11-8   928MB/s ± 4%   936MB/s ± 2%    ~     (p=0.436 n=10+10)