Closed greatroar closed 2 years ago
Another optimization for the amd64 decoder, inspired by one of its comments:
name                 old speed      new speed      delta
UncompressPg1661-8   1.15GB/s ± 1%  1.19GB/s ± 1%   +3.39%  (p=0.000 n=10+10)
UncompressDigits-8   1.89GB/s ± 0%  2.33GB/s ± 1%  +23.46%  (p=0.000 n=9+10)
UncompressTwain-8    1.19GB/s ± 1%  1.23GB/s ± 0%   +3.43%  (p=0.000 n=10+10)
UncompressRand-8     3.93GB/s ± 2%  3.96GB/s ± 1%     ~     (p=0.105 n=10+10)
The effect is most pronounced on Digits because 37.4% of its literals have lengths 17-48. In Twain and Pg1661, this is <4.1%.
This is faster than copying 32 bytes. At 64 bytes, Digits gets faster still, while Twain and Pg1661 get slightly slower.
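For readers unfamiliar with the technique, here is a minimal Go sketch of a fixed-width literal copy. It is an assumption about the shape of the change, inferred from the 17-48 length range above, and not the actual amd64 assembly: the name `copyLiteral`, the constant `fastCopyWidth`, and the value 48 are all hypothetical. The idea is that when both buffers have enough slack, copying a constant number of bytes avoids a length-dependent loop, and the bytes written past the literal's real length are overwritten by later output.

```go
package main

// fastCopyWidth is a hypothetical fixed copy width. 48 is assumed here
// because the description singles out literal lengths 17-48.
const fastCopyWidth = 48

// copyLiteral copies the n-byte literal at src[s:] to dst[d:] and reports
// whether the fixed-width fast path was taken. This is an illustrative
// sketch, not the decoder's real code.
func copyLiteral(dst, src []byte, d, s, n int) (fast bool) {
	if n <= fastCopyWidth &&
		len(dst)-d >= fastCopyWidth &&
		len(src)-s >= fastCopyWidth {
		// Both buffers have at least fastCopyWidth bytes left, so copy a
		// constant amount. Bytes beyond n are scratch: the caller advances
		// d by n only, so later writes overwrite them.
		copy(dst[d:d+fastCopyWidth], src[s:s+fastCopyWidth])
		return true
	}
	// Not enough slack, or the literal is too long: copy exactly n bytes.
	copy(dst[d:d+n], src[s:s+n])
	return false
}
```

In this sketch the caller advances both offsets by `n` regardless of which path ran. A wider constant covers more literals but writes more scratch bytes per short literal, which is consistent with the mixed results reported at 64 bytes.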