Closed lizthegrey closed 2 years ago
benchcmp stats:

amd64:

benchmark                        old ns/op     new ns/op     delta
BenchmarkUncompress-12           5.96          5.95          -0.25%
BenchmarkUncompressPg1661-12     264289        260036        -1.61%
BenchmarkUncompressDigits-12     29851         29641         -0.70%
BenchmarkUncompressTwain-12      167690        164982        -1.61%
BenchmarkUncompressRand-12       4572          4132          -9.62%

benchmark                        old MB/s      new MB/s      speedup
BenchmarkUncompressPg1661-12     1427.86       1451.21       1.02x
BenchmarkUncompressDigits-12     3192.04       3214.57       1.01x
BenchmarkUncompressTwain-12      1529.03       1554.13       1.02x
BenchmarkUncompressRand-12       3587.52       3969.48       1.11x

benchmark                        old allocs    new allocs    delta
BenchmarkUncompress-12           0             0             +0.00%
BenchmarkUncompressPg1661-12     4             4             +0.00%
BenchmarkUncompressDigits-12     4             4             +0.00%
BenchmarkUncompressTwain-12      4             4             +0.00%
BenchmarkUncompressRand-12       4             4             +0.00%

benchmark                        old bytes     new bytes     delta
BenchmarkUncompress-12           0             0             +0.00%
BenchmarkUncompressPg1661-12     184           184           +0.00%
BenchmarkUncompressDigits-12     184           190           +3.26%
BenchmarkUncompressTwain-12      184           184           +0.00%
BenchmarkUncompressRand-12       185           185           +0.00%

arm64:

benchmark                        old ns/op     new ns/op     delta
BenchmarkUncompress-4            9.21          9.13          -0.88%
BenchmarkUncompressPg1661-4      946356        954336        +0.84%
BenchmarkUncompressDigits-4      62271         61885         -0.62%
BenchmarkUncompressTwain-4       598823        599040        +0.04%
BenchmarkUncompressRand-4        4577          4510          -1.46%

benchmark                        old MB/s      new MB/s      speedup
BenchmarkUncompressPg1661-4      398.76        395.42        0.99x
BenchmarkUncompressDigits-4      1530.14       1539.69       1.01x
BenchmarkUncompressTwain-4       428.18        428.02        1.00x
BenchmarkUncompressRand-4        3583.71       3637.39       1.01x

benchmark                        old allocs    new allocs    delta
BenchmarkUncompress-4            0             0             +0.00%
BenchmarkUncompressPg1661-4      4             4             +0.00%
BenchmarkUncompressDigits-4      4             4             +0.00%
BenchmarkUncompressTwain-4       4             4             +0.00%
BenchmarkUncompressRand-4        4             4             +0.00%

benchmark                        old bytes     new bytes     delta
BenchmarkUncompress-4            0             0             +0.00%
BenchmarkUncompressPg1661-4      184           184           +0.00%
BenchmarkUncompressDigits-4      184           197           +7.07%
BenchmarkUncompressTwain-4       707           184           -73.97%
BenchmarkUncompressRand-4        186           185           -0.54%

However, this will have a much larger effect on longer files, where Read() is called many more times.
Hm, this didn't have the effect I wanted at scale. I'll keep tweaking.
See https://gist.github.com/lizthegrey/0ce7f8cd4a70ecedb5c299dfc0332976 for the full disassembly.
Running the deferred state.check() on every call incurs a huge amount of call overhead, which can be avoided when err is nil.