valyala / gozstd

go wrapper for zstd
MIT License

Use go slices #49

Open mhr3 opened 1 year ago

mhr3 commented 1 year ago

Fixes #48, #33

I've rewritten how the CGO wrapper is done to achieve two things:

Here are the benchmark results against master (run on an M1 using Go 1.18):

CPU time

```
name                                          old time/op  new time/op  delta
DecompressDict/blockSize_1/level_3-8          21.3ns ± 2%  25.5ns ± 2%  +19.57%  (p=0.016 n=4+5)
DecompressDict/blockSize_1/level_5-8          21.0ns ± 2%  25.2ns ± 2%  +20.02%  (p=0.016 n=4+5)
DecompressDict/blockSize_1/level_10-8         21.3ns ± 5%  25.4ns ± 3%  +19.18%  (p=0.016 n=4+5)
DecompressDict/blockSize_10/level_3-8         20.8ns ± 4%  25.2ns ± 1%  +20.92%  (p=0.016 n=4+5)
DecompressDict/blockSize_10/level_5-8         20.9ns ± 3%  25.8ns ± 2%  +23.59%  (p=0.016 n=4+5)
DecompressDict/blockSize_10/level_10-8        26.7ns ± 2%  31.2ns ± 1%  +16.90%  (p=0.016 n=4+5)
DecompressDict/blockSize_100/level_3-8        26.7ns ± 0%  32.1ns ± 1%  +20.39%  (p=0.016 n=4+5)
DecompressDict/blockSize_100/level_5-8        24.7ns ± 0%  30.6ns ± 1%  +24.08%  (p=0.016 n=4+5)
DecompressDict/blockSize_100/level_10-8       23.5ns ± 0%  28.8ns ± 1%  +22.55%  (p=0.016 n=4+5)
DecompressDict/blockSize_1000/level_3-8       165ns ± 0%   175ns ± 0%   +6.00%   (p=0.016 n=4+5)
DecompressDict/blockSize_1000/level_5-8       416ns ± 1%   425ns ± 1%   +2.33%   (p=0.016 n=4+5)
DecompressDict/blockSize_1000/level_10-8      424ns ± 4%   421ns ± 1%   ~        (p=1.000 n=4+5)
DecompressDict/blockSize_10000/level_3-8      1.51µs ± 2%  1.51µs ± 1%  ~        (p=0.952 n=4+5)
DecompressDict/blockSize_10000/level_5-8      1.71µs ± 0%  1.72µs ± 1%  +0.62%   (p=0.032 n=4+5)
DecompressDict/blockSize_10000/level_10-8     1.61µs ± 0%  1.61µs ± 1%  ~        (p=0.952 n=4+5)
DecompressDict/blockSize_100000/level_3-8     13.8µs ± 1%  13.7µs ± 1%  ~        (p=0.286 n=4+5)
DecompressDict/blockSize_100000/level_5-8     13.5µs ± 1%  13.6µs ± 1%  ~        (p=1.000 n=4+5)
DecompressDict/blockSize_100000/level_10-8    12.5µs ± 1%  12.5µs ± 3%  ~        (p=0.730 n=4+5)
DecompressDict/blockSize_300000/level_3-8     37.8µs ± 1%  37.6µs ± 1%  ~        (p=0.413 n=4+5)
DecompressDict/blockSize_300000/level_5-8     44.2µs ± 1%  43.7µs ± 0%  ~        (p=0.095 n=4+5)
DecompressDict/blockSize_300000/level_10-8    36.4µs ± 1%  36.1µs ± 1%  ~        (p=0.190 n=4+5)
CompressDict/blockSize_1/level_3-8            38.9ns ± 0%  46.0ns ± 1%  +18.26%  (p=0.016 n=4+5)
CompressDict/blockSize_1/level_5-8            38.9ns ± 0%  45.9ns ± 3%  +17.95%  (p=0.016 n=4+5)
CompressDict/blockSize_1/level_10-8           39.5ns ± 1%  46.2ns ± 0%  +16.81%  (p=0.016 n=4+5)
CompressDict/blockSize_10/level_3-8           159ns ± 1%   167ns ± 2%   +5.36%   (p=0.016 n=4+5)
CompressDict/blockSize_10/level_5-8           170ns ± 1%   178ns ± 2%   +4.85%   (p=0.016 n=4+5)
CompressDict/blockSize_10/level_10-8          178ns ± 1%   185ns ± 1%   +4.29%   (p=0.016 n=4+5)
CompressDict/blockSize_100/level_3-8          81.0ns ± 0%  89.0ns ± 1%  +9.91%   (p=0.016 n=4+5)
CompressDict/blockSize_100/level_5-8          199ns ± 1%   206ns ± 1%   +3.51%   (p=0.016 n=4+5)
CompressDict/blockSize_100/level_10-8         186ns ± 1%   192ns ± 1%   +3.40%   (p=0.016 n=4+5)
CompressDict/blockSize_1000/level_3-8         399ns ± 1%   410ns ± 1%   +2.54%   (p=0.016 n=4+5)
CompressDict/blockSize_1000/level_5-8         1.26µs ± 1%  1.28µs ± 0%  +0.91%   (p=0.016 n=4+5)
CompressDict/blockSize_1000/level_10-8        2.33µs ± 2%  2.35µs ± 2%  ~        (p=0.905 n=4+5)
CompressDict/blockSize_10000/level_3-8        4.09µs ± 2%  4.10µs ± 1%  ~        (p=0.952 n=4+5)
CompressDict/blockSize_10000/level_5-8        10.1µs ± 0%  10.0µs ± 1%  -0.88%   (p=0.000 n=4+5)
CompressDict/blockSize_10000/level_10-8       24.4µs ± 1%  24.3µs ± 0%  ~        (p=0.190 n=4+5)
CompressDict/blockSize_100000/level_3-8       36.7µs ± 1%  36.6µs ± 1%  ~        (p=0.413 n=4+5)
CompressDict/blockSize_100000/level_5-8       94.3µs ± 1%  93.2µs ± 1%  -1.12%   (p=0.032 n=4+5)
CompressDict/blockSize_100000/level_10-8      159µs ± 1%   159µs ± 1%   ~        (p=0.730 n=4+5)
CompressDict/blockSize_300000/level_3-8       149µs ± 1%   149µs ± 1%   ~        (p=0.730 n=4+5)
CompressDict/blockSize_300000/level_5-8       346µs ± 1%   344µs ± 1%   ~        (p=0.413 n=4+5)
CompressDict/blockSize_300000/level_10-8      1.21ms ± 1%  1.19ms ± 1%  -1.45%   (p=0.016 n=4+5)
Compress/blockSize_1/level_3-8                25.0ns ± 0%  29.8ns ± 0%  +19.10%  (p=0.016 n=4+5)
Compress/blockSize_1/level_5-8                25.1ns ± 0%  29.8ns ± 0%  +18.95%  (p=0.016 n=4+5)
Compress/blockSize_1/level_10-8               25.4ns ± 0%  30.3ns ± 1%  +19.24%  (p=0.016 n=4+5)
Compress/blockSize_10/level_3-8               42.1ns ± 0%  48.1ns ± 0%  +14.33%  (p=0.016 n=4+5)
Compress/blockSize_10/level_5-8               43.0ns ± 1%  48.8ns ± 1%  +13.27%  (p=0.016 n=4+5)
Compress/blockSize_10/level_10-8              44.7ns ± 2%  50.2ns ± 1%  +12.24%  (p=0.016 n=4+5)
Compress/blockSize_100/level_3-8              172ns ± 1%   178ns ± 1%   +3.67%   (p=0.016 n=4+5)
Compress/blockSize_100/level_5-8              236ns ± 0%   242ns ± 1%   +2.69%   (p=0.016 n=4+5)
Compress/blockSize_100/level_10-8             263ns ± 1%   270ns ± 1%   +2.81%   (p=0.016 n=4+5)
Compress/blockSize_1000/level_3-8             818ns ± 1%   821ns ± 1%   ~        (p=0.730 n=4+5)
Compress/blockSize_1000/level_5-8             1.22µs ± 1%  1.20µs ± 1%  -1.66%   (p=0.016 n=4+5)
Compress/blockSize_1000/level_10-8            1.72µs ± 1%  1.74µs ± 1%  ~        (p=0.079 n=4+5)
Compress/blockSize_10000/level_3-8            4.38µs ± 1%  4.41µs ± 1%  ~        (p=0.413 n=4+5)
Compress/blockSize_10000/level_5-8            9.54µs ± 1%  9.09µs ± 1%  -4.68%   (p=0.016 n=4+5)
Compress/blockSize_10000/level_10-8           39.6µs ± 3%  39.1µs ± 1%  ~        (p=0.730 n=4+5)
Compress/blockSize_100000/level_3-8           35.6µs ± 1%  35.2µs ± 3%  ~        (p=0.556 n=4+5)
Compress/blockSize_100000/level_5-8           89.5µs ± 1%  86.2µs ± 1%  -3.69%   (p=0.016 n=4+5)
Compress/blockSize_100000/level_10-8          462µs ± 0%   459µs ± 1%   ~        (p=0.190 n=4+5)
Compress/blockSize_300000/level_3-8           123µs ± 1%   122µs ± 1%   -1.10%   (p=0.032 n=4+5)
Compress/blockSize_300000/level_5-8           319µs ± 1%   322µs ± 3%   ~        (p=0.556 n=4+5)
Compress/blockSize_300000/level_10-8          1.08ms ± 1%  1.09ms ± 3%  ~        (p=0.730 n=4+5)
Decompress/blockSize_1/level_3-8              20.6ns ± 3%  23.8ns ± 3%  +15.18%  (p=0.016 n=4+5)
Decompress/blockSize_1/level_5-8              20.7ns ± 2%  23.7ns ± 4%  +14.34%  (p=0.016 n=4+5)
Decompress/blockSize_1/level_10-8             20.5ns ± 1%  23.2ns ± 4%  +13.09%  (p=0.016 n=4+5)
Decompress/blockSize_10/level_3-8             21.2ns ± 3%  23.4ns ± 3%  +10.56%  (p=0.016 n=4+5)
Decompress/blockSize_10/level_5-8             21.2ns ± 2%  23.8ns ± 4%  +11.96%  (p=0.016 n=4+5)
Decompress/blockSize_10/level_10-8            20.7ns ± 3%  23.6ns ± 2%  +14.14%  (p=0.016 n=4+5)
Decompress/blockSize_100/level_3-8            35.2ns ± 0%  39.9ns ± 0%  +13.37%  (p=0.016 n=4+5)
Decompress/blockSize_100/level_5-8            35.5ns ± 0%  40.0ns ± 1%  +12.54%  (p=0.016 n=4+5)
Decompress/blockSize_100/level_10-8           35.8ns ± 1%  40.2ns ± 1%  +12.27%  (p=0.016 n=4+5)
Decompress/blockSize_1000/level_3-8           449ns ± 1%   455ns ± 0%   +1.26%   (p=0.016 n=4+5)
Decompress/blockSize_1000/level_5-8           448ns ± 0%   458ns ± 1%   +2.03%   (p=0.016 n=4+5)
Decompress/blockSize_1000/level_10-8          444ns ± 0%   470ns ± 3%   +5.93%   (p=0.016 n=4+5)
Decompress/blockSize_10000/level_3-8          1.75µs ± 2%  1.74µs ± 1%  ~        (p=1.000 n=4+5)
Decompress/blockSize_10000/level_5-8          1.74µs ± 1%  1.74µs ± 0%  ~        (p=1.000 n=4+5)
Decompress/blockSize_10000/level_10-8         1.71µs ± 1%  1.73µs ± 1%  ~        (p=0.063 n=4+5)
Decompress/blockSize_100000/level_3-8         12.8µs ± 1%  12.5µs ± 0%  -2.27%   (p=0.029 n=4+4)
Decompress/blockSize_100000/level_5-8         14.7µs ± 1%  14.3µs ± 1%  -2.78%   (p=0.016 n=4+5)
Decompress/blockSize_100000/level_10-8        12.7µs ± 0%  12.5µs ± 0%  -2.06%   (p=0.016 n=4+5)
Decompress/blockSize_300000/level_3-8         39.0µs ± 1%  37.8µs ± 2%  -3.13%   (p=0.016 n=4+5)
Decompress/blockSize_300000/level_5-8         45.2µs ± 1%  43.8µs ± 0%  -3.23%   (p=0.016 n=4+5)
Decompress/blockSize_300000/level_10-8        37.0µs ± 1%  36.6µs ± 2%  ~        (p=0.111 n=4+5)
ReaderDict/blockSize_1/level_3-8              64.3ns ± 1%  58.1ns ± 3%  -9.62%   (p=0.016 n=4+5)
ReaderDict/blockSize_1/level_5-8              66.6ns ± 2%  57.6ns ± 1%  -13.54%  (p=0.016 n=4+5)
ReaderDict/blockSize_1/level_10-8             70.3ns ± 0%  63.9ns ± 1%  -9.08%   (p=0.016 n=4+5)
ReaderDict/blockSize_10/level_3-8             73.6ns ± 1%  68.7ns ± 2%  -6.66%   (p=0.016 n=4+5)
ReaderDict/blockSize_10/level_5-8             71.7ns ± 0%  66.3ns ± 1%  -7.54%   (p=0.016 n=4+5)
ReaderDict/blockSize_10/level_10-8            71.4ns ± 3%  64.8ns ± 1%  -9.22%   (p=0.016 n=4+5)
ReaderDict/blockSize_100/level_3-8            218ns ± 1%   211ns ± 1%   -3.15%   (p=0.016 n=4+5)
ReaderDict/blockSize_100/level_5-8            464ns ± 1%   462ns ± 1%   ~        (p=0.556 n=4+5)
ReaderDict/blockSize_100/level_10-8           459ns ± 1%   456ns ± 1%   ~        (p=0.190 n=4+5)
ReaderDict/blockSize_1000/level_3-8           1.62µs ± 0%  1.58µs ± 1%  -2.18%   (p=0.016 n=4+5)
ReaderDict/blockSize_1000/level_5-8           1.84µs ± 1%  1.78µs ± 1%  -3.29%   (p=0.016 n=4+5)
ReaderDict/blockSize_1000/level_10-8          1.73µs ± 1%  1.71µs ± 8%  ~        (p=0.190 n=4+5)
ReaderDict/blockSize_10000/level_3-8          14.9µs ± 1%  14.5µs ± 2%  -2.63%   (p=0.016 n=4+5)
ReaderDict/blockSize_10000/level_5-8          14.8µs ± 1%  14.4µs ± 2%  -2.36%   (p=0.016 n=4+5)
ReaderDict/blockSize_10000/level_10-8         13.7µs ± 0%  13.2µs ± 2%  -3.62%   (p=0.016 n=4+5)
ReaderDict/blockSize_100000/level_3-8         149µs ± 2%   138µs ± 1%   -6.94%   (p=0.016 n=4+5)
ReaderDict/blockSize_100000/level_5-8         159µs ± 2%   150µs ± 1%   -5.74%   (p=0.016 n=4+5)
ReaderDict/blockSize_100000/level_10-8        131µs ± 1%   123µs ± 1%   -5.78%   (p=0.016 n=4+5)
ReaderDict/blockSize_300000/level_3-8         511µs ± 1%   479µs ± 2%   -6.29%   (p=0.016 n=4+5)
ReaderDict/blockSize_300000/level_5-8         531µs ± 1%   494µs ± 1%   -6.93%   (p=0.016 n=4+5)
ReaderDict/blockSize_300000/level_10-8        415µs ± 1%   396µs ± 1%   -4.57%   (p=0.016 n=4+5)
Reader/blockSize_1/level_3-8                  63.6ns ± 1%  56.6ns ± 2%  -10.96%  (p=0.016 n=4+5)
Reader/blockSize_1/level_5-8                  64.2ns ± 1%  57.1ns ± 1%  -11.03%  (p=0.016 n=4+5)
Reader/blockSize_1/level_10-8                 66.1ns ± 4%  57.4ns ± 1%  -13.14%  (p=0.016 n=4+5)
Reader/blockSize_10/level_3-8                 83.4ns ± 9%  75.1ns ± 1%  -9.92%   (p=0.016 n=4+5)
Reader/blockSize_10/level_5-8                 81.4ns ± 1%  75.9ns ± 3%  -6.80%   (p=0.016 n=4+5)
Reader/blockSize_10/level_10-8                83.3ns ± 5%  76.2ns ± 1%  -8.57%   (p=0.016 n=4+5)
Reader/blockSize_100/level_3-8                496ns ± 1%   491ns ± 1%   ~        (p=0.190 n=4+5)
Reader/blockSize_100/level_5-8                496ns ± 1%   491ns ± 1%   ~        (p=0.111 n=4+5)
Reader/blockSize_100/level_10-8               490ns ± 1%   487ns ± 0%   -0.71%   (p=0.016 n=4+5)
Reader/blockSize_1000/level_3-8               1.87µs ± 0%  1.82µs ± 1%  -2.47%   (p=0.016 n=4+5)
Reader/blockSize_1000/level_5-8               1.87µs ± 0%  1.83µs ± 0%  -2.19%   (p=0.016 n=4+5)
Reader/blockSize_1000/level_10-8              1.85µs ± 0%  1.80µs ± 1%  -2.61%   (p=0.016 n=4+5)
Reader/blockSize_10000/level_3-8              13.9µs ± 1%  13.6µs ± 2%  ~        (p=0.190 n=4+5)
Reader/blockSize_10000/level_5-8              15.9µs ± 2%  15.5µs ± 2%  ~        (p=0.190 n=4+5)
Reader/blockSize_10000/level_10-8             13.8µs ± 1%  13.5µs ± 2%  ~        (p=0.111 n=4+5)
Reader/blockSize_100000/level_3-8             149µs ± 1%   141µs ± 3%   -5.23%   (p=0.016 n=4+5)
Reader/blockSize_100000/level_5-8             162µs ± 1%   153µs ± 2%   -5.70%   (p=0.016 n=4+5)
Reader/blockSize_100000/level_10-8            132µs ± 2%   125µs ± 1%   -5.22%   (p=0.016 n=4+5)
Reader/blockSize_300000/level_3-8             514µs ± 1%   480µs ± 1%   -6.62%   (p=0.016 n=4+5)
Reader/blockSize_300000/level_5-8             536µs ± 2%   502µs ± 1%   -6.26%   (p=0.016 n=4+5)
Reader/blockSize_300000/level_10-8            422µs ± 1%   403µs ± 4%   ~        (p=0.063 n=4+5)
StreamCompress/blockSize_1/level_3-8          6.36µs ± 2%  6.32µs ± 2%  ~        (p=0.365 n=4+5)
StreamCompress/blockSize_1/level_5-8          47.7µs ± 0%  48.1µs ± 1%  +0.87%   (p=0.032 n=4+5)
StreamCompress/blockSize_1/level_10-8         378µs ± 1%   380µs ± 1%   ~        (p=0.190 n=4+5)
StreamCompress/blockSize_10/level_3-8         6.19µs ± 0%  6.22µs ± 0%  ~        (p=0.063 n=4+5)
StreamCompress/blockSize_10/level_5-8         47.6µs ± 1%  48.4µs ± 1%  +1.67%   (p=0.016 n=4+5)
StreamCompress/blockSize_10/level_10-8        377µs ± 1%   379µs ± 0%   ~        (p=0.111 n=4+5)
StreamCompress/blockSize_100/level_3-8        5.66µs ± 2%  5.66µs ± 1%  ~        (p=1.000 n=4+5)
StreamCompress/blockSize_100/level_5-8        46.5µs ± 1%  46.3µs ± 1%  ~        (p=0.905 n=4+5)
StreamCompress/blockSize_100/level_10-8       379µs ± 1%   379µs ± 0%   ~        (p=1.000 n=4+5)
StreamCompress/blockSize_1000/level_3-8       5.39µs ± 2%  5.42µs ± 1%  ~        (p=0.413 n=4+5)
StreamCompress/blockSize_1000/level_5-8       36.1µs ± 3%  36.7µs ± 4%  ~        (p=0.556 n=4+5)
StreamCompress/blockSize_1000/level_10-8      389µs ± 1%   406µs ± 3%   +4.28%   (p=0.016 n=4+5)
StreamCompress/blockSize_10000/level_3-8      38.9µs ± 2%  37.3µs ± 1%  -4.09%   (p=0.016 n=4+5)
StreamCompress/blockSize_10000/level_5-8      112µs ± 2%   111µs ± 3%   ~        (p=0.413 n=4+5)
StreamCompress/blockSize_10000/level_10-8     707µs ± 1%   711µs ± 2%   ~        (p=0.413 n=4+5)
StreamCompress/blockSize_100000/level_3-8     535µs ± 1%   512µs ± 2%   -4.34%   (p=0.016 n=4+5)
StreamCompress/blockSize_100000/level_5-8     1.46ms ± 2%  1.41ms ± 2%  -3.44%   (p=0.016 n=4+5)
StreamCompress/blockSize_100000/level_10-8    5.04ms ± 2%  5.02ms ± 1%  ~        (p=0.905 n=4+5)
StreamCompress/blockSize_300000/level_3-8     1.88ms ± 1%  1.82ms ± 1%  -3.13%   (p=0.016 n=4+5)
StreamCompress/blockSize_300000/level_5-8     5.63ms ± 3%  5.54ms ± 4%  ~        (p=0.111 n=4+5)
StreamCompress/blockSize_300000/level_10-8    18.4ms ± 9%  17.8ms ± 1%  ~        (p=0.730 n=4+5)
StreamDecompress/blockSize_1/level_3-8        68.6ns ± 3%  57.4ns ± 1%  -16.40%  (p=0.016 n=4+5)
StreamDecompress/blockSize_1/level_5-8        70.9ns ± 5%  60.6ns ± 9%  -14.47%  (p=0.016 n=4+5)
StreamDecompress/blockSize_1/level_10-8       71.8ns ± 6%  61.3ns ±13%  -14.62%  (p=0.016 n=4+5)
StreamDecompress/blockSize_10/level_3-8       87.1ns ± 6%  81.7ns ±11%  ~        (p=0.286 n=4+5)
StreamDecompress/blockSize_10/level_5-8       88.9ns ± 6%  76.9ns ± 1%  -13.55%  (p=0.029 n=4+4)
StreamDecompress/blockSize_10/level_10-8      88.0ns ± 4%  80.0ns ± 6%  ~        (p=0.063 n=4+5)
StreamDecompress/blockSize_100/level_3-8      500ns ± 1%   489ns ± 0%   -2.22%   (p=0.016 n=4+5)
StreamDecompress/blockSize_100/level_5-8      493ns ± 0%   491ns ± 1%   ~        (p=0.302 n=4+5)
StreamDecompress/blockSize_100/level_10-8     487ns ± 1%   487ns ± 1%   ~        (p=1.000 n=4+5)
StreamDecompress/blockSize_1000/level_3-8     1.86µs ± 1%  1.82µs ± 0%  -2.14%   (p=0.029 n=4+4)
StreamDecompress/blockSize_1000/level_5-8     1.86µs ± 0%  1.84µs ± 1%  -1.17%   (p=0.032 n=4+5)
StreamDecompress/blockSize_1000/level_10-8    1.84µs ± 1%  1.83µs ± 1%  ~        (p=0.286 n=4+5)
StreamDecompress/blockSize_10000/level_3-8    13.6µs ± 2%  13.7µs ± 3%  ~        (p=0.794 n=4+5)
StreamDecompress/blockSize_10000/level_5-8    15.6µs ± 1%  15.1µs ± 1%  -2.70%   (p=0.016 n=4+5)
StreamDecompress/blockSize_10000/level_10-8   13.4µs ± 0%  13.2µs ± 1%  -1.41%   (p=0.016 n=4+5)
StreamDecompress/blockSize_100000/level_3-8   144µs ± 1%   141µs ± 1%   -2.29%   (p=0.016 n=4+5)
StreamDecompress/blockSize_100000/level_5-8   157µs ± 1%   153µs ± 2%   ~        (p=0.063 n=4+5)
StreamDecompress/blockSize_100000/level_10-8  128µs ± 1%   126µs ± 1%   -1.72%   (p=0.016 n=4+5)
StreamDecompress/blockSize_300000/level_3-8   492µs ± 0%   482µs ± 1%   -1.98%   (p=0.016 n=4+5)
StreamDecompress/blockSize_300000/level_5-8   510µs ± 0%   503µs ± 1%   -1.37%   (p=0.016 n=4+5)
StreamDecompress/blockSize_300000/level_10-8  404µs ± 0%   398µs ± 1%   -1.45%   (p=0.016 n=4+5)
WriterDict/blockSize_1/level_3-8              255ns ± 1%   264ns ± 1%   +3.72%   (p=0.016 n=4+5)
WriterDict/blockSize_1/level_5-8              297ns ± 1%   309ns ± 2%   +4.03%   (p=0.016 n=4+5)
WriterDict/blockSize_1/level_10-8             302ns ± 1%   315ns ± 1%   +4.27%   (p=0.016 n=4+5)
WriterDict/blockSize_10/level_3-8             180ns ± 1%   191ns ± 0%   +5.99%   (p=0.016 n=4+5)
WriterDict/blockSize_10/level_5-8             326ns ± 1%   335ns ± 1%   +2.66%   (p=0.016 n=4+5)
WriterDict/blockSize_10/level_10-8            312ns ± 2%   317ns ± 1%   ~        (p=0.190 n=4+5)
WriterDict/blockSize_100/level_3-8            502ns ± 2%   511ns ± 1%   ~        (p=0.111 n=4+5)
WriterDict/blockSize_100/level_5-8            1.46µs ± 1%  1.44µs ± 1%  -1.16%   (p=0.032 n=4+5)
WriterDict/blockSize_100/level_10-8           2.59µs ± 1%  2.53µs ± 1%  -2.28%   (p=0.016 n=4+5)
WriterDict/blockSize_1000/level_3-8           4.59µs ± 1%  4.45µs ± 2%  -3.02%   (p=0.016 n=4+5)
WriterDict/blockSize_1000/level_5-8           11.6µs ± 2%  11.0µs ± 1%  -4.56%   (p=0.016 n=4+5)
WriterDict/blockSize_1000/level_10-8          27.3µs ± 1%  26.1µs ± 1%  -4.55%   (p=0.016 n=4+5)
WriterDict/blockSize_10000/level_3-8          39.8µs ± 0%  38.1µs ± 1%  -4.37%   (p=0.016 n=4+5)
WriterDict/blockSize_10000/level_5-8          116µs ± 2%   110µs ± 1%   -4.87%   (p=0.016 n=4+5)
WriterDict/blockSize_10000/level_10-8         244µs ± 2%   232µs ± 2%   -4.72%   (p=0.016 n=4+5)
WriterDict/blockSize_100000/level_3-8         409µs ± 2%   393µs ± 0%   -3.70%   (p=0.016 n=4+5)
WriterDict/blockSize_100000/level_5-8         1.06ms ± 2%  1.02ms ± 1%  -3.63%   (p=0.016 n=4+5)
WriterDict/blockSize_100000/level_10-8        2.26ms ± 3%  2.17ms ± 2%  -3.97%   (p=0.032 n=4+5)
WriterDict/blockSize_300000/level_3-8         1.26ms ± 1%  1.23ms ± 2%  -2.87%   (p=0.016 n=4+5)
WriterDict/blockSize_300000/level_5-8         3.45ms ± 4%  3.26ms ± 4%  -5.62%   (p=0.016 n=4+5)
WriterDict/blockSize_300000/level_10-8        7.34ms ± 8%  6.88ms ± 8%  ~        (p=0.190 n=4+5)
Writer/blockSize_1/level_3-8                  6.34µs ± 1%  6.30µs ± 0%  ~        (p=0.063 n=4+5)
Writer/blockSize_1/level_5-8                  48.8µs ± 2%  48.4µs ± 2%  ~        (p=0.730 n=4+5)
Writer/blockSize_1/level_10-8                 395µs ± 3%   386µs ± 1%   ~        (p=0.111 n=4+5)
Writer/blockSize_10/level_3-8                 6.23µs ± 1%  6.28µs ± 3%  ~        (p=0.730 n=4+5)
Writer/blockSize_10/level_5-8                 48.2µs ± 3%  48.9µs ± 3%  ~        (p=0.730 n=4+5)
Writer/blockSize_10/level_10-8                386µs ± 1%   387µs ± 1%   ~        (p=0.730 n=4+5)
Writer/blockSize_100/level_3-8                5.68µs ± 2%  5.63µs ± 2%  ~        (p=0.286 n=4+5)
Writer/blockSize_100/level_5-8                46.3µs ± 2%  46.5µs ± 0%  ~        (p=0.286 n=4+5)
Writer/blockSize_100/level_10-8               385µs ± 1%   383µs ± 1%   ~        (p=0.413 n=4+5)
Writer/blockSize_1000/level_3-8               5.45µs ± 3%  5.36µs ± 2%  ~        (p=0.190 n=4+5)
Writer/blockSize_1000/level_5-8               35.4µs ± 1%  37.0µs ± 3%  +4.53%   (p=0.016 n=4+5)
Writer/blockSize_1000/level_10-8              397µs ± 1%   398µs ± 1%   ~        (p=0.730 n=4+5)
Writer/blockSize_10000/level_3-8              39.3µs ± 2%  38.4µs ± 2%  ~        (p=0.063 n=4+5)
Writer/blockSize_10000/level_5-8              115µs ± 2%   112µs ± 4%   ~        (p=0.190 n=4+5)
Writer/blockSize_10000/level_10-8             718µs ± 0%   716µs ± 2%   ~        (p=0.286 n=4+5)
Writer/blockSize_100000/level_3-8             538µs ± 3%   526µs ± 1%   ~        (p=0.286 n=4+5)
Writer/blockSize_100000/level_5-8             1.46ms ± 2%  1.45ms ± 4%  ~        (p=0.413 n=4+5)
Writer/blockSize_100000/level_10-8            5.03ms ± 2%  5.11ms ± 3%  ~        (p=0.286 n=4+5)
Writer/blockSize_300000/level_3-8             1.92ms ± 2%  1.88ms ± 1%  ~        (p=0.111 n=4+5)
Writer/blockSize_300000/level_5-8             5.55ms ± 1%  5.50ms ± 2%  ~        (p=0.413 n=4+5)
Writer/blockSize_300000/level_10-8            18.2ms ± 2%  18.0ms ± 2%  ~        (p=0.111 n=4+5)
WriterResetAlloc-8                            142ns ± 6%   163ns ± 0%   +15.02%  (p=0.029 n=4+4)
```

Throughput

```
name                                          old speed      new speed      delta
DecompressDict/blockSize_1/level_3-8          46.9MB/s ± 2%  39.2MB/s ± 2%  -16.34%  (p=0.016 n=4+5)
DecompressDict/blockSize_1/level_5-8          47.6MB/s ± 2%  39.7MB/s ± 2%  -16.70%  (p=0.016 n=4+5)
DecompressDict/blockSize_1/level_10-8         47.0MB/s ± 5%  39.4MB/s ± 4%  -16.14%  (p=0.016 n=4+5)
DecompressDict/blockSize_10/level_3-8         480MB/s ± 4%   397MB/s ± 1%   -17.34%  (p=0.016 n=4+5)
DecompressDict/blockSize_10/level_5-8         479MB/s ± 3%   387MB/s ± 2%   -19.13%  (p=0.016 n=4+5)
DecompressDict/blockSize_10/level_10-8        375MB/s ± 2%   321MB/s ± 1%   -14.48%  (p=0.016 n=4+5)
DecompressDict/blockSize_100/level_3-8        3.75GB/s ± 0%  3.11GB/s ± 1%  -16.93%  (p=0.016 n=4+5)
DecompressDict/blockSize_100/level_5-8        4.05GB/s ± 0%  3.27GB/s ± 1%  -19.40%  (p=0.016 n=4+5)
DecompressDict/blockSize_100/level_10-8       4.26GB/s ± 0%  3.48GB/s ± 1%  -18.40%  (p=0.016 n=4+5)
DecompressDict/blockSize_1000/level_3-8       6.06GB/s ± 0%  5.72GB/s ± 0%  -5.66%   (p=0.016 n=4+5)
DecompressDict/blockSize_1000/level_5-8       2.41GB/s ± 1%  2.35GB/s ± 1%  -2.27%   (p=0.016 n=4+5)
DecompressDict/blockSize_1000/level_10-8      2.36GB/s ± 4%  2.37GB/s ± 1%  ~        (p=1.000 n=4+5)
DecompressDict/blockSize_10000/level_3-8      6.61GB/s ± 2%  6.62GB/s ± 1%  ~        (p=0.905 n=4+5)
DecompressDict/blockSize_10000/level_5-8      5.83GB/s ± 0%  5.80GB/s ± 1%  -0.63%   (p=0.032 n=4+5)
DecompressDict/blockSize_10000/level_10-8     6.21GB/s ± 0%  6.21GB/s ± 1%  ~        (p=0.905 n=4+5)
DecompressDict/blockSize_100000/level_3-8     7.27GB/s ± 1%  7.31GB/s ± 1%  ~        (p=0.286 n=4+5)
DecompressDict/blockSize_100000/level_5-8     7.39GB/s ± 1%  7.36GB/s ± 1%  ~        (p=1.000 n=4+5)
DecompressDict/blockSize_100000/level_10-8    8.01GB/s ± 1%  8.00GB/s ± 2%  ~        (p=0.730 n=4+5)
DecompressDict/blockSize_300000/level_3-8     7.94GB/s ± 1%  7.98GB/s ± 1%  ~        (p=0.413 n=4+5)
DecompressDict/blockSize_300000/level_5-8     6.79GB/s ± 1%  6.86GB/s ± 0%  ~        (p=0.111 n=4+5)
DecompressDict/blockSize_300000/level_10-8    8.24GB/s ± 1%  8.31GB/s ± 1%  ~        (p=0.190 n=4+5)
CompressDict/blockSize_1/level_3-8            25.7MB/s ± 0%  21.7MB/s ± 1%  -15.44%  (p=0.016 n=4+5)
CompressDict/blockSize_1/level_5-8            25.7MB/s ± 0%  21.8MB/s ± 3%  -15.21%  (p=0.016 n=4+5)
CompressDict/blockSize_1/level_10-8           25.3MB/s ± 1%  21.7MB/s ± 0%  -14.39%  (p=0.016 n=4+5)
CompressDict/blockSize_10/level_3-8           63.0MB/s ± 1%  59.8MB/s ± 2%  -5.09%   (p=0.016 n=4+5)
CompressDict/blockSize_10/level_5-8           59.0MB/s ± 1%  56.3MB/s ± 2%  -4.62%   (p=0.016 n=4+5)
CompressDict/blockSize_10/level_10-8          56.3MB/s ± 1%  54.0MB/s ± 1%  -4.11%   (p=0.016 n=4+5)
CompressDict/blockSize_100/level_3-8          1.24GB/s ± 0%  1.12GB/s ± 1%  -9.02%   (p=0.016 n=4+5)
CompressDict/blockSize_100/level_5-8          503MB/s ± 1%   486MB/s ± 1%   -3.38%   (p=0.016 n=4+5)
CompressDict/blockSize_100/level_10-8         538MB/s ± 1%   521MB/s ± 1%   -3.29%   (p=0.016 n=4+5)
CompressDict/blockSize_1000/level_3-8         2.50GB/s ± 1%  2.44GB/s ± 1%  -2.47%   (p=0.016 n=4+5)
CompressDict/blockSize_1000/level_5-8         791MB/s ± 1%   784MB/s ± 0%   -0.90%   (p=0.016 n=4+5)
CompressDict/blockSize_1000/level_10-8        429MB/s ± 2%   426MB/s ± 2%   ~        (p=0.905 n=4+5)
CompressDict/blockSize_10000/level_3-8        2.45GB/s ± 2%  2.44GB/s ± 1%  ~        (p=0.905 n=4+5)
CompressDict/blockSize_10000/level_5-8        989MB/s ± 0%   998MB/s ± 1%   +0.89%   (p=0.016 n=4+5)
CompressDict/blockSize_10000/level_10-8       410MB/s ± 1%   412MB/s ± 0%   ~        (p=0.190 n=4+5)
CompressDict/blockSize_100000/level_3-8       2.72GB/s ± 1%  2.73GB/s ± 1%  ~        (p=0.413 n=4+5)
CompressDict/blockSize_100000/level_5-8       1.06GB/s ± 1%  1.07GB/s ± 1%  +1.13%   (p=0.032 n=4+5)
CompressDict/blockSize_100000/level_10-8      628MB/s ± 1%   629MB/s ± 1%   ~        (p=0.730 n=4+5)
CompressDict/blockSize_300000/level_3-8       2.01GB/s ± 1%  2.01GB/s ± 1%  ~        (p=0.730 n=4+5)
CompressDict/blockSize_300000/level_5-8       867MB/s ± 1%   871MB/s ± 1%   ~        (p=0.413 n=4+5)
CompressDict/blockSize_300000/level_10-8      249MB/s ± 1%   253MB/s ± 1%   +1.47%   (p=0.016 n=4+5)
Compress/blockSize_1/level_3-8                40.0MB/s ± 0%  33.5MB/s ± 0%  -16.04%  (p=0.016 n=4+5)
Compress/blockSize_1/level_5-8                39.9MB/s ± 0%  33.5MB/s ± 0%  -15.92%  (p=0.016 n=4+5)
Compress/blockSize_1/level_10-8               39.3MB/s ± 0%  33.0MB/s ± 1%  -16.13%  (p=0.016 n=4+5)
Compress/blockSize_10/level_3-8               238MB/s ± 0%   208MB/s ± 0%   -12.53%  (p=0.016 n=4+5)
Compress/blockSize_10/level_5-8               232MB/s ± 1%   205MB/s ± 1%   -11.72%  (p=0.016 n=4+5)
Compress/blockSize_10/level_10-8              224MB/s ± 2%   199MB/s ± 1%   -10.91%  (p=0.016 n=4+5)
Compress/blockSize_100/level_3-8              582MB/s ± 1%   561MB/s ± 1%   -3.56%   (p=0.016 n=4+5)
Compress/blockSize_100/level_5-8              424MB/s ± 0%   413MB/s ± 1%   -2.61%   (p=0.016 n=4+5)
Compress/blockSize_100/level_10-8             380MB/s ± 1%   370MB/s ± 1%   -2.74%   (p=0.016 n=4+5)
Compress/blockSize_1000/level_3-8             1.22GB/s ± 1%  1.22GB/s ± 1%  ~        (p=0.730 n=4+5)
Compress/blockSize_1000/level_5-8             820MB/s ± 1%   833MB/s ± 1%   +1.68%   (p=0.016 n=4+5)
Compress/blockSize_1000/level_10-8            581MB/s ± 1%   576MB/s ± 1%   ~        (p=0.063 n=4+5)
Compress/blockSize_10000/level_3-8            2.28GB/s ± 1%  2.27GB/s ± 1%  ~        (p=0.413 n=4+5)
Compress/blockSize_10000/level_5-8            1.05GB/s ± 1%  1.10GB/s ± 1%  +4.91%   (p=0.016 n=4+5)
Compress/blockSize_10000/level_10-8           253MB/s ± 3%   255MB/s ± 1%   ~        (p=0.730 n=4+5)
Compress/blockSize_100000/level_3-8           2.81GB/s ± 1%  2.84GB/s ± 3%  ~        (p=0.556 n=4+5)
Compress/blockSize_100000/level_5-8           1.12GB/s ± 1%  1.16GB/s ± 1%  +3.83%   (p=0.016 n=4+5)
Compress/blockSize_100000/level_10-8          217MB/s ± 0%   218MB/s ± 1%   ~        (p=0.190 n=4+5)
Compress/blockSize_300000/level_3-8           2.43GB/s ± 1%  2.46GB/s ± 1%  +1.11%   (p=0.032 n=4+5)
Compress/blockSize_300000/level_5-8           940MB/s ± 1%   932MB/s ± 3%   ~        (p=0.556 n=4+5)
Compress/blockSize_300000/level_10-8          278MB/s ± 1%   276MB/s ± 3%   ~        (p=0.730 n=4+5)
Decompress/blockSize_1/level_3-8              48.5MB/s ± 3%  42.1MB/s ± 3%  -13.20%  (p=0.016 n=4+5)
Decompress/blockSize_1/level_5-8              48.3MB/s ± 2%  42.3MB/s ± 4%  -12.48%  (p=0.016 n=4+5)
Decompress/blockSize_1/level_10-8             48.8MB/s ± 1%  43.1MB/s ± 4%  -11.52%  (p=0.016 n=4+5)
Decompress/blockSize_10/level_3-8             473MB/s ± 3%   428MB/s ± 3%   -9.56%   (p=0.016 n=4+5)
Decompress/blockSize_10/level_5-8             471MB/s ± 2%   421MB/s ± 4%   -10.64%  (p=0.016 n=4+5)
Decompress/blockSize_10/level_10-8            484MB/s ± 3%   424MB/s ± 2%   -12.44%  (p=0.016 n=4+5)
Decompress/blockSize_100/level_3-8            2.84GB/s ± 0%  2.50GB/s ± 0%  -11.80%  (p=0.016 n=4+5)
Decompress/blockSize_100/level_5-8            2.81GB/s ± 0%  2.50GB/s ± 1%  -11.14%  (p=0.016 n=4+5)
Decompress/blockSize_100/level_10-8           2.79GB/s ± 1%  2.49GB/s ± 1%  -10.93%  (p=0.016 n=4+5)
Decompress/blockSize_1000/level_3-8           2.23GB/s ± 1%  2.20GB/s ± 0%  -1.25%   (p=0.016 n=4+5)
Decompress/blockSize_1000/level_5-8           2.23GB/s ± 0%  2.19GB/s ± 1%  -1.98%   (p=0.016 n=4+5)
Decompress/blockSize_1000/level_10-8          2.25GB/s ± 0%  2.13GB/s ± 3%  -5.54%   (p=0.016 n=4+5)
Decompress/blockSize_10000/level_3-8          5.73GB/s ± 2%  5.74GB/s ± 1%  ~        (p=1.000 n=4+5)
Decompress/blockSize_10000/level_5-8          5.75GB/s ± 1%  5.74GB/s ± 0%  ~        (p=1.000 n=4+5)
Decompress/blockSize_10000/level_10-8         5.84GB/s ± 1%  5.79GB/s ± 1%  ~        (p=0.063 n=4+5)
Decompress/blockSize_100000/level_3-8         7.80GB/s ± 1%  7.98GB/s ± 0%  +2.32%   (p=0.029 n=4+4)
Decompress/blockSize_100000/level_5-8         6.81GB/s ± 1%  7.00GB/s ± 1%  +2.85%   (p=0.016 n=4+5)
Decompress/blockSize_100000/level_10-8        7.85GB/s ± 0%  8.01GB/s ± 0%  +2.10%   (p=0.016 n=4+5)
Decompress/blockSize_300000/level_3-8         7.69GB/s ± 1%  7.94GB/s ± 2%  +3.25%   (p=0.016 n=4+5)
Decompress/blockSize_300000/level_5-8         6.63GB/s ± 1%  6.86GB/s ± 0%  +3.33%   (p=0.016 n=4+5)
Decompress/blockSize_300000/level_10-8        8.11GB/s ± 1%  8.19GB/s ± 2%  ~        (p=0.111 n=4+5)
ReaderDict/blockSize_1/level_3-8              155MB/s ± 1%   172MB/s ± 2%   +10.67%  (p=0.016 n=4+5)
ReaderDict/blockSize_1/level_5-8              150MB/s ± 2%   174MB/s ± 1%   +15.66%  (p=0.016 n=4+5)
ReaderDict/blockSize_1/level_10-8             142MB/s ± 0%   156MB/s ± 1%   +9.99%   (p=0.016 n=4+5)
ReaderDict/blockSize_10/level_3-8             1.36GB/s ± 1%  1.46GB/s ± 2%  +7.14%   (p=0.016 n=4+5)
ReaderDict/blockSize_10/level_5-8             1.39GB/s ± 0%  1.51GB/s ± 1%  +8.16%   (p=0.016 n=4+5)
ReaderDict/blockSize_10/level_10-8            1.40GB/s ± 3%  1.54GB/s ± 1%  +10.13%  (p=0.016 n=4+5)
ReaderDict/blockSize_100/level_3-8            4.58GB/s ± 1%  4.73GB/s ± 1%  +3.26%   (p=0.016 n=4+5)
ReaderDict/blockSize_100/level_5-8            2.16GB/s ± 1%  2.16GB/s ± 1%  ~        (p=0.556 n=4+5)
ReaderDict/blockSize_100/level_10-8           2.18GB/s ± 1%  2.19GB/s ± 1%  ~        (p=0.190 n=4+5)
ReaderDict/blockSize_1000/level_3-8           6.18GB/s ± 0%  6.31GB/s ± 1%  +2.23%   (p=0.016 n=4+5)
ReaderDict/blockSize_1000/level_5-8           5.42GB/s ± 1%  5.61GB/s ± 1%  +3.40%   (p=0.016 n=4+5)
ReaderDict/blockSize_1000/level_10-8          5.79GB/s ± 1%  5.85GB/s ± 7%  ~        (p=0.190 n=4+5)
ReaderDict/blockSize_10000/level_3-8          6.73GB/s ± 1%  6.91GB/s ± 2%  +2.71%   (p=0.016 n=4+5)
ReaderDict/blockSize_10000/level_5-8          6.77GB/s ± 1%  6.93GB/s ± 2%  +2.42%   (p=0.016 n=4+5)
ReaderDict/blockSize_10000/level_10-8         7.29GB/s ± 0%  7.56GB/s ± 2%  +3.76%   (p=0.016 n=4+5)
ReaderDict/blockSize_100000/level_3-8         6.73GB/s ± 2%  7.23GB/s ± 1%  +7.45%   (p=0.016 n=4+5)
ReaderDict/blockSize_100000/level_5-8         6.30GB/s ± 2%  6.69GB/s ± 1%  +6.08%   (p=0.016 n=4+5)
ReaderDict/blockSize_100000/level_10-8        7.66GB/s ± 1%  8.13GB/s ± 1%  +6.14%   (p=0.016 n=4+5)
ReaderDict/blockSize_300000/level_3-8         5.87GB/s ± 1%  6.26GB/s ± 2%  +6.72%   (p=0.016 n=4+5)
ReaderDict/blockSize_300000/level_5-8         5.65GB/s ± 1%  6.07GB/s ± 1%  +7.45%   (p=0.016 n=4+5)
ReaderDict/blockSize_300000/level_10-8        7.22GB/s ± 1%  7.57GB/s ± 1%  +4.79%   (p=0.016 n=4+5)
Reader/blockSize_1/level_3-8                  157MB/s ± 1%   177MB/s ± 2%   +12.33%  (p=0.016 n=4+5)
Reader/blockSize_1/level_5-8                  156MB/s ± 1%   175MB/s ± 1%   +12.39%  (p=0.016 n=4+5)
Reader/blockSize_1/level_10-8                 151MB/s ± 4%   174MB/s ± 1%   +15.04%  (p=0.016 n=4+5)
Reader/blockSize_10/level_3-8                 1.20GB/s ± 9%  1.33GB/s ± 1%  +10.71%  (p=0.016 n=4+5)
Reader/blockSize_10/level_5-8                 1.23GB/s ± 1%  1.32GB/s ± 3%  +7.33%   (p=0.016 n=4+5)
Reader/blockSize_10/level_10-8                1.20GB/s ± 4%  1.31GB/s ± 1%  +9.30%   (p=0.016 n=4+5)
Reader/blockSize_100/level_3-8                2.02GB/s ± 1%  2.03GB/s ± 1%  ~        (p=0.190 n=4+5)
Reader/blockSize_100/level_5-8                2.02GB/s ± 1%  2.04GB/s ± 1%  ~        (p=0.111 n=4+5)
Reader/blockSize_100/level_10-8               2.04GB/s ± 1%  2.05GB/s ± 0%  +0.71%   (p=0.016 n=4+5)
Reader/blockSize_1000/level_3-8               5.36GB/s ± 0%  5.49GB/s ± 1%  +2.54%   (p=0.016 n=4+5)
Reader/blockSize_1000/level_5-8               5.35GB/s ± 0%  5.47GB/s ± 0%  +2.25%   (p=0.016 n=4+5)
Reader/blockSize_1000/level_10-8              5.41GB/s ± 0%  5.56GB/s ± 1%  +2.70%   (p=0.016 n=4+5)
Reader/blockSize_10000/level_3-8              7.22GB/s ± 1%  7.35GB/s ± 2%  ~        (p=0.190 n=4+5)
Reader/blockSize_10000/level_5-8              6.30GB/s ± 2%  6.44GB/s ± 2%  ~        (p=0.190 n=4+5)
Reader/blockSize_10000/level_10-8             7.27GB/s ± 1%  7.40GB/s ± 2%  ~        (p=0.111 n=4+5)
Reader/blockSize_100000/level_3-8             6.73GB/s ± 1%  7.10GB/s ± 3%  +5.54%   (p=0.016 n=4+5)
Reader/blockSize_100000/level_5-8             6.18GB/s ± 1%  6.55GB/s ± 2%  +6.05%   (p=0.016 n=4+5)
Reader/blockSize_100000/level_10-8            7.56GB/s ± 2%  7.98GB/s ± 1%  +5.49%   (p=0.016 n=4+5)
Reader/blockSize_300000/level_3-8             5.84GB/s ± 1%  6.25GB/s ± 1%  +7.08%   (p=0.016 n=4+5)
Reader/blockSize_300000/level_5-8             5.60GB/s ± 2%  5.97GB/s ± 1%  +6.67%   (p=0.016 n=4+5)
Reader/blockSize_300000/level_10-8            7.12GB/s ± 1%  7.45GB/s ± 4%  ~        (p=0.063 n=4+5)
StreamCompress/blockSize_1/level_3-8          1.57MB/s ± 1%  1.59MB/s ± 0%  ~        (p=0.429 n=4+4)
StreamCompress/blockSize_1/level_5-8          210kB/s ± 0%   210kB/s ± 0%   ~        (all equal)
StreamCompress/blockSize_1/level_10-8         30.0kB/s ± 0%  30.0kB/s ± 0%  ~        (all equal)
StreamCompress/blockSize_10/level_3-8         16.2MB/s ± 0%  16.1MB/s ± 0%  -0.58%   (p=0.048 n=4+5)
StreamCompress/blockSize_10/level_5-8         2.10MB/s ± 0%  2.07MB/s ± 1%  -1.62%   (p=0.016 n=4+5)
StreamCompress/blockSize_10/level_10-8        268kB/s ± 3%   260kB/s ± 0%   ~        (p=0.238 n=4+5)
StreamCompress/blockSize_100/level_3-8        177MB/s ± 2%   177MB/s ± 1%   ~        (p=1.000 n=4+5)
StreamCompress/blockSize_100/level_5-8        21.5MB/s ± 2%  21.6MB/s ± 1%  ~        (p=0.778 n=4+5)
StreamCompress/blockSize_100/level_10-8       2.64MB/s ± 1%  2.64MB/s ± 1%  ~        (p=1.000 n=4+5)
StreamCompress/blockSize_1000/level_3-8       1.85GB/s ± 2%  1.85GB/s ± 1%  ~        (p=0.413 n=4+5)
StreamCompress/blockSize_1000/level_5-8       277MB/s ± 3%   273MB/s ± 4%   ~        (p=0.556 n=4+5)
StreamCompress/blockSize_1000/level_10-8      25.7MB/s ± 1%  24.7MB/s ± 3%  -4.07%   (p=0.016 n=4+5)
StreamCompress/blockSize_10000/level_3-8      2.57GB/s ± 2%  2.68GB/s ± 1%  +4.25%   (p=0.016 n=4+5)
StreamCompress/blockSize_10000/level_5-8      892MB/s ± 2%   902MB/s ± 3%   ~        (p=0.413 n=4+5)
StreamCompress/blockSize_10000/level_10-8     141MB/s ± 1%   141MB/s ± 2%   ~        (p=0.413 n=4+5)
StreamCompress/blockSize_100000/level_3-8     1.87GB/s ± 1%  1.95GB/s ± 2%  +4.53%   (p=0.016 n=4+5)
StreamCompress/blockSize_100000/level_5-8     685MB/s ± 2%   710MB/s ± 2%   +3.56%   (p=0.016 n=4+5)
StreamCompress/blockSize_100000/level_10-8    199MB/s ± 2%   199MB/s ± 1%   ~        (p=0.905 n=4+5)
StreamCompress/blockSize_300000/level_3-8     1.59GB/s ± 1%  1.65GB/s ± 1%  +3.24%   (p=0.016 n=4+5)
StreamCompress/blockSize_300000/level_5-8     533MB/s ± 3%   542MB/s ± 3%   ~        (p=0.111 n=4+5)
StreamCompress/blockSize_300000/level_10-8    164MB/s ± 9%   169MB/s ± 1%   ~        (p=0.730 n=4+5)
StreamDecompress/blockSize_1/level_3-8        146MB/s ± 3%   174MB/s ± 1%   +19.58%  (p=0.016 n=4+5)
StreamDecompress/blockSize_1/level_5-8        141MB/s ± 5%   165MB/s ± 8%   +17.09%  (p=0.016 n=4+5)
StreamDecompress/blockSize_1/level_10-8       140MB/s ± 6%   164MB/s ±12%   +17.43%  (p=0.016 n=4+5)
StreamDecompress/blockSize_10/level_3-8       1.15GB/s ± 5%  1.23GB/s ±10%  ~        (p=0.286 n=4+5)
StreamDecompress/blockSize_10/level_5-8       1.13GB/s ± 6%  1.30GB/s ± 1%  +15.44%  (p=0.029 n=4+4)
StreamDecompress/blockSize_10/level_10-8      1.14GB/s ± 4%  1.25GB/s ± 6%  ~        (p=0.063 n=4+5)
StreamDecompress/blockSize_100/level_3-8      2.00GB/s ± 1%  2.04GB/s ± 0%  +2.26%   (p=0.016 n=4+5)
StreamDecompress/blockSize_100/level_5-8      2.03GB/s ± 0%  2.04GB/s ± 1%  ~        (p=0.413 n=4+5)
StreamDecompress/blockSize_100/level_10-8     2.05GB/s ± 1%  2.05GB/s ± 1%  ~        (p=1.000 n=4+5)
StreamDecompress/blockSize_1000/level_3-8     5.38GB/s ± 1%  5.50GB/s ± 0%  +2.19%   (p=0.029 n=4+4)
StreamDecompress/blockSize_1000/level_5-8     5.38GB/s ± 0%  5.44GB/s ± 1%  +1.20%   (p=0.032 n=4+5)
StreamDecompress/blockSize_1000/level_10-8    5.44GB/s ± 1%  5.47GB/s ± 1%  ~        (p=0.286 n=4+5)
StreamDecompress/blockSize_10000/level_3-8    7.34GB/s ± 2%  7.32GB/s ± 3%  ~        (p=0.730 n=4+5)
StreamDecompress/blockSize_10000/level_5-8    6.43GB/s ± 1%  6.61GB/s ± 1%  +2.78%   (p=0.016 n=4+5)
StreamDecompress/blockSize_10000/level_10-8   7.48GB/s ± 0%  7.58GB/s ± 1%  +1.44%   (p=0.016 n=4+5)
StreamDecompress/blockSize_100000/level_3-8   6.95GB/s ± 1%  7.11GB/s ± 1%  +2.35%   (p=0.016 n=4+5)
StreamDecompress/blockSize_100000/level_5-8   6.38GB/s ± 1%  6.52GB/s ± 2%  ~        (p=0.063 n=4+5)
StreamDecompress/blockSize_100000/level_10-8  7.80GB/s ± 1%  7.94GB/s ± 1%  +1.75%   (p=0.016 n=4+5)
StreamDecompress/blockSize_300000/level_3-8   6.09GB/s ± 0%  6.22GB/s ± 1%  +2.02%   (p=0.016 n=4+5)
StreamDecompress/blockSize_300000/level_5-8   5.88GB/s ± 0%  5.96GB/s ± 1%  +1.39%   (p=0.016 n=4+5)
StreamDecompress/blockSize_300000/level_10-8  7.42GB/s ± 0%  7.53GB/s ± 1%  +1.47%   (p=0.016 n=4+5)
WriterDict/blockSize_1/level_3-8              39.2MB/s ± 1%  37.8MB/s ± 1%  -3.60%   (p=0.016 n=4+5)
WriterDict/blockSize_1/level_5-8              33.7MB/s ± 1%  32.4MB/s ± 2%  -3.87%   (p=0.016 n=4+5)
WriterDict/blockSize_1/level_10-8             33.1MB/s ± 1%  31.7MB/s ± 1%  -4.09%   (p=0.016 n=4+5)
WriterDict/blockSize_10/level_3-8             555MB/s ± 1%   524MB/s ± 0%   -5.64%   (p=0.016 n=4+5)
WriterDict/blockSize_10/level_5-8             307MB/s ± 1%   299MB/s ± 1%   -2.59%   (p=0.016 n=4+5)
WriterDict/blockSize_10/level_10-8            320MB/s ± 2%   315MB/s ± 1%   ~        (p=0.190 n=4+5)
WriterDict/blockSize_100/level_3-8            1.99GB/s ± 2%  1.96GB/s ± 1%  ~        (p=0.111 n=4+5)
WriterDict/blockSize_100/level_5-8            685MB/s ± 1%   693MB/s ± 1%   +1.19%   (p=0.032 n=4+5)
WriterDict/blockSize_100/level_10-8           386MB/s ± 1%   395MB/s ± 1%   +2.34%   (p=0.016 n=4+5)
WriterDict/blockSize_1000/level_3-8           2.18GB/s ± 1%  2.25GB/s ± 2%  +3.12%   (p=0.016 n=4+5)
WriterDict/blockSize_1000/level_5-8           864MB/s ± 2%   905MB/s ± 1%   +4.76%   (p=0.016 n=4+5)
WriterDict/blockSize_1000/level_10-8          366MB/s ± 1%   383MB/s ± 1%   +4.77%   (p=0.016 n=4+5)
WriterDict/blockSize_10000/level_3-8          2.51GB/s ± 0%  2.63GB/s ± 1%  +4.58%   (p=0.016 n=4+5)
WriterDict/blockSize_10000/level_5-8          864MB/s ± 2%   908MB/s ± 1%   +5.12%   (p=0.016 n=4+5)
WriterDict/blockSize_10000/level_10-8         410MB/s ± 2%   430MB/s ± 2%   +4.95%   (p=0.016 n=4+5)
WriterDict/blockSize_100000/level_3-8         2.45GB/s ± 2%  2.54GB/s ± 0%  +3.84%   (p=0.016 n=4+5)
WriterDict/blockSize_100000/level_5-8         946MB/s ± 2%   982MB/s ± 1%   +3.76%   (p=0.016 n=4+5)
WriterDict/blockSize_100000/level_10-8        443MB/s ± 3%   462MB/s ± 2%   +4.10%   (p=0.032 n=4+5)
WriterDict/blockSize_300000/level_3-8         2.38GB/s ± 1%  2.45GB/s ± 2%  +2.96%   (p=0.016 n=4+5)
WriterDict/blockSize_300000/level_5-8         870MB/s ± 3%   922MB/s ± 3%   +5.95%   (p=0.016 n=4+5)
WriterDict/blockSize_300000/level_10-8        410MB/s ± 7%   437MB/s ± 8%   ~        (p=0.190 n=4+5)
Writer/blockSize_1/level_3-8                  1.58MB/s ± 1%  1.59MB/s ± 0%  ~        (p=0.143 n=4+4)
Writer/blockSize_1/level_5-8                  202kB/s ± 4%   210kB/s ± 0%   ~        (p=0.143 n=4+4)
Writer/blockSize_1/level_10-8                 27.5kB/s ±27%  30.0kB/s ± 0%  ~        (p=0.889 n=4+5)
Writer/blockSize_10/level_3-8                 16.0MB/s ± 1%  15.9MB/s ± 3%  ~        (p=0.730 n=4+5)
Writer/blockSize_10/level_5-8                 2.08MB/s ± 3%  2.04MB/s ± 3%  ~        (p=0.730 n=4+5)
Writer/blockSize_10/level_10-8                260kB/s ± 0%   260kB/s ± 0%   ~        (all equal)
Writer/blockSize_100/level_3-8                176MB/s ± 2%   178MB/s ± 2%   ~        (p=0.286 n=4+5)
Writer/blockSize_100/level_5-8                21.6MB/s ± 2%  21.5MB/s ± 0%  ~        (p=0.254 n=4+5)
Writer/blockSize_100/level_10-8               2.60MB/s ± 1%  2.61MB/s ± 1%  ~        (p=0.603 n=4+5)
Writer/blockSize_1000/level_3-8               1.83GB/s ± 3%  1.86GB/s ± 2%  ~        (p=0.190 n=4+5)
Writer/blockSize_1000/level_5-8               283MB/s ± 1%   270MB/s ± 3%   -4.31%   (p=0.016 n=4+5)
Writer/blockSize_1000/level_10-8              25.2MB/s ± 1%  25.1MB/s ± 1%  ~        (p=0.730 n=4+5)
Writer/blockSize_10000/level_3-8              2.55GB/s ± 2%  2.61GB/s ± 2%  ~        (p=0.063 n=4+5)
Writer/blockSize_10000/level_5-8              872MB/s ± 2%   890MB/s ± 4%   ~        (p=0.190 n=4+5)
Writer/blockSize_10000/level_10-8             139MB/s ± 0%   140MB/s ± 2%   ~        (p=0.286 n=4+5)
Writer/blockSize_100000/level_3-8             1.86GB/s ± 3%  1.90GB/s ± 1%  ~        (p=0.286 n=4+5)
Writer/blockSize_100000/level_5-8             683MB/s ± 2%   689MB/s ± 4%   ~        (p=0.413 n=4+5)
Writer/blockSize_100000/level_10-8            199MB/s ± 2%   196MB/s ± 3%   ~        (p=0.286 n=4+5)
Writer/blockSize_300000/level_3-8             1.57GB/s ± 2%  1.59GB/s ± 1%  ~        (p=0.111 n=4+5)
Writer/blockSize_300000/level_5-8             541MB/s ± 1%   546MB/s ± 2%   ~        (p=0.413 n=4+5)
Writer/blockSize_300000/level_10-8            165MB/s ± 2%   166MB/s ± 2%   ~        (p=0.111 n=4+5)
```

You will notice that many results, particularly the ones working with tiny buffers, are up to about 20% slower. It turns out this is because the CGO pointer checks now take a significant share of the time. Then again, we're talking about a few nanoseconds, and the overhead is completely negligible with larger buffers, so IMO this isn't that bad. I've also made the Reader write directly into the provided buffer (if it's large enough), and those benchmarks show the biggest gain: about 5% faster when using large buffers. The ability to use the Go slice directly could also be used in the Writer, but let's leave that for another PR.

I've also re-run the benchmarks with `GODEBUG=cgocheck=0` (which disables the CGO pointer checks), and the results look even better:

I had to put them in a gist because GitHub didn't like such a long PR description: https://gist.github.com/mhr3/84f58f62353ef3b9db30288df00fa2b3