buildkite / terminal-to-html

Converts arbitrary shell output (with ANSI) into beautifully rendered HTML
http://buildkite.github.io/terminal-to-html
MIT License

Pack style into uint32 #121

Closed DrJosh9000 closed 7 months ago

DrJosh9000 commented 7 months ago

The old node consists of the rune (blob), which is an int32, plus two pointers. On a 32-bit platform these align nicely. On a 64-bit platform, the compiler typically wastes 4 bytes on padding to keep the pointers 8-byte aligned.

Storing style as a packed int, and replacing *style with style, has a few benefits:

Also it's faster and uses less memory:

goos: linux
goarch: amd64
pkg: github.com/buildkite/terminal-to-html/v3
cpu: Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
                      │ baseline.txt │             packed.txt              │
                      │    sec/op    │   sec/op     vs base                │
RendererControl-12       3.631µ ± 1%   2.898µ ± 1%  -20.19% (p=0.000 n=30)
RendererCurl-12          50.02µ ± 0%   46.34µ ± 0%   -7.34% (p=0.000 n=30)
RendererHomer-12        104.55µ ± 0%   90.86µ ± 0%  -13.09% (p=0.000 n=30)
RendererDockerPull-12    181.2µ ± 0%   167.9µ ± 0%   -7.37% (p=0.000 n=30)
RendererPikachu-12       5.030m ± 0%   4.766m ± 0%   -5.26% (p=0.000 n=30)
RendererNpm-12           80.98m ± 2%   81.00m ± 1%        ~ (p=0.697 n=30)
geomean                  334.5µ        304.1µ        -9.10%

                      │ baseline.txt  │              packed.txt              │
                      │     B/op      │     B/op      vs base                │
RendererControl-12       2.227Ki ± 0%   1.477Ki ± 0%  -33.68% (p=0.000 n=30)
RendererCurl-12         10.766Ki ± 0%   7.750Ki ± 0%  -28.01% (p=0.000 n=30)
RendererHomer-12         46.91Ki ± 0%   33.39Ki ± 0%  -28.81% (p=0.000 n=30)
RendererDockerPull-12    48.41Ki ± 0%   33.66Ki ± 0%  -30.47% (p=0.000 n=30)
RendererPikachu-12       767.6Ki ± 0%   652.4Ki ± 0%  -15.01% (p=0.000 n=30)
RendererNpm-12           24.26Mi ± 0%   18.44Mi ± 0%  -23.99% (p=0.000 n=30)
geomean                  100.6Ki        73.56Ki       -26.89%

                      │ baseline.txt │              packed.txt               │
                      │  allocs/op   │  allocs/op   vs base                  │
RendererControl-12        9.000 ± 0%    9.000 ± 0%        ~ (p=1.000 n=30) ¹
RendererCurl-12           36.00 ± 0%    35.00 ± 0%   -2.78% (p=0.000 n=30)
RendererHomer-12          134.0 ± 0%    133.0 ± 0%   -0.75% (p=0.000 n=30)
RendererDockerPull-12     193.0 ± 0%    193.0 ± 0%        ~ (p=1.000 n=30) ¹
RendererPikachu-12       16.16k ± 0%   14.35k ± 0%  -11.23% (p=0.000 n=30)
RendererNpm-12           183.2k ± 0%   158.9k ± 0%  -13.27% (p=0.000 n=30)
geomean                   540.1         514.0        -4.83%
¹ all samples are equal

goos: darwin
goarch: arm64
pkg: github.com/buildkite/terminal-to-html/v3
                      │ baseline-darwin-arm64.txt │       packed-darwin-arm64.txt       │
                      │          sec/op           │   sec/op     vs base                │
RendererControl-10                    620.6n ± 5%   556.3n ± 1%  -10.35% (p=0.000 n=30)
RendererCurl-10                       8.306µ ± 0%   7.644µ ± 0%   -7.98% (p=0.000 n=30)
RendererHomer-10                      19.65µ ± 0%   16.55µ ± 1%  -15.81% (p=0.000 n=30)
RendererDockerPull-10                 32.65µ ± 0%   28.96µ ± 2%  -11.30% (p=0.000 n=30)
RendererPikachu-10                    739.5µ ± 0%   682.2µ ± 1%   -7.75% (p=0.000 n=30)
RendererNpm-10                        15.24m ± 0%   13.43m ± 0%  -11.85% (p=0.000 n=30)
geomean                               57.79µ        51.51µ       -10.88%

                      │ baseline-darwin-arm64.txt │       packed-darwin-arm64.txt        │
                      │           B/op            │     B/op      vs base                │
RendererControl-10                   2.227Ki ± 0%   1.477Ki ± 0%  -33.68% (p=0.000 n=30)
RendererCurl-10                     10.766Ki ± 0%   7.750Ki ± 0%  -28.01% (p=0.000 n=30)
RendererHomer-10                     46.91Ki ± 0%   33.39Ki ± 0%  -28.81% (p=0.000 n=30)
RendererDockerPull-10                48.41Ki ± 0%   33.66Ki ± 0%  -30.47% (p=0.000 n=30)
RendererPikachu-10                   767.6Ki ± 0%   652.4Ki ± 0%  -15.01% (p=0.000 n=30)
RendererNpm-10                       24.26Mi ± 0%   18.44Mi ± 0%  -23.99% (p=0.000 n=30)
geomean                              100.6Ki        73.56Ki       -26.89%

                      │ baseline-darwin-arm64.txt │        packed-darwin-arm64.txt        │
                      │         allocs/op         │  allocs/op   vs base                  │
RendererControl-10                     9.000 ± 0%    9.000 ± 0%        ~ (p=1.000 n=30) ¹
RendererCurl-10                        36.00 ± 0%    35.00 ± 0%   -2.78% (p=0.000 n=30)
RendererHomer-10                       134.0 ± 0%    133.0 ± 0%   -0.75% (p=0.000 n=30)
RendererDockerPull-10                  193.0 ± 0%    193.0 ± 0%        ~ (p=1.000 n=30) ¹
RendererPikachu-10                    16.16k ± 0%   14.35k ± 0%  -11.23% (p=0.000 n=30)
RendererNpm-10                        183.2k ± 0%   158.9k ± 0%  -13.27% (p=0.000 n=30)
geomean                                540.1         514.0        -4.83%
¹ all samples are equal

goos: darwin
goarch: amd64
pkg: github.com/buildkite/terminal-to-html/v3
cpu: Intel(R) Xeon(R) W-2191B CPU @ 2.30GHz
                      │ baseline.txt │             packed.txt              │
                      │    sec/op    │   sec/op     vs base                │
RendererControl-36       1.606µ ± 0%   1.172µ ± 1%  -27.00% (p=0.000 n=30)
RendererCurl-36          14.29µ ± 0%   12.52µ ± 0%  -12.36% (p=0.000 n=30)
RendererHomer-36         40.22µ ± 0%   32.54µ ± 0%  -19.09% (p=0.000 n=30)
RendererDockerPull-36    54.76µ ± 0%   47.35µ ± 0%  -13.52% (p=0.000 n=30)
RendererPikachu-36       1.312m ± 0%   1.212m ± 0%   -7.64% (p=0.000 n=30)
RendererNpm-36           23.35m ± 1%   21.53m ± 0%   -7.76% (p=0.000 n=30)
geomean                  107.5µ        91.58µ       -14.84%

                      │ baseline.txt  │              packed.txt              │
                      │     B/op      │     B/op      vs base                │
RendererControl-36       2.227Ki ± 0%   1.477Ki ± 0%  -33.68% (p=0.000 n=30)
RendererCurl-36         10.766Ki ± 0%   7.750Ki ± 0%  -28.01% (p=0.000 n=30)
RendererHomer-36         46.91Ki ± 0%   33.39Ki ± 0%  -28.81% (p=0.000 n=30)
RendererDockerPull-36    48.41Ki ± 0%   33.66Ki ± 0%  -30.47% (p=0.000 n=30)
RendererPikachu-36       767.6Ki ± 0%   652.4Ki ± 0%  -15.01% (p=0.000 n=30)
RendererNpm-36           24.26Mi ± 0%   18.44Mi ± 0%  -23.99% (p=0.000 n=30)
geomean                  100.6Ki        73.56Ki       -26.89%

                      │ baseline.txt │              packed.txt               │
                      │  allocs/op   │  allocs/op   vs base                  │
RendererControl-36        9.000 ± 0%    9.000 ± 0%        ~ (p=1.000 n=30) ¹
RendererCurl-36           36.00 ± 0%    35.00 ± 0%   -2.78% (p=0.000 n=30)
RendererHomer-36          134.0 ± 0%    133.0 ± 0%   -0.75% (p=0.000 n=30)
RendererDockerPull-36     193.0 ± 0%    193.0 ± 0%        ~ (p=1.000 n=30) ¹
RendererPikachu-36       16.16k ± 0%   14.35k ± 0%  -11.23% (p=0.000 n=30)
RendererNpm-36           183.2k ± 0%   158.9k ± 0%  -13.27% (p=0.000 n=30)
geomean                   540.1         514.0        -4.83%
¹ all samples are equal

DrJosh9000 commented 6 months ago

I ran the benchmarks on another computer I had handy (Linux under WSL). Here they are for posterity.

goos: linux
goarch: amd64
pkg: github.com/buildkite/terminal-to-html/v3
cpu: 12th Gen Intel(R) Core(TM) i9-12900K
                      │ baseline-linux-amd64.txt │       packed-linux-amd64.txt        │
                      │          sec/op          │   sec/op     vs base                │
RendererControl-24                   443.6n ± 1%   369.8n ± 1%  -16.66% (p=0.000 n=30)
RendererCurl-24                      5.344µ ± 0%   5.641µ ± 0%   +5.57% (p=0.000 n=30)
RendererHomer-24                     12.22µ ± 0%   11.39µ ± 0%   -6.76% (p=0.000 n=30)
RendererDockerPull-24                20.13µ ± 0%   20.15µ ± 0%        ~ (p=0.188 n=30)
RendererPikachu-24                   538.5µ ± 0%   516.8µ ± 0%   -4.02% (p=0.000 n=30)
RendererNpm-24                       12.83m ± 1%   11.38m ± 1%  -11.35% (p=0.000 n=30)
geomean                              39.89µ        37.58µ        -5.80%

                      │ baseline-linux-amd64.txt │        packed-linux-amd64.txt        │
                      │           B/op           │     B/op      vs base                │
RendererControl-24                  2.227Ki ± 0%   1.492Ki ± 0%  -32.98% (p=0.000 n=30)
RendererCurl-24                    10.766Ki ± 0%   7.766Ki ± 0%  -27.87% (p=0.000 n=30)
RendererHomer-24                    46.91Ki ± 0%   33.41Ki ± 0%  -28.78% (p=0.000 n=30)
RendererDockerPull-24               48.41Ki ± 0%   33.67Ki ± 0%  -30.44% (p=0.000 n=30)
RendererPikachu-24                  767.6Ki ± 0%   652.4Ki ± 0%  -15.01% (p=0.000 n=30)
RendererNpm-24                      24.26Mi ± 0%   18.44Mi ± 0%  -23.99% (p=0.000 n=30)
geomean                             100.6Ki        73.73Ki       -26.73%

                      │ baseline-linux-amd64.txt │        packed-linux-amd64.txt         │
                      │        allocs/op         │  allocs/op   vs base                  │
RendererControl-24                    9.000 ± 0%    9.000 ± 0%        ~ (p=1.000 n=30) ¹
RendererCurl-24                       36.00 ± 0%    35.00 ± 0%   -2.78% (p=0.000 n=30)
RendererHomer-24                      134.0 ± 0%    133.0 ± 0%   -0.75% (p=0.000 n=30)
RendererDockerPull-24                 193.0 ± 0%    193.0 ± 0%        ~ (p=1.000 n=30) ¹
RendererPikachu-24                   16.16k ± 0%   14.35k ± 0%  -11.23% (p=0.000 n=30)
RendererNpm-24                       183.2k ± 0%   158.9k ± 0%  -13.27% (p=0.000 n=30)
geomean                               540.1         514.0        -4.83%
¹ all samples are equal