cleanWithoutTrim creates in-memory copies
of the contents of every ast.Text in a document.
This causes a fair bit of unnecessary allocation.
This PR replaces cleanWithoutTrim with a streaming writer
that implements equivalent behavior
without copying byte slices in memory.
Instead, it writes directly to the destination Writer.
To verify parity of the implementation,
this retains cleanWithoutTrim in a test file,
fuzzes both implementations together,
and compares their outputs.
You can check out this PR and try it out yourself:
go test -run '^$' -fuzz . -v github.com/Kunde21/markdownfmt/v2/markdown
Results
This PR also adds a benchmark
that measures the cost of rendering
all input files inside testfiles/.
The benchstat comparison before and after this change shows that allocations are down and performance is up across the board.
Note that I'm running these benchmarks on a pretty low-end machine,
so the CPU time is higher than it would normally be:
goos: linux
goarch: amd64
pkg: github.com/Kunde21/markdownfmt/v2/markdownfmt
cpu: Intel(R) Celeron(R) N4020 CPU @ 1.10GHz
To run the benchmark locally,
check out this PR and run:
IMPORTPATH=github.com/Kunde21/markdownfmt/v2/markdownfmt
git checkout HEAD~ &&
go test -run '^$' -bench . -v -benchmem -count 5 $IMPORTPATH | tee before.txt &&
git checkout - &&
go test -run '^$' -bench . -v -benchmem -count 5 $IMPORTPATH | tee after.txt &&
benchstat before.txt after.txt
You'll need to install benchstat first to generate the final result.