Is my understanding correct that this would effectively limit the maximum memory used by t2html - meaning the limit is determined by `maxLines`, not by the size of the input? Have you tried putting a bigger input file through the benchmarks to see what that looks like? e.g. 5MB compared to 0.5MB
That is the goal: to limit the memory needed to process larger and larger input. In practice it will still allocate quite a lot, but should garbage-collect more of it.
Here's the same benchmark on a 17MiB file (several copies of `fixtures/npm.sh.raw`):

```
BenchmarkRendererBig-10        3    356336375 ns/op    392413698 B/op    5244420 allocs/op
BenchmarkStreamingBig-10       4    322766854 ns/op    166112616 B/op    5042078 allocs/op
```
166 MB of memory allocations per run still sounds alarming, but that number is a "total" - it doesn't decrease as the garbage collector frees memory. If I stick a little diagnostic code at the end of `parseToScreen` to see what's up:
```go
var m runtime.MemStats
runtime.ReadMemStats(&m)
log.Printf("TotalAlloc: %d\tHeapAlloc: %d\tHeapInuse: %d", m.TotalAlloc, m.HeapAlloc, m.HeapInuse)
```
and do a single run with the input streamed into the parser (`io.Copy`):
```
2024/01/18 11:26:16 TotalAlloc: 839816     HeapAlloc: 839816     HeapInuse: 1441792
2024/01/18 11:26:16 TotalAlloc: 1212656    HeapAlloc: 1212656    HeapInuse: 1761280
2024/01/18 11:26:16 TotalAlloc: 1558336    HeapAlloc: 1558336    HeapInuse: 2097152
...snip...
2024/01/18 11:26:17 TotalAlloc: 173126152  HeapAlloc: 1782400    HeapInuse: 2441216
2024/01/18 11:26:17 TotalAlloc: 173426456  HeapAlloc: 2082704    HeapInuse: 2727936
2024/01/18 11:26:17 TotalAlloc: 173618552  HeapAlloc: 2274800    HeapInuse: 2924544
```
`HeapInuse` is the most useful, and hovered ±1.7 MB around a mean of 3.1 MB.
The same thing without using `MaxLines`:
```
2024/01/18 11:34:43 TotalAlloc: 836776     HeapAlloc: 836776     HeapInuse: 1359872
2024/01/18 11:34:43 TotalAlloc: 1222880    HeapAlloc: 1222880    HeapInuse: 1744896
2024/01/18 11:34:43 TotalAlloc: 1527456    HeapAlloc: 1527456    HeapInuse: 2072576
...snip...
2024/01/18 11:34:43 TotalAlloc: 248025120  HeapAlloc: 154157880  HeapInuse: 166273024
2024/01/18 11:34:43 TotalAlloc: 248375520  HeapAlloc: 154508280  HeapInuse: 166453248
2024/01/18 11:34:43 TotalAlloc: 248631216  HeapAlloc: 154763976  HeapInuse: 166600704
```
i.e. heap usage grows, does not shrink, and in the end has used 166 MB without being able to free it.
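For anyone who wants to reproduce this kind of measurement without patching `parseToScreen`, here's a minimal self-contained sketch. It's an approximation under assumptions: `memLogger`, the `io.Discard` stand-in for the parser, and the `big.raw` filename are all hypothetical; the real diagnostic was the three lines added to `parseToScreen` above.

```go
package main

import (
	"io"
	"log"
	"os"
	"runtime"
)

// memLogger forwards each chunk to the wrapped writer, then logs heap
// statistics, mimicking the diagnostic added to parseToScreen.
type memLogger struct{ w io.Writer }

func (m memLogger) Write(p []byte) (int, error) {
	n, err := m.w.Write(p)
	var s runtime.MemStats
	runtime.ReadMemStats(&s)
	log.Printf("TotalAlloc: %d\tHeapAlloc: %d\tHeapInuse: %d", s.TotalAlloc, s.HeapAlloc, s.HeapInuse)
	return n, err
}

func main() {
	f, err := os.Open("big.raw") // stand-in for the 17MiB fixture
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// In the real run the destination is the streaming parser; io.Discard
	// is a placeholder so this sketch compiles and runs on its own.
	if _, err := io.Copy(memLogger{io.Discard}, f); err != nil {
		log.Fatal(err)
	}
}
```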
The goal
Allow terminal output to be processed in a streaming manner, at the expense of not buffering the full history of the input. Enable much larger input to be processed without exhausting memory.
Why
The memory improvements of the last couple of PRs (#121, #124) whittle the memory allocated per run for the NPM benchmark from ~24MiB down to ~14MiB, but on a ~0.5MiB input that's still a hefty blowup.
The overwhelming majority of typical terminal output writes to new lines. Sometimes a command on a PTY will erase the previous line (e.g. `git`). Some tools will rewrite the last 10 or so lines (`bazel`, `docker`). In all cases, sensible CLI tools should assume the terminal window is no larger than a few hundred lines. So why do we need to buffer more lines than that?
By streaming out all the lines we no longer need to touch, they don't need to be kept in memory, so they can be freed. The total memory needed to process input can be limited, rather than having to scale with the size of the input. This will enable displaying much larger logs.
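To make that concrete, here's a toy sketch of the memory behaviour (not the actual screen implementation; `boundedScreen` and its fields are made-up names): at most `maxLines` lines stay resident, and anything older is rendered to the output and its reference dropped so the garbage collector can reclaim it.

```go
package main

import (
	"fmt"
	"strings"
)

// boundedScreen is a toy model: at most maxLines lines stay in memory;
// older lines are written to out and their references dropped.
type boundedScreen struct {
	maxLines int
	lines    []string
	out      strings.Builder
}

func (s *boundedScreen) appendLine(line string) {
	s.lines = append(s.lines, line)
	for len(s.lines) > s.maxLines {
		s.out.WriteString(s.lines[0])
		s.out.WriteByte('\n')
		s.lines[0] = "" // drop the reference so the GC can reclaim the line
		s.lines = s.lines[1:]
	}
}

func main() {
	s := &boundedScreen{maxLines: 3}
	for i := 1; i <= 10; i++ {
		s.appendLine(fmt.Sprintf("line %d", i))
	}
	fmt.Print("streamed out:\n", s.out.String())
	fmt.Println("still buffered:", s.lines) // only the last 3 lines remain live
}
```

The real screen is more involved - lines inside the window can still be edited or erased - but the memory property is the same: live memory scales with the line limit, not with the size of the input.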
How
The key functionality was added in #118; the main change here is to tidy up the key edge case (preserving parser state between chunks of input) and default to using it. Pass `--buffer-max-lines=0` to disable the line limit. It applies to all modes: stdin, file, and the web service.

But while I'm here...
- `strings.Builder` instead of `bytes.Buffer`, and avoid a few casts to `[]byte` - it's slightly faster.
- Replaced the `strings.Replace` that inserted `&nbsp;` by generating `&nbsp;` for every line that would have been empty.

Show me the benchmarks!
The streaming benchmark doesn't exist before this PR, so I copied it into `main` temporarily (and fixed it so it would build). The speedups are mostly to do with `strings.Builder` and avoiding casts to `[]byte`. The 46% memory saving before/after in the streaming benchmark is primarily due to the screen line recycling.
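As a generic illustration of the `strings.Builder` point (not the project's code): finishing a `bytes.Buffer` with its `String()` method copies the accumulated bytes into a new string, whereas `strings.Builder.String()` hands back its contents without another copy - a small saving per call, consistent with "slightly faster".

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

func withBuffer(lines []string) string {
	var buf bytes.Buffer
	for _, l := range lines {
		buf.WriteString(l)
		buf.WriteByte('\n')
	}
	return buf.String() // copies the accumulated bytes into a new string
}

func withBuilder(lines []string) string {
	var b strings.Builder
	for _, l := range lines {
		b.WriteString(l)
		b.WriteByte('\n')
	}
	return b.String() // no extra copy: the builder's bytes back the string directly
}

func main() {
	lines := []string{"hello", "world"}
	fmt.Print(withBuffer(lines))
	fmt.Print(withBuilder(lines))
}
```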