noahgorstein / jqp

A TUI playground to experiment with jq
MIT License
2.19k stars 40 forks

fix: dynamically increase buffer size to handle processing large JSON lines #93

Closed noahgorstein closed 4 months ago

noahgorstein commented 4 months ago

Per the bufio docs (https://pkg.go.dev/bufio#Scanner.Buffer):

Buffer sets the initial buffer to use when scanning and the maximum size of buffer that may be allocated during scanning. The maximum token size must be less than the larger of max and cap(buf). If max <= cap(buf), Scanner.Scan will use this buffer only and do no allocation.

By default, Scanner.Scan uses an internal buffer and sets the maximum token size to MaxScanTokenSize. ...

This became a problem when I introduced code to handle NDJSON (JSON Lines) as input. If one of the lines was larger than 64KB (MaxScanTokenSize), or if the JSON was minified so that it was all on one line, scanning failed: the Scanner's maximum buffer size was too small to hold the line, so Scanner.Scan stopped early and Scanner.Err reported bufio.ErrTooLong.
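To make the failure described above concrete, here is a minimal sketch that reproduces it with a default Scanner. The helper name `scanWithDefaultBuffer` and the sample payload are illustrative assumptions, not code from jqp.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// scanWithDefaultBuffer (hypothetical helper) counts the lines in input using
// a Scanner with its default buffer, whose token limit is bufio.MaxScanTokenSize.
func scanWithDefaultBuffer(input string) (int, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))
	lines := 0
	for scanner.Scan() {
		lines++
	}
	// Err returns nil on a clean EOF and bufio.ErrTooLong when a line did not fit.
	return lines, scanner.Err()
}

func main() {
	// A single minified JSON line that is larger than 64KB.
	long := `{"data":"` + strings.Repeat("x", bufio.MaxScanTokenSize) + `"}` + "\n"

	_, err := scanWithDefaultBuffer(long)
	fmt.Println(errors.Is(err, bufio.ErrTooLong)) // prints: true
}
```

Short lines scan normally with the same helper. Only a line that reaches the token limit triggers the error.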

This PR still starts with a 64KB buffer to process each line, but retries whenever the buffer is not large enough, doubling the buffer size on each retry. The max buffer size is 100MB. That limit is somewhat arbitrary, but jqp will not be performant at this scale anyway because of syntax highlighting and writing large input/output to viewports, so I will keep it there for now.
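The retry-and-double approach can be sketched roughly as follows. This is not the PR's actual code: the function name `scanLines`, the constants, and the assumption that the whole input is held in memory (so it can be rescanned from the start) are all illustrative.

```go
package main

import (
	"bufio"
	"bytes"
	"errors"
	"fmt"
	"strings"
)

const (
	initialBufSize = 64 * 1024         // assumed starting size (64KB)
	maxBufSize     = 100 * 1024 * 1024 // assumed upper limit (100MB)
)

// scanLines (hypothetical helper) splits data into lines. When a line does not
// fit in the current buffer, it rescans from the start with a buffer twice as
// large, and gives up once the buffer has reached limit.
func scanLines(data []byte, initial, limit int) ([]string, error) {
	for size := initial; ; size *= 2 {
		if size > limit {
			size = limit
		}
		scanner := bufio.NewScanner(bytes.NewReader(data))
		// max <= cap(buf), so the Scanner uses only this buffer.
		scanner.Buffer(make([]byte, 0, size), size)

		var lines []string
		for scanner.Scan() {
			lines = append(lines, scanner.Text())
		}
		err := scanner.Err()
		if err == nil {
			return lines, nil
		}
		// Retry only when the buffer was too small and can still grow.
		if !errors.Is(err, bufio.ErrTooLong) || size >= limit {
			return nil, err
		}
	}
}

func main() {
	// One short line followed by one 200KB line.
	input := []byte("{\"a\":1}\n" + strings.Repeat("x", 200*1024) + "\n")

	lines, err := scanLines(input, initialBufSize, maxBufSize)
	fmt.Println(len(lines), err)
}
```

With these sizes the 200KB line fails at 64KB and 128KB, then succeeds at 256KB. Rescanning from the start keeps the sketch simple. An implementation reading from a stream would need to buffer or rewind the input.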