This PR completes basic support for line buffering in the toolkit. It is a follow-up to PRs #333, #334, and #335.
By default, tools read and write in a buffered mode, where data is transferred in large blocks. This is significantly faster than reading and writing line by line. However, reading and writing each line as it becomes available is desirable when reading from live input streams that produce input only occasionally.
Most tools now support a --line-buffered option that switches to line buffering mode. Tools supporting this are: number-lines, tsv-append, tsv-filter, tsv-join, tsv-sample, tsv-select, tsv-uniq.
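To illustrate the difference between the two modes, here is a minimal Python sketch. It is not the toolkit's implementation; `copy_lines` and `FlushCountingWriter` are hypothetical names introduced only for this example. The idea is that in line-buffered mode the output is flushed after every line, while in the default mode the output is flushed once at the end (a real stream would also flush whenever its internal block buffer fills).

```python
import io


def copy_lines(src, dst, line_buffered=False):
    """Copy lines from src to dst and return the number of lines copied.

    Hypothetical helper: when line_buffered is True, dst is flushed after
    every line so a downstream reader sees each line immediately.
    Otherwise dst is flushed only once, after all input is consumed.
    """
    count = 0
    for line in src:
        dst.write(line)
        if line_buffered:
            dst.flush()  # push this line downstream right away
        count += 1
    dst.flush()  # final flush so no output is left in the buffer
    return count


class FlushCountingWriter(io.StringIO):
    """In-memory writer that counts flush calls (for demonstration only)."""

    def __init__(self):
        super().__init__()
        self.flushes = 0

    def flush(self):
        self.flushes += 1
        super().flush()


if __name__ == "__main__":
    for mode in (False, True):
        out = FlushCountingWriter()
        n = copy_lines(io.StringIO("a\nb\nc\n"), out, line_buffered=mode)
        print(f"line_buffered={mode}: {n} lines, {out.flushes} flushes")
```

The trade-off is the one described above: per-line flushing makes output available sooner on slow, live inputs, at the cost of many more write operations on large inputs.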
This PR also cleans up some code related to header line processing and stdout flushing. This results in better error message handling in a few cases: error messages are more timely in Unix pipelines, and they are written after all processed output has been flushed.