eBay / tsv-utils

eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
https://ebay.github.io/tsv-utils/
Boost Software License 1.0

tsv-append: limit number of rows per file? [feature request] #354

Open johann-petrak opened 2 years ago

johann-petrak commented 2 years ago

tsv-append is useful for combining several tsv files each with a header line.

However, one often also wants to include only the top k_i lines of the i-th file (e.g. after each of those files has been sorted by some criterion).

This can of course be done in several steps, but since tsv-append already exists, supporting it in this command would make it possible in a single, easy-to-understand step.

One way to implement this would perhaps be to make source tracking with -f optional and to allow enabling "top-n" processing:

So whenever -T/--topn is specified, if a file name ends in ":[0-9]+", this suffix specifies the maximum number of top data rows to include from that file (if the file has fewer rows, everything in it is included).
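
To make the proposed semantics concrete, here is a rough Python sketch of the behaviour I have in mind. It is not how tsv-append works today, and -T/--topn is only the option suggested in this request. The helper names `split_topn` and `append_topn` are made up for illustration, and files are modelled as lists of lines so the example stays self-contained.

```python
import re
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical suffix syntax from this request: "path:N" means "at most N data rows".
_TOPN_SUFFIX = re.compile(r"^(.*):([0-9]+)$")


def split_topn(arg: str) -> Tuple[str, Optional[int]]:
    """Split 'path:N' into (path, N); return (arg, None) if there is no numeric suffix."""
    match = _TOPN_SUFFIX.match(arg)
    if match and match.group(1):
        return match.group(1), int(match.group(2))
    return arg, None


def append_topn(args: Sequence[str],
                read_lines: Callable[[str], List[str]]) -> List[str]:
    """Concatenate files that each start with a header line.

    The header is written once (taken from the first non-empty file). For each
    file, at most N data rows are kept when the argument carries a ':N' suffix;
    a file with fewer rows contributes everything it has.
    """
    out: List[str] = []
    header_written = False
    for arg in args:
        path, limit = split_topn(arg)
        lines = read_lines(path)
        if not lines:
            continue  # empty file: nothing to append
        header, rows = lines[0], lines[1:]
        if not header_written:
            out.append(header)
            header_written = True
        # Slicing with None keeps all rows; slicing past the end is harmless.
        out.extend(rows[:limit])
    return out


# Small in-memory stand-in for files on disk (illustrative data only).
files = {
    "a.tsv": ["id\tscore", "1\t9", "2\t8", "3\t7"],
    "b.tsv": ["id\tscore", "4\t6"],
}
combined = append_topn(["a.tsv:2", "b.tsv:5"], files.__getitem__)
print("\n".join(combined))
```

Here `a.tsv:2` contributes only its first two data rows, while `b.tsv:5` contributes its single row because the file is shorter than the limit. One open question with this syntax is how to handle file names that genuinely end in a colon followed by digits, which is presumably why the suffix would only be interpreted when the option is given.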