Open pavlov99 opened 4 years ago
This was implemented in commit https://github.com/slothai/tabtools/commit/678e72894b1fb7b0120cbb07e05d65a369bce884.
Performance comparison: Python implementation vs. awk implementation.
The comparison covers the head of a file (the common pretty-printing case), a ~350-line file, and a >1k-line file.
Python version
cat file | time -f '%es' .env/bin/python -c 'from tabtools.scripts import *; ttpretty()'
Awk version
cat file | time -f '%es' ./bin/ttpretty
| File | Python Code | Awk Code |
|---|---|---|
| 7 columns, 10 rows (head) | 0.04s | 0.01s |
| 7 columns, 338 rows | 0.05s | 0.02s |
| 14 columns, 10k rows | 0.40s | 0.72s |
After the script was rewritten, the awk version outperforms the Python version:

| File | Python Code | Awk Code |
|---|---|---|
| 7 columns, 10 rows (head) | 0.04s | 0.01s |
| 7 columns, 338 rows | 0.05s | 0.02s |
| 14 columns, 10k rows | 0.42s | 0.38s |
At the moment, pretty table printing is implemented in Python. The program reads the whole input twice: once to calculate column widths and a second time to actually print.
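For reference, the two-pass idea can be sketched roughly as follows. This is not the actual tabtools code: the function name `pretty_rows`, the tab separator, and the ` | ` column delimiter are all assumptions made for illustration. Because stdin cannot be rewound, the sketch buffers all rows in memory and walks over them twice.

```python
def pretty_rows(lines, sep="\t"):
    """Return aligned output lines for delimited input lines.

    Hypothetical helper, not part of tabtools; assumes tab-separated input.
    """
    # Buffer everything: a pipe cannot be read a second time.
    rows = [line.rstrip("\n").split(sep) for line in lines]

    # Pass 1: find the widest cell in each column.
    widths = []
    for row in rows:
        for i, cell in enumerate(row):
            if i == len(widths):
                widths.append(len(cell))
            else:
                widths[i] = max(widths[i], len(cell))

    # Pass 2: pad every cell to its column width and join the cells.
    return [
        " | ".join(cell.ljust(widths[i]) for i, cell in enumerate(row)).rstrip()
        for row in rows
    ]
```

Calling `print("\n".join(pretty_rows(sys.stdin)))` after `import sys` would apply it to piped input. The memory cost grows with the input size, which is the trade-off any single-process approach on a pipe has to accept.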
Is it possible to implement such functionality in bash with reasonable limitations (e.g. header manipulation)?
See https://superuser.com/questions/557256/reading-the-same-stdin-with-two-commands-in-bash
https://stackoverflow.com/questions/10218103/os-x-linux-pipe-into-two-processes