Open 9ao9ai9ar opened 3 weeks ago
Compiling without llvm probably makes it a lot smaller. My frawk binary is 9.4M stripped and 17M unstripped.
I rarely see mawk being much faster than frawk (except for https://github.com/ezrosent/frawk/issues/98).
Although I guess it might be possible if you test on small files (as the overhead of compiling the awk script in the case of frawk might be most of the runtime).
Do you have some example scripts?
How do I build frawk without LLVM? Like this?
cargo +nightly install --path . --no-default-features
I've used frawk with all optimization levels and backends in my scripts in this repo (invitation sent). With the exception of rg3.sh, where frawk runs faster but only on modern hardware (SSD instead of HDD), all versions are consistently at least a second or two slower with frawk than with mawk, so the margin is definitely noticeable. (You could either define a function mawk in benchmark.sh that calls frawk to make an in-place override to compare the results, or make a copy of each script in solutions that uses frawk instead of mawk/awk.) I haven't tried the comparison with the LumbrasGigaBase dataset though, where the files are much fewer in number and larger in size.
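For the in-place override, a minimal sketch of what I mean is below. It assumes benchmark.sh is a bash/POSIX shell script and that frawk is on your PATH; I'm also assuming the awk programs in the scripts don't rely on mawk-only command-line options, since the function forwards all arguments unchanged.

```shell
# Hypothetical override, placed near the top of benchmark.sh:
# every later call to `mawk` in this shell is redirected to frawk,
# with all arguments passed through as-is.
mawk() {
    frawk "$@"
}
```

A shell function is only visible in the shell that defines it. If benchmark.sh runs the solution scripts as separate bash processes, you would also need `export -f mawk` (bash-specific) after the definition, otherwise those scripts will still pick up the real mawk.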
# Without LLVM, but with other recommended defaults
$ cargo +nightly install --path . --no-default-features --features use_jemalloc,allow_avx2,unstable
frawk, when fully stripped, is still an order of magnitude larger than ripgrep, a similar Rust CLI program that is already an order of magnitude larger than traditional UNIX CLI tools. The large size gives the impression that the program is bloated, even more so when my benchmark shows that it is slower than mawk by some margin.