mike239x closed this issue 4 years ago.
Thank you for the feedback.
I agree, it would be nice, but not more than that. I don't really see a problem with the current speed, as I don't think that performance (at the current level) is critical for a hexdump tool. `hexyl` processes around 10 MiB of data per second. It outputs text much faster than terminal emulators can render it (in terminator, `hexyl` is a factor of 5 slower when I write to the TTY).
In which real-world use case would we really need it to be faster?
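A quick way to reproduce that kind of comparison (a minimal sketch; `data` stands for any hypothetical test file):

```bash
# Time hexyl's raw formatting throughput, with no terminal involved...
time hexyl data > /dev/null
# ...versus the same work when the terminal emulator has to render the output.
time hexyl data
```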
I tried to find a "real-world use case" but failed. I would say it is an ideological thing... something along the lines of "software shouldn't get slower with time, but faster".
I'll take a look at the source code in my free time; maybe (though unlikely) I'll find a way to improve it :)
"software shouldn't get slower with time, but faster"
I would agree. But hexyl
is about adding additional functionality (the colorized output). It's not trying to be a 1:1 replacement for xxd
.
Real-world use case -- I've got a pretty big file that's mostly zeroes, with a kilobyte or so of nonzero data. Reading from /tmp, `hexyl` takes 55.791s, `hexdump -C` takes 1.091s, and `xxd >/dev/null` takes 40.528s.
@remexre Thank you.

If someone wants to work on this, here is a reproducible benchmark (I'm using `hyperfine`):
```bash
#!/bin/bash
# Build a 10 MiB test file that is all zeroes except for a final 1 KiB of random data.
dd if=/dev/zero bs=10M count=1 > data
dd if=/dev/urandom bs=1k count=1 >> data

# Note: hexdump uses -v to disable squeezing.
hyperfine --warmup 3 \
    'hexyl data' \
    'hexyl --no-squeezing data' \
    'hexdump -C data' \
    'hexdump -v -C data' \
    'xxd data' \
    --export-markdown results.md
```
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `hexyl data` | 1.037 ± 0.023 | 1.014 | 1.078 | 63.1 |
| `hexyl --no-squeezing data` | 1.289 ± 0.022 | 1.261 | 1.319 | 78.4 |
| `hexdump -C data` | 0.016 ± 0.001 | 0.016 | 0.018 | 1.0 |
| `hexdump -v -C data` | 1.921 ± 0.014 | 1.902 | 1.943 | 116.8 |
| `xxd data` | 0.707 ± 0.008 | 0.701 | 0.729 | 43.0 |
Apparently, `hexdump`'s "squeezing" mode is really good.
see #73
> Real-world use case -- I've got a pretty big file that's mostly zeroes, with a kilobyte or so of nonzero data. Reading from /tmp, `hexyl` takes 55.791s, `hexdump -C` takes 1.091s, `xxd >/dev/null` takes 40.528s.
Old commit, but here's another use case for posterity. I want to compare two disk images, and I want to not only see where the data differs but also what the differing data is, in a hexdump format. To do that, I like using tools like this to produce a plaintext version of the data that can then be `diff`ed. Storing the huge files isn't an issue (either the diff can be piped directly, or the huge files can be stored on a compressed filesystem).
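A sketch of that workflow with bash process substitution (the image names `a.img` and `b.img` are hypothetical; `--no-squeezing` keeps every line present in both dumps so the diff context is complete):

```bash
# Hexdump both images as text and diff them directly,
# without storing the intermediate dumps on disk.
diff -u <(hexyl --no-squeezing a.img) <(hexyl --no-squeezing b.img) | less
```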
I did a bit of benchmarking on my machine, on a file of about 700M, and I can't help but notice that `xxd` is faster than `hexyl`. It would be nice to beat `xxd` in speed... I have no idea how to do it, though.
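A hedged sketch of how that comparison might be reproduced (the ~700M size comes from the comment above; the file name `big.bin` and the use of `/dev/urandom` are assumptions):

```bash
# Create a ~700M test file, then time both tools writing to /dev/null
# so terminal rendering speed doesn't dominate the measurement.
dd if=/dev/urandom of=big.bin bs=1M count=700
time xxd big.bin > /dev/null
time hexyl big.bin > /dev/null
```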