Closed sharifhsn closed 1 year ago
I have a pull request ready for terminal width, but it requires adding the terminal_width
dependency. I'll wait to submit it until this request is approved.
Thank you very much. Works great!
Does this have any impact on performance? (see e.g. #66 (comment) on how to perform
hexyl
benchmarks usinghyperfine
)
No, I'll do that now. I couldn't find a benchmark harness before, thank you for pointing me to it.
We should probably add a script with a few useful benchmarks, true. For this PR (or in general), I would suggest to use something like:
#!/bin/bash
dd if=/dev/urandom bs=1M count=1 of=/tmp/data
hyperfine \
--warmup 3 \
--shell=none \
--output=pipe \
--export-markdown=result-pipe.md \
'hexyl /tmp/data'
hyperfine \
--warmup 3 \
--shell=none \
--output=inherit \
--export-markdown=result-inherit.md \
'hexyl /tmp/data'
rm /tmp/data
Here, we use hyperfine
s new --output
option to specify where to redirect the output of the benchmarked program. Using "pipe" is like doing hexyl … | cat > /dev/null
(which is much different from doing hexyl … > /dev/null
because hexyl
doesn't directly write to /dev/null
and kernel optimizations don't apply, i.e. write
calls actually need to take place). And using "inherit" is like calling the program with the TTY attached (this will output a lot to your screen).
You can also create two binaries, hexyl-master
and hexyl-feature
, and then run something like:
hyperfine \
--warmup 3 \
--shell=none \
--output=pipe \
--parameter-list version master,feature
--export-markdown=result-pipe.md \
'./hexyl-{version} /tmp/data'
to see the direct comparison.
Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
---|---|---|---|---|
hexyl data |
421.3 ± 14.3 | 404.5 | 444.0 | 1.24 ± 0.11 |
hexyl --no-squeezing data |
563.5 ± 14.7 | 542.3 | 583.4 | 1.65 ± 0.15 |
target/release/hexyl data |
433.0 ± 16.0 | 409.1 | 462.5 | 1.27 ± 0.12 |
target/release/hexyl --no-squeezing data |
526.2 ± 18.1 | 506.6 | 559.6 | 1.55 ± 0.14 |
target/release/hexyl --columns 4 data |
367.4 ± 12.7 | 347.8 | 382.6 | 1.08 ± 0.10 |
target/release/hexyl --columns 4 --no-squeezing data |
407.6 ± 14.2 | 375.5 | 426.5 | 1.20 ± 0.11 |
target/release/hexyl --columns 8 data |
340.5 ± 28.6 | 306.6 | 399.2 | 1.00 |
target/release/hexyl --columns 8 --no-squeezing data |
351.9 ± 15.3 | 318.9 | 366.2 | 1.03 ± 0.10 |
These are my results using 1 MB of /dev/urandom
. I used my system-installed binary of hexyl
to compare since I'm pretty sure there were no feature changes since the last release.
It seems as though for the default width of two columns the new version is slightly slower, but within noise for the benchmark. With more columns it's much faster, but that of course makes sense since it's writing fewer lines.
One more thing; I've implemented columns
here as a u8
, since I find it unlikely that anyone would ever use over 255 columns in the viewer. This also places a cap on terminal width at 8969. Is this desirable, or should I change the type to u16
?
I've added the terminal width setting, so this should completely close #13.
One more thing; I've implemented
columns
here as au8
, since I find it unlikely that anyone would ever use over 255 columns in the viewer. This also places a cap on terminal width at 8969. Is this desirable, or should I change the type tou16
?
I wanted to remark this in my first review but didn't want to be overly annoying :smile:. In my personal experience: trying to use size-limited (unsigned) integers (u8, u16, i8, i16) in places which are not size- or performance critical is almost always a mistake. It's usually meant as a (premature) optimization. But the reality is that it (1) leads to more errors due to overflows (2) leads to more complicated code due to more explicit casts between 8-bit, 16-bit and 32-bit numbers and (3) doesn't even necessarily generate faster code (because if they are not closely packed, we can't really make use of them) or smaller data structs (due to alignment mismatches).
So personally, for a single parameter, I would never mind trying to go for a small integer type and simply go with a 32-bit integer, or even isize
/usize
. Often, it's even undesirable to use unsigned integers for parameters that cannot (reasonably) be negative, because again, it might lead to more complicated code, and it can lead to tricky overflow errors.
Feel free to bump the min. supported Rust version if that causes problems.
Concerning the newly added options:
Would it be possible to scratch --auto-width
and use something like --columns=max
or --columns=auto
instead?
Also, I think --terminal-width
should be usable with other "width-setting" options. I think of it purely as an overwrite for the auto-detected terminal width. What do you think?
I've added the terminal width setting, so this should completely close #13.
Does it? I believe the issue also asks for custom widths that would not be multiples of 16. Like if you have some data structure with elements that are 14 byte long. Then you might want to specify --width=14
.
One more thing; I've implemented
columns
here as au8
, since I find it unlikely that anyone would ever use over 255 columns in the viewer. This also places a cap on terminal width at 8969. Is this desirable, or should I change the type tou16
?I wanted to remark this in my first review but didn't want to be overly annoying smile. In my personal experience: trying to use size-limited (unsigned) integers (u8, u16, i8, i16) in places which are not size- or performance critical is almost always a mistake. It's usually meant as a (premature) optimization. But the reality is that it (1) leads to more errors due to overflows (2) leads to more complicated code due to more explicit casts between 8-bit, 16-bit and 32-bit numbers and (3) doesn't even necessarily generate faster code (because if they are not closely packed, we can't really make use of them) or smaller data structs (due to alignment mismatches).
So personally, for a single parameter, I would never mind trying to go for a small integer type and simply go with a 32-bit integer, or even
isize
/usize
. Often, it's even undesirable to use unsigned integers for parameters that cannot (reasonably) be negative, because again, it might lead to more complicated code, and it can lead to tricky overflow errors.
Thank you for saying this! I am quite new to Rust so I appreciate the advice. Making it a u16
eliminates a cast (since terminal_width
uses a u16
) so I'll use that instead. Using a bigger value would be more of a hassle than a u16
.
Concerning the newly added options:
Would it be possible to scratch
--auto-width
and use something like--columns=max
or--columns=auto
instead?
Of course.
Also, I think
--terminal-width
should be usable with other "width-setting" options. I think of it purely as an overwrite for the auto-detected terminal width. What do you think?
I was only going off of the original issue's suggestions. The use case for --terminal-width
is probably in a situation where the width is necessarily limited for some reason. In that case, I would pick the minimum between the two options i.e. prefer --columns
unless the number of columns exceeds that which is dictated by --terminal-width
.
I've added the terminal width setting, so this should completely close #13.
Does it? I believe the issue also asks for custom widths that would not be multiples of 16. Like if you have some data structure with elements that are 14 byte long. Then you might want to specify
--width=14
.
Ah, I wasn't aware that was in the scope of the original issue. That would probably require a more significant rewrite.
Ah, I wasn't aware that was in the scope of the original issue. That would probably require a more significant rewrite.
That's okay. Let's focus on --columns
for now.
Ok, I think I misunderstood the initial issue. The custom number of columns refers to the individual columns, whereas my pull request focuses on customizing the number of panels. With that in mind, I think I should rename columns
to panels
and reflect that in the documentation.
A future change that customizes columns per panel would therefore be able to slot into this change better.
Another option would be to leave the option name as is, but change the meaning. And then for this PR, we only allow --columns=8,16,24,…
. What do you think?
Another option would be to leave the option name as is, but change the meaning. And then for this PR, we only allow
--columns=8,16,24,…
. What do you think?
I think it would be more ergonomic here to have the two options, columns-per-panel and number-of-panels.
In my mind, the idea behind variable number columns is to have the panel size itself adjust. With the other option, the panel size would always be 8 bytes, and a future ability to have e.g. columns=14
would leave blank space at the end.
It would be better to set columns=7
and panels=2
which is more explicit in my opinion.
You are right.
Let's change the command line option name then 👍🏼
@sharkdp I think we can declare this PR complete? I don't see any other roadblocks.
I have a commit ready for a builder pattern. It uses a new PrinterBuilder struct, so it can technically be implemented in a non-breaking way, although that's probably undesirable. It's quite out of scope for this pull request, so I'll submit a new one once this one is approved.
That would be great! If we do that, I would make that a breaking change, i.e. remove the old new
method.
@sharkdp I think we can declare this PR complete? I don't see any other roadblocks.
Yes. Looks great. Two things: the inline comment regarding the short options. And it would be great if you could add a CHANGELOG entry in the "unreleased" section. The format for entries is:
- Description what has been changed, see #123 (@user)
where #123
links to the bug ticket and/or the PR and user
is your username.
Thank you!
This pull request fixes #13 partially by adding the
--columns
option to manually set the number of columns. The default is 2 as with previous behavior.Some notes:
-w
not-c
as that option is already reserved for--bytes
cargo test