kyclark / command-line-rust

Code for Command-Line Rust (O'Reilly, 2024, ISBN 9781098109417)
https://learning.oreilly.com/library/view/command-line-rust/9781098109424/
MIT License

Test failure output for headr is unfortunate #3

Open carols10cents opened 3 years ago

carols10cents commented 3 years ago

Putting this as an issue on this repo rather than commenting on the book because this problem isn't actually anything to do with the book text; it only exists in this repo.

I realize that because headr in chapter 3 is demonstrating how splitting UTF-8 codepoints when requesting a number of bytes results in invalid UTF-8, you can't match on Rust strings and you have to use predicate::eq(&expected.as_bytes() as &[u8]). Buuut it makes trying to figure out why a test is failing rather frustrating :(

For example:

---- multiple_files_c1 stdout ----
thread 'multiple_files_c1' panicked at 'Unexpected stdout, failed var == [61, 61, 62, 32, 46, 47, 116, 101, 115, 116, 115, 47, 105, 110, 112, 117, 116, 115, 47, 101, 109, 112, 116, 121, 46, 116, 120, 116, 32, 60, 61, 61, 10, 10, 61, 61, 62, 32, 46, 47, 116, 101, 115, 116, 115, 47, 105, 110, 112, 117, 116, 115, 47, 111, 110, 101, 46, 116, 120, 116, 32, 60, 61, 61, 10, 239, 191, 189, 10, 61, 61, 62, 32, 46, 47, 116, 101, 115, 116, 115, 47, 105, 110, 112, 117, 116, 115, 47, 116, 119, 111, 46, 116, 120, 116, 32, 60, 61, 61, 10, 84, 10, 61, 61, 62, 32, 46, 47, 116, 101, 115, 116, 115, 47, 105, 110, 112, 117, 116, 115, 47, 116, 104, 114, 101, 101, 46, 116, 120, 116, 32, 60, 61, 61, 10, 84]

command=`"/Users/carolnichols/rust/hands-on/carols-solutions/headr2/target/debug/headr" "./tests/inputs/empty.txt" "./tests/inputs/one.txt" "./tests/inputs/two.txt" "./tests/inputs/three.txt" "-c" "1"`
code=0
stdout=```"Öne line, four words.\nTwo lines.\nFour words.\nThree\r\nlines,\r\nfour words.\n"```
stderr=```""```
', /Users/carolnichols/.cargo/registry/src/github.com-1ecc6299db9ec823/assert_cmd-1.0.8/src/assert.rs:124:9

Or a different failure, where I'm printing the invalid UTF-8 correctly but messed up the newlines, which is hard to tell from this message:

---- multiple_files_c1 stdout ----
thread 'multiple_files_c1' panicked at 'Unexpected stdout, failed var == [61, 61, 62, 32, 46, 47, 116, 101, 115, 116, 115, 47, 105, 110, 112, 117, 116, 115, 47, 101, 109, 112, 116, 121, 46, 116, 120, 116, 32, 60, 61, 61, 10, 10, 61, 61, 62, 32, 46, 47, 116, 101, 115, 116, 115, 47, 105, 110, 112, 117, 116, 115, 47, 111, 110, 101, 46, 116, 120, 116, 32, 60, 61, 61, 10, 239, 191, 189, 10, 61, 61, 62, 32, 46, 47, 116, 101, 115, 116, 115, 47, 105, 110, 112, 117, 116, 115, 47, 116, 119, 111, 46, 116, 120, 116, 32, 60, 61, 61, 10, 84, 10, 61, 61, 62, 32, 46, 47, 116, 101, 115, 116, 115, 47, 105, 110, 112, 117, 116, 115, 47, 116, 104, 114, 101, 101, 46, 116, 120, 116, 32, 60, 61, 61, 10, 84]

command=`"/Users/carolnichols/rust/hands-on/carols-solutions/headr2/target/debug/headr" "./tests/inputs/empty.txt" "./tests/inputs/one.txt" "./tests/inputs/two.txt" "./tests/inputs/three.txt" "-c" "1"`
code=0
stdout=```"==> ./tests/inputs/empty.txt <==\n\n\n==> ./tests/inputs/one.txt <==\n\n�\n==> ./tests/inputs/two.txt <==\n\nT\n==> ./tests/inputs/three.txt <==\n\nT"```
stderr=```""```
', /Users/carolnichols/.cargo/registry/src/github.com-1ecc6299db9ec823/assert_cmd-1.0.8/src/assert.rs:124:9

The problem is that it prints the expected value as bytes but prints the actual stdout as a string.
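One possible direction, as a rough sketch rather than a tested fix for this repo: keep the comparison on raw bytes, but when it fails, show both sides in the same escaped, lossily decoded form. The helper name `describe_mismatch` is hypothetical, and the "actual" bytes are made up for illustration; the expected bytes are a short run copied from the vector above.

```rust
/// Hypothetical helper: returns `None` when the byte slices match,
/// otherwise a message showing both sides as escaped, lossily decoded text.
fn describe_mismatch(actual: &[u8], expected: &[u8]) -> Option<String> {
    if actual == expected {
        return None;
    }
    Some(format!(
        "expected: {:?}\n  actual: {:?}",
        String::from_utf8_lossy(expected),
        String::from_utf8_lossy(actual),
    ))
}

fn main() {
    // A short run from the expected vector above: "<==\n", U+FFFD, "\n".
    let expected: &[u8] = &[60, 61, 61, 10, 239, 191, 189, 10];
    // Made-up actual output with one extra newline after the header.
    let actual: &[u8] = &[60, 61, 61, 10, 10, 239, 191, 189, 10];

    if let Some(msg) = describe_mismatch(actual, expected) {
        println!("{}", msg);
    }
}
```

The equality check still happens on bytes, so two different invalid sequences that both decode to U+FFFD are not treated as equal; only the failure message is lossy.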

I wanted to send this as a PR rather than an issue, but I haven't found a great solution and wanted to brainstorm some ideas with you...

What do you think?

ysy commented 1 year ago

Would it be more appropriate to compare by bytes?

let expected = String::from_utf8_lossy(&buffer);

will convert the single byte read from expected/*.txt into a 3-byte UTF-8 string.
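To make that concrete, here is a small sketch using the first line of one.txt as it appears in the stdout above. The function name `first_bytes_lossy` is made up for illustration and is not from the repo.

```rust
/// Hypothetical helper: take the first `n` bytes of `text` and decode them
/// lossily, the way `String::from_utf8_lossy(&buffer)` treats a partial read.
fn first_bytes_lossy(text: &str, n: usize) -> String {
    let bytes = text.as_bytes();
    let end = n.min(bytes.len());
    String::from_utf8_lossy(&bytes[..end]).into_owned()
}

fn main() {
    let line = "Öne line, four words.";

    // 'Ö' is encoded as two bytes (0xC3 0x96); the first one alone is not valid UTF-8.
    assert!(std::str::from_utf8(&line.as_bytes()[..1]).is_err());

    // Lossy decoding replaces the incomplete sequence with U+FFFD,
    // which is three bytes long: 239, 191, 189.
    let lossy = first_bytes_lossy(line, 1);
    assert_eq!(lossy, "\u{FFFD}");
    assert_eq!(lossy.as_bytes(), &[239u8, 191, 189][..]);
}
```

Those three bytes are the `239, 191, 189` run that shows up in the expected vector in the failure output.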

As I understand it, head -c1 should only read one byte from the input file, without doing any UTF-8 conversion. Am I wrong? Or does it do some special charset handling if its stdin is connected to a terminal?

The implementation uses file.lines() to get data, which converts bytes into a UTF-8 String. If I understand correctly, it would be better to use read(buf: &mut [u8]) to read bytes directly from the file when -c is specified.
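A minimal sketch of that idea, assuming any `Read` source. It swaps in `Read::take` plus `read_to_end` rather than a single `read` call, since one `read` may return fewer bytes than requested. The function name `read_first_bytes` is hypothetical and not taken from the book's code.

```rust
use std::io::{self, Read, Write};

/// Hypothetical helper: read at most `n` raw bytes from `reader`,
/// with no UTF-8 decoding at any point.
fn read_first_bytes<R: Read>(reader: R, n: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    reader.take(n).read_to_end(&mut buf)?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    // A byte slice stands in for an opened file here; `&[u8]` implements `Read`.
    let input = "Öne line, four words.\n".as_bytes();
    let bytes = read_first_bytes(input, 1)?;

    // Only the lead byte of 'Ö' comes back, unchanged.
    assert_eq!(bytes, vec![0xC3u8]);

    // Write the raw bytes to stdout without converting them to a String.
    io::stdout().write_all(&bytes)?;
    Ok(())
}
```

Written this way, the program emits the single original byte instead of the three-byte replacement character, so the expected output would have to be stored and compared as raw bytes too.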

kyclark commented 9 months ago

I think I finally have a good solution in using "pretty_assertions::assert_eq" instead of assert_cmd's "stdout" comparison. It's a tiny bit more work, but the output is much more helpful. I'm using it on the "clap_v4*" branches.