Marwes / combine

A parser combinator library for Rust
https://docs.rs/combine/*/combine/
MIT License
1.29k stars 93 forks source link

Errors include unprintable or awkwardly printed characters. #333

Open epage opened 2 years ago

epage commented 2 years ago

cargo has a test that tries null characters: https://github.com/rust-lang/cargo/pull/10086/files#diff-4980c43ff583070e4cb20d00368e0c15d57a0c0127d7ddc5676e1ee12b899d18R927

it also exposes that a newline is shown for errors: https://github.com/ordian/toml_edit/issues/259

epage commented 2 years ago

It'd be nice if combine did a look up an showed a word or an escape sequence instead of rendering these characters.

I'd be willing to implement this, just want the go ahead first.

epage commented 2 years ago

I assume we'd either do a debug representation of the string or only do this if the string is 1 character long.

Marwes commented 2 years ago

Just using https://doc.rust-lang.org/std/primitive.str.html#method.escape_default and the same method of char might be good when rendering?

epage commented 2 years ago

I think the main question is handling of unicode, whether those should be escaped or not. I think most users can see and understand a unicode character but the escaped form is a bit more mystifying (at least for this ASCII-only person)

It can be a good start though

epage commented 2 years ago

My proposal is we'd escape

I'd personally not escape \ since we are creating something for humans and I think humans would be able to understand the message.

The challenge with any of this is the impl Display for Info is for Info<T: Display, R: Display>. For non byte / char, we could end up munging people's tokens (granted, 99% of custom tokens will probably be unescaped). Though this is making me wonder how we are doing pretty printing of byte tokens which toml_edit is using.

Marwes commented 2 years ago

That list seems fine to me, it can always be tweaked later.