Closed vorner closed 1 year ago
I don't necessarily have a problem with doing something that allows one to print an error message on a single line, but I don't think I'm willing to prioritize the log case over the better UX in the common case.
Personally, if it were me, I'd probably just split the error message into lines, drop everything but the last line and log that. I'm not sure the regex crate itself needs to do anything for that. I'm possibly open to other ideas.
Well, the problem with that is that:
Box<dyn Error + Send + Sync>
(or one of the other crates for that). And it might actually be buried somewhere in the chain of error cause
s. So that makes special casing for regex a bit difficult.So I was thinking more in the lines of regex::Error::single_line(self) -> Self
method that would „configure“ the error. It doesn't sound exactly right either, though so I hoped someone could come up with a better option.
Personally, if it were me, I'd probably just split the error message into lines, drop everything but the last line and log that. I'm not sure the regex crate itself needs to do anything for that. I'm possibly open to other ideas.
Well the main issues here is that strategy to propagate errors is not useful for downstream to provide proper error messages that makes sense for the users, especially, when you try to throw them in GUI applications. Default formatting is bad, but it's subjective, but that's not how you can solve/improve that.
The main issue is that the information that Error
is carrying can't be used by downstream other than just reading the actual debug impl and trying to parse everything from that String
. Ideally Syntax
error should send 'location' (offset in regex), 'kind of syntax error' (Information of what syntax error we've reached) and 'original regex string'. With those information, you can actually generate accurate error message by yourself quite easily.
So it should look more like.
pub enum Error {
/// A syntax error.
Syntax(String, usize, String),
/// The compiled program exceeded the set size limit.
/// The argument is the size limit imposed.
CompiledTooBig(usize),
/// Hints that destructuring should not be exhaustive.
///
/// This enum may grow additional variants, so this makes sure clients
/// don't count on exhaustive matching. (Otherwise, adding a new variant
/// could break existing code.)
#[doc(hidden)]
__Nonexhaustive,
}
Where the first element in Syntax
error is a source we were using to build the regex, the second is an offset of a syntax error in the first element, and the third is kind of that error, which could be some enum, like SyntaxErrorKind
, but String
is likely enough. Well, the first element might be redundant, but I guess nothing stops from sending it to.
Note, even ripgrep
could benefit from that, so you can print error:
part in red
, so the actual error message will be more noticeable.
~ ➜ rg "hello("
regex parse error:
hello(
^
error: unclosed group
Oh, in addition, if default formatting should be changed, I guess library should print a bit more boring debug formatting. Right now it looks like that.
Syntax(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
regex parse error:
hello(
^
error: unclosed group
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
)
However something like this makes a bit more sense for a library.
Syntax(source: "hello", offset: 5, kind: "unclosed group")
The main issue is that the information that
Error
is carrying can't be used by downstream other than just reading the actual debug impl and trying to parse everything from thatString
. IdeallySyntax
error should send 'location' (offset in regex), 'kind of syntax error' (Information of what syntax error we've reached) and 'original regex string'. With those information, you can actually generate accurate error message by yourself quite easily.
You can do that today using regex-syntax
. Otherwise, this seems like a separate issue from the one described by the OP. I've already said that changing the default formatting is off the table. The entire point of the multi-line error message is that it is the default. If it wasn't the default, it would be almost pointless because folks aren't going to know to opt into it.
If you really want to make an earnest argument against the default formatting, then please open another issue. Otherwise, I don't see a reason to continue down that path.
So I was thinking more in the lines of
regex::Error::single_line(self) -> Self
method that would „configure“ the error. It doesn't sound exactly right either, though so I hoped someone could come up with a better option.
@vorner This sounds like something I might be open to, but to be honest, it seems pretty ham-fisted to me. If you have a requirement that log messages are all one line, then it seems to me like you need to enforce that requirement on your end.
I'm going to close this for now.
I remain open to the idea of exposing some way of emitting single-line error messages. A mutator on the error type itself kind of seems non-ideal. Another possibility is a global flag or perhaps even an option on RegexBuilder
, which appeals more to me personally. If someone wants to make a more flushed out proposal, I'd be happy to revisit this.
Hello
If there's an error in syntax of regular expression, the
Display
of theError
tries to be helpful and outputs the original pattern with an arrow pointing to the place where it happened.However, I feel like being multiline is not something one would expect from error types. From a more practical perspective, this makes these errors mess up log files, where it is a rule of thumb to never write anything multiline to allow easy grepping through it.
I don't have an idea what exactly to do about the situation ‒ I understand some might like the helpful multiline messages, so removing them completely probably is not the best option, but I'd also like to be able to output errors to logs ‒ and do that in generic way, without knowing what exact error it is, or even have it as one of the error cause.