mozilla / mentat

UNMAINTAINED A persistent, relational store inspired by Datomic and DataScript.
https://mozilla.github.io/mentat/
Apache License 2.0

Misleading error parsing Unicode characters in EDN #433

Open bgrins opened 7 years ago

bgrins commented 7 years ago

I'm not sure exactly what the best way to surface this is, but following the error messages led me down a long path. The messages make it appear that a particular character is the problem, when in reality it was a quote that was opened and never closed in a previous operation. For example:

conn.transact(&mut sqlite, r#"[
        {:db/id 17592186049155, :artist/name "W}
        {:db/id 17592186049242, :artist/name "ö"}
]"#).unwrap();

This leads to an error pointing to the ö character, which sent me down a path of removing each 'invalid' character, one by one:

thread 'test_big' panicked at 'called `Result::unwrap()` on an `Err` value: Error(EdnParseError(ParseError { line: 3, column: 47, offset: 97, expected: {"[12]", "\t", "nil", "[0-9]", "\r", "[", "0x", "#f", "[*!_?$%&=<>]", "\"", "true", "{", ";", ".", ",", "+", "...", "[2-9]", "}", "(", "\n", "[a-z]", "-", " ", ":", "[A-Z]", "#{", "0", "false", "[3]"} }), State { next_error: None, backtrace: None })', src/libcore/result.rs:859
note: Run with `RUST_BACKTRACE=1` for a backtrace.

If you finally get to the last instance, i.e.

conn.transact(&mut sqlite, r#"[
        {:db/id 17592186049155, :artist/name "W}
        {:db/id 17592186049242, :artist/name "o"}
]"#).unwrap();

Then we end up with an error pointing to the final bracket:

thread 'test_big' panicked at 'called `Result::unwrap()` on an `Err` value: Error(EdnParseError(ParseError { line: 4, column: 6, offset: 106, expected: {"\\tab", "\"", "\\\"", "[^\"]"} }), State { next_error: None, backtrace: None })', src/libcore/result.rs:859
note: Run with `RUST_BACKTRACE=1` for a backtrace.

If there were some context in the error message pointing to where the unclosed string was opened, it would save a ton of time.
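As a rough illustration of the kind of context that would help, here is a minimal sketch of a pre-check that reports where suspicious string literals start. It is not part of Mentat or rust-peg: `Position` and `suspicious_string_starts` are hypothetical names, and the heuristic (flag any string that spans a newline or is still open at end of input) is an assumption that would misfire on legitimate multi-line strings. It handles only a simplified subset of EDN: `;` comments and backslash escapes.

```rust
/// 1-based line and column, counted in characters rather than bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Position {
    line: usize,
    column: usize,
}

/// Hypothetical helper: returns the position of the opening quote of every
/// string literal that either spans a newline or is never closed.
/// The first entry is usually the best hint for a forgotten closing quote.
fn suspicious_string_starts(input: &str) -> Vec<Position> {
    let mut suspects = Vec::new();
    let mut open_at: Option<Position> = None; // opening quote of the current string
    let mut spans_newline = false;
    let mut escaped = false; // previous character was a backslash
    let mut in_comment = false; // inside a `;` comment, which runs to end of line
    let (mut line, mut column) = (1usize, 1usize);

    for ch in input.chars() {
        if in_comment {
            in_comment = ch != '\n';
        } else if escaped {
            // Skip the character after a backslash (string escape or `\"` literal).
            escaped = false;
        } else if ch == '\\' {
            escaped = true;
        } else if let Some(start) = open_at {
            if ch == '"' {
                if spans_newline {
                    suspects.push(start);
                }
                open_at = None;
            } else if ch == '\n' {
                spans_newline = true;
            }
        } else if ch == '"' {
            open_at = Some(Position { line, column });
            spans_newline = false;
        } else if ch == ';' {
            in_comment = true;
        }

        // Track the position of the next character.
        if ch == '\n' {
            line += 1;
            column = 1;
        } else {
            column += 1;
        }
    }

    // A string still open at end of input is always suspicious.
    if let Some(start) = open_at {
        suspects.push(start);
    }
    suspects
}
```

On input shaped like the first example above, the first reported position is the quote before `W`, since that string runs across the line break until the quote before `ö` closes it. A parse error could then carry a note such as "string opened at line 2 spans multiple lines" alongside the position the parser reports.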

ncalexan commented 7 years ago

This is a general problem with our rust-peg EDN grammar, but I don't see a filed ticket. You can get around this by annotating the grammar: https://github.com/kevinmehall/rust-peg#error-reporting.

ncalexan commented 7 years ago

And there's another issue here -- I expect we're not parsing Unicode correctly (again at the EDN level).