rust-bakery / nom

Rust parser combinator framework
MIT License
9.18k stars, 792 forks

convert_error() assumes VerboseError saved inputs are substrings of the same overall input #1619

Open douglas-raillard-arm opened 1 year ago

douglas-raillard-arm commented 1 year ago

Test case

I'm currently using an input based on &[u8] (coming from a binary file), but I want to display the VerboseError as a string since the parser is actually working on source code snippets. Consequently, I mapped over the VerboseError { errors } member to convert the input saved at each level to a String before handing it to convert_error().

Problem: convert_error() assumes that each saved input in VerboseError { errors } is simply a pointer inside the overall input, but in this case that is not true at all (the overall input got converted to a String, and so did every saved input in the stack). This results in a panic!() because the offset computation, a pointer subtraction, would produce a negative value that usize cannot represent:

   3: <str as nom::traits::Offset>::offset
             at /home/dourai01/.cargo/registry/src/github.com-1ecc6299db9ec823/nom-7.1.2/src/traits.rs:76:5
   4: nom::error::convert_error
             at /home/dourai01/.cargo/registry/src/github.com-1ecc6299db9ec823/nom-7.1.2/src/error.rs:261:18
   5: traceevent::parser::tests::test_parser
             at ./src/parser.rs:286:27
   6: traceevent::cparser::tests::declaration_test::test
             at ./src/cparser.rs:1683:13

Would it be possible to use content-based offset detection rather than assuming all pointers point inside the same object? This could be done as a fallback while the fast path still does pointer arithmetic, since it should be easy to check whether the substring pointer is within the address range of the overall input.
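The fallback idea could look roughly like the sketch below. It is only an illustration of the proposal, not nom's implementation: `checked_offset` is a hypothetical helper name, and the content-based branch assumes that a saved input is usually the remaining tail of the overall input, which may not hold for every parser.

```rust
/// Hypothetical helper (not part of nom): offset of `sub` within `input`.
/// Fast path: pointer arithmetic, used only when `sub` really lies inside
/// the memory of `input`. Fallback: content-based lookup.
fn checked_offset(input: &str, sub: &str) -> Option<usize> {
    let start = input.as_ptr() as usize;
    let end = start + input.len();
    let sub_start = sub.as_ptr() as usize;
    let sub_end = sub_start + sub.len();

    if sub_start >= start && sub_end <= end {
        // `sub` points inside `input`, so the subtraction cannot underflow.
        return Some(sub_start - start);
    }

    // Assumption: saved inputs are normally the unparsed tail of the input,
    // so try a suffix match first.
    if input.ends_with(sub) {
        return Some(input.len() - sub.len());
    }

    // Last resort: first occurrence by content. This can be ambiguous if the
    // same text appears several times; `None` means no match at all.
    input.find(sub)
}
```

Returning an `Option` lets the caller decide how to render an error level whose saved input cannot be located, instead of panicking.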

douglas-raillard-arm commented 1 year ago

As a workaround, I've used std::str::from_utf8(buf).unwrap(), but I was initially hoping to use std::string::String::from_utf8_lossy(buf).to_string().
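A likely reason the workaround behaves differently is that std::str::from_utf8() returns a &str borrowing the original buffer, whereas calling .to_string() on the result of from_utf8_lossy() always allocates a new String. The small sketch below illustrates this with standard-library calls only; `points_into` and `compare_conversions` are hypothetical names introduced for the example.

```rust
/// Hypothetical helper: true when `s` lies inside the memory of `buf`.
fn points_into(buf: &[u8], s: &str) -> bool {
    let start = buf.as_ptr() as usize;
    let p = s.as_ptr() as usize;
    p >= start && p + s.len() <= start + buf.len()
}

/// Returns whether each conversion still points into `buf`:
/// (from_utf8 result, from_utf8_lossy(..).to_string() result).
fn compare_conversions(buf: &[u8]) -> (bool, bool) {
    // Borrows `buf`: pointer-based offsets against `buf` remain meaningful.
    let borrowed: &str = std::str::from_utf8(buf).expect("valid UTF-8");
    // Fresh heap allocation: unrelated to the address range of `buf`.
    let owned: String = String::from_utf8_lossy(buf).to_string();
    (points_into(buf, borrowed), points_into(buf, &owned))
}
```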