zesterer / chumsky

Write expressive, high-performance parsers with ease.
https://crates.io/crates/chumsky
MIT License

Labelling has no effect on `one_of()`? #694

Open cAttte opened 4 weeks ago

cAttte commented 4 weeks ago

hey! i was trying to apply a label to a one_of(Range) parser but it didn't work, so i decided to try a few other cases:

use chumsky::prelude::*;

type E = extra::Err<Rich<'static, char>>;

fn main() {
    // every parser is labelled "abc" and run on input it has to reject
    println!("one_of(range): {:?}", one_of::<_, _, E>('a'..='c').labelled("abc").parse("x"));
    println!("one_of(str): {:?}", one_of::<_, _, E>("abc").labelled("abc").parse("x"));
    println!("just().or(): {:?}", just::<_, _, E>('a').or(just('b').or(just('c'))).labelled("abc").parse("x"));
    println!("choice(just()): {:?}", choice((just::<_, _, E>('a'), just('b'), just('c'))).labelled("abc").parse("x"));
    println!("ident(): {:?}", text::unicode::ident::<_, _, E>().labelled("abc").parse("!"));
    println!("none_of(): {:?}", none_of::<_, _, E>("def").labelled("abc").parse("e"));
}

one_of(range): ParseResult { output: None, errs: [found 'x' at 0..1 expected ''a'', ''b'', or ''c''] } // bad
one_of(str): ParseResult { output: None, errs: [found 'x' at 0..1 expected ''a'', ''b'', or ''c''] } // bad
just().or(): ParseResult { output: None, errs: [found 'x' at 0..1 expected "abc"] }
choice(just()): ParseResult { output: None, errs: [found 'x' at 0..1 expected "abc"] }
ident(): ParseResult { output: None, errs: [found '!' at 0..1 expected "abc"] }
none_of(): ParseResult { output: None, errs: [found 'e' at 0..1 expected something else] }

as you can see the label does not show up in cases where one_of() is used. let me know if i'm doing something wrong!

zesterer commented 4 weeks ago

Is this using the latest main? If so, I have an inkling...

cAttte commented 4 weeks ago

hey, nope this is on alpha.7. i've tried main but i get a weird error, i can keep trying if you think it'd help!

zesterer commented 4 weeks ago

What's the error, out of interest?

I think I know what's causing this, but resolving it in the general case is... a little finicky, I think.

cAttte commented 4 weeks ago

ha, i was trying to build with the nightly feature on stable, so that's what the error was.

sounds interesting, if you could explain it a little bit that would be cool, maybe someone can help

zesterer commented 4 weeks ago

So when a parser succeeds, it leaves the input pointing at the next token after whatever it parsed. But when a parser fails, the state it leaves the input in is unspecified (as in: that's how chumsky behaves right now).

The problem is that `labelled` depends on the input being left in a specific place, since that is how it determines whether to treat the label as a pattern (i.e. "expected foo") or as context (i.e. "error occurred while parsing foo").

The reason this works with `just` and not `one_of` is that the former rewinds the input on failure, while the latter does not. Fixing this case is as simple as making the two consistent, but to really fix it chumsky probably needs to take a clearer stance on what state the input should be left in on failure, which also means building that stance into the extension API and so on.
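
To make the mechanism described above a bit more concrete, here is a toy model in plain Rust. It is not chumsky's code: `Input`, `Failure`, `just_rewinding`, `one_of_no_rewind` and the free function `labelled` are all hypothetical names, and the rule "cursor unchanged means pattern, cursor moved means context" is an assumption drawn only from the explanation in this comment.

```rust
// Toy model only: none of these types or functions are chumsky's real internals.

#[derive(Debug, PartialEq)]
enum Failure {
    /// "found X, expected <what>"
    Expected { at: usize, what: String },
    /// The inner error is kept as-is and the label is only attached as context.
    InContext { at: usize, what: String, context: String },
}

struct Input {
    chars: Vec<char>,
    pos: usize,
}

impl Input {
    fn new(src: &str) -> Self {
        Input { chars: src.chars().collect(), pos: 0 }
    }

    /// Consume one character and advance the cursor.
    fn bump(&mut self) -> Option<char> {
        let c = self.chars.get(self.pos).copied()?;
        self.pos += 1;
        Some(c)
    }
}

/// Matches one exact character and rewinds the cursor on failure.
fn just_rewinding(input: &mut Input, want: char) -> Result<char, Failure> {
    let start = input.pos;
    match input.bump() {
        Some(c) if c == want => Ok(c),
        _ => {
            input.pos = start; // put the cursor back where we found it
            Err(Failure::Expected { at: start, what: format!("{want:?}") })
        }
    }
}

/// Matches any character in `set`, but leaves the cursor wherever it ended up on failure.
fn one_of_no_rewind(input: &mut Input, set: &str) -> Result<char, Failure> {
    let start = input.pos;
    match input.bump() {
        Some(c) if set.contains(c) => Ok(c),
        _ => Err(Failure::Expected { at: start, what: format!("one of {set:?}") }),
    }
}

/// Assumed rule: if the cursor did not move, the label replaces the expected
/// pattern; if it moved, the label is attached as context instead.
fn labelled<T>(
    input: &mut Input,
    label: &str,
    parser: impl FnOnce(&mut Input) -> Result<T, Failure>,
) -> Result<T, Failure> {
    let start = input.pos;
    let result = parser(input);
    let end = input.pos;
    result.map_err(|err| match err {
        Failure::Expected { at, .. } if end == start => {
            Failure::Expected { at, what: label.to_string() }
        }
        Failure::Expected { at, what } => {
            Failure::InContext { at, what, context: label.to_string() }
        }
        other => other,
    })
}

fn main() {
    let rewound = labelled(&mut Input::new("x"), "abc", |i| just_rewinding(i, 'a'));
    println!("rewinding primitive:     {rewound:?}");

    let stuck = labelled(&mut Input::new("x"), "abc", |i| one_of_no_rewind(i, "abc"));
    println!("non-rewinding primitive: {stuck:?}");
}
```

In this model the rewinding primitive ends up with the label as its expected pattern, while the non-rewinding one keeps its own expected set and only gains the label as context. That has the same shape as the `// bad` lines in the original report, though the real cause inside chumsky may differ in detail.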

I can give more information tomorrow.