zesterer / chumsky

Write expressive, high-performance parsers with ease.
https://crates.io/crates/chumsky
MIT License

`just(range)` passes the range as output instead of the matched value (1.0.0-alpha.7) #674

Open 0rvar opened 1 month ago

0rvar commented 1 month ago

Repro:

use chumsky::prelude::*;
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum Instr { Name(char) }
fn parser<'a>() -> impl Parser<'a, &'a str, Vec<Instr>, extra::Err<Rich<'a, char>>> {
    let name = just('a'..='z').map(Instr::Name);
    name.padded().repeated().collect()
}

This does not compile due to:

error[E0631]: type mismatch in function arguments
   --> src/lib.rs:9:36
    |
5   |     Name(char),
    |     ---- found signature defined here
...
9   |     let name = just('a'..='z').map(Instr::Name);
    |                                --- ^^^^^^^^^^^ expected due to this
    |                                |
    |                                required by a bound introduced by this call
    |
    = note: expected function signature `fn(RangeInclusive<char>) -> _`
               found function signature `fn(char) -> _`
note: required by a bound in `chumsky::Parser::map`
   --> /Users/orvar/.cargo/registry/src/index.crates.io-6f17d22bba15001f/chumsky-1.0.0-alpha.7/src/lib.rs:519:18
    |
519 |     fn map<U, F: Fn(O) -> U>(self, f: F) -> Map<Self, O, F>
    |                  ^^^^^^^^^^ required by this bound in `Parser::map`
help: consider wrapping the function in a closure
    |
9   |     let name = just('a'..='z').map(|arg0: RangeInclusive<char>| Instr::Name(/* char */));
    |                                    ++++++++++++++++++++++++++++            ++++++++++++

zesterer commented 1 month ago

I now recall why I didn't implement this: `just` accepts sequences, not just individual characters.

I think this is potentially a good reason to split `just` back into `just` and `seq`, as before. That way, `just(0..4)` would accept any single number from 0 to 4 (exclusive), and `seq(0..4)` would accept the sequence of numbers 0, 1, 2, 3.

Splitting them again would allow for the behaviour you're looking for!
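
To make the two proposed behaviours concrete, here is a minimal, std-only sketch over a slice of numbers. It does not use chumsky's API. The function names `just_in_range` and `seq_of_range` are hypothetical stand-ins for the split `just` and `seq` described above, and the real combinators would be generic over input and error types.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the proposed `just(range)`:
/// accepts ONE item that lies inside `range` and returns the matched item
/// together with the remaining input.
fn just_in_range(range: Range<u32>, input: &[u32]) -> Option<(u32, &[u32])> {
    let (first, rest) = input.split_first()?;
    if range.contains(first) {
        Some((*first, rest))
    } else {
        None
    }
}

/// Hypothetical stand-in for the proposed `seq(range)`:
/// accepts every item of `range`, in order, and returns the remaining input.
fn seq_of_range(range: Range<u32>, input: &[u32]) -> Option<&[u32]> {
    let mut rest = input;
    for expected in range {
        let (first, tail) = rest.split_first()?;
        if *first != expected {
            return None;
        }
        rest = tail;
    }
    Some(rest)
}

fn main() {
    // `just`-style: a single item from the range, output is the matched item.
    assert_eq!(just_in_range(0..4, &[2, 9]), Some((2, &[9][..])));
    // 4 is outside the exclusive range 0..4.
    assert_eq!(just_in_range(0..4, &[4]), None);

    // `seq`-style: the whole run 0, 1, 2, 3 must appear in order.
    assert_eq!(seq_of_range(0..4, &[0, 1, 2, 3, 7]), Some(&[7][..]));
    assert_eq!(seq_of_range(0..4, &[2]), None);
}
```

Note that the `just`-style function has to hand back the matched item by value, while the `seq`-style one only compares items against the range.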

zesterer commented 1 month ago

I just made an initial attempt at this, but ran into a problem: it means `just` requires the input to implement `ValueInput`, which isn't always practical and makes `just` difficult to use for token streams and the like. One could feasibly imagine a solution that involves splitting `just` into 3 combinators, like so:

Thoughts welcome on this!