Open stefnotch opened 11 months ago
I did try `map_with_span`, but that gives me a `PrattOpOutput`. I'm not sure if I can do anything with it.
On that note, why is `type InfixBuilder<E> = fn(lhs: E, rhs: E) -> E;` a `fn`? Most other chumsky APIs accept a `Fn`, which makes some tasks much easier.
Edit: I see, it's because getting a `Fn` to work in that case is tricky: https://github.com/zesterer/chumsky/pull/464#issuecomment-1612910421
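To illustrate the `fn` versus `Fn` difference in isolation, here is a minimal sketch in plain Rust that does not use chumsky. The `Expr` enum, `apply_generic`, `build_add`, and `build_scaled` are hypothetical names made up for this example; only the shape of `InfixBuilder` is taken from the alias quoted above.

```rust
#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
    Scaled(i64, Box<Expr>, Box<Expr>),
}

// Same shape as the alias quoted above: a plain function pointer.
type InfixBuilder<E> = fn(lhs: E, rhs: E) -> E;

// Hypothetical alternative that accepts any closure, capturing or not.
fn apply_generic<E, F: Fn(E, E) -> E>(build: F, lhs: E, rhs: E) -> E {
    build(lhs, rhs)
}

fn build_add() -> Expr {
    // A non-capturing closure coerces to a `fn` pointer.
    let add: InfixBuilder<Expr> = |l, r| Expr::Add(Box::new(l), Box::new(r));
    add(Expr::Num(1), Expr::Num(2))
}

fn build_scaled(factor: i64) -> Expr {
    // This closure captures `factor`, so it cannot be stored in an
    // `InfixBuilder<Expr>`; it only works through the `Fn` bound.
    let scaled = move |l: Expr, r: Expr| Expr::Scaled(factor, Box::new(l), Box::new(r));
    apply_generic(scaled, Expr::Num(1), Expr::Num(2))
}
```

The practical consequence is that a `fn`-typed builder cannot carry extra state (a source id, an interner, and so on) into the callback, which is why the `Fn` form makes some tasks easier.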
Would it be possible to extend the InfixBuilder to also have access to the infix token parser result? Same for prefix and postfix parsers. Then it would at least be possible to add some info to the token parser, and pass it into the infix builder.
This is my attempt.
https://github.com/zesterer/chumsky/compare/main...stefnotch:pratt-with-op?expand=1
I hope it's roughly the right direction, but one thing that's missing is the ability to write `build` functions that take an operator that has a different type. For example, here the build function takes a `char` as the operator, and two `Expr`s as the children. Supporting this probably requires passing another generic type through all the functions.
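As a rough sketch of what "another generic type" could look like, the following standalone Rust adds an `Op` parameter next to the expression type. This is not the code from the linked branch or from chumsky; the two-parameter `InfixBuilder<Op, E>`, the `Infix` struct, `fold`, and `char_infix` are all assumptions made for illustration.

```rust
#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Binary(char, Box<Expr>, Box<Expr>),
}

// Hypothetical alias: `Op` is the output of the operator parser and is
// independent of the expression type `E`.
type InfixBuilder<Op, E> = fn(op: Op, lhs: E, rhs: E) -> E;

// Hypothetical operator description that stores only the builder.
struct Infix<Op, E> {
    build: InfixBuilder<Op, E>,
}

impl<Op, E> Infix<Op, E> {
    // Combine an already parsed operator with its two operands.
    fn fold(&self, op: Op, lhs: E, rhs: E) -> E {
        (self.build)(op, lhs, rhs)
    }
}

// The operator is a `char`, the children are `Expr`s.
fn char_infix() -> Infix<char, Expr> {
    Infix {
        build: |op, lhs, rhs| Expr::Binary(op, Box::new(lhs), Box::new(rhs)),
    }
}
```

The cost is visible even in this small form: every type that stores or forwards the builder has to mention `Op` as well as `E`.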
I’ll give it a look today to see what I can do to give you the op spanned. A quick and dirty way would be to use the spans of the expressions to calculate the span of the operator, since the operator’s span is the space between the two exprs (though whitespace makes that not 100% true).
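The quick and dirty approach can be sketched as follows, assuming spans are byte ranges into the source string and that only whitespace surrounds the operator. `op_span_between` is a hypothetical helper, not part of chumsky, and it only covers the infix case.

```rust
use std::ops::Range;

/// Returns the byte range of the operator between two operands, or `None`
/// if the operands are not in order or the gap contains only whitespace.
fn op_span_between(src: &str, lhs: &Range<usize>, rhs: &Range<usize>) -> Option<Range<usize>> {
    // Everything between the end of the left operand and the start of the
    // right operand: the operator plus any surrounding whitespace.
    let gap = src.get(lhs.end..rhs.start)?;
    let without_leading = gap.trim_start();
    let op_text = without_leading.trim_end();
    if op_text.is_empty() {
        return None;
    }
    // Skip the leading whitespace to find where the operator starts.
    let start = lhs.end + (gap.len() - without_leading.len());
    Some(start..start + op_text.len())
}
```

For the input `1  +  2` with operands at `0..1` and `6..7`, this yields `3..4`. Comments or other trivia in the gap would still end up inside the result, so this stays an approximation.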
@Zij-IT
I think there's a certain value in having access to the entire parsed operator. For example, sufficiently complex mathematical expressions may use Knuth's up-arrow notation (https://en.m.wikipedia.org/wiki/Knuth%27s_up-arrow_notation), which has an operator with an exponent. That would be most easily parsed by writing an "up arrow followed by exponent" parser for the operator.
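A small hand-written sketch of such an operator, assuming a made-up concrete syntax in which `↑` may be followed by decimal digits giving the number of arrows (`↑3` for three arrows, a bare `↑` for one). `UpArrow`, `parse_up_arrow`, `Expr::Hyper`, and `build_hyper` are hypothetical names; the point is only that the builder needs the parsed operator value, not just the operands.

```rust
#[derive(Debug, PartialEq)]
struct UpArrow {
    count: u32,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Num(u64),
    Hyper { arrows: u32, base: Box<Expr>, exponent: Box<Expr> },
}

/// Parses `↑` optionally followed by decimal digits.
/// Returns the operator and the remaining input.
fn parse_up_arrow(input: &str) -> Option<(UpArrow, &str)> {
    let rest = input.strip_prefix('↑')?;
    // ASCII digits are one byte each, so this length is a valid split point.
    let digits_len = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
    let (digits, rest) = rest.split_at(digits_len);
    let count = if digits.is_empty() { 1 } else { digits.parse().ok()? };
    Some((UpArrow { count }, rest))
}

// The builder receives the operator, so the arrow count reaches the tree.
fn build_hyper(op: UpArrow, lhs: Expr, rhs: Expr) -> Expr {
    Expr::Hyper {
        arrows: op.count,
        base: Box::new(lhs),
        exponent: Box::new(rhs),
    }
}
```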
Excellent point. I’ll give it a shot tonight and see what I can do!
So, here are some API questions that I have. How does this look?
```rust
let parser = atom
    .pratt(choice((
        left_infix(just('-'), 0, |lhs, op, rhs| Expr::Sub(Box::new(lhs), Box::new(rhs))),
        left_infix(just('+'), 0, |lhs, op, rhs| Expr::Add(Box::new(lhs), Box::new(rhs))),
    )))
    .with_postfix_ops(
        // For postfix the `op` is on the right
        postfix(just('!'), 0, |lhs, op| Expr::Factorial(Box::new(lhs))),
    )
    .with_prefix_ops(
        // For prefix the `op` is on the left
        prefix(just('-'), 0, |op, rhs| Expr::Negate(Box::new(rhs))),
    )
    .map(|x| x.to_string());
```
Or do we follow `str::split_once` and `str::rsplit_once` and keep the `op` placement consistent, so that both use `fn(Op, Expr) -> Expr`?
That looks like a good API. I'd keep the `|lhs, op|` design, if the infix operator's callback is `|lhs, op, rhs|`.
If it's `|op, lhs|`, then I'd expect the infix operator's callback to also look roughly like that. That's actually what I did above, mostly because I wanted to have an easy time implementing it. There I have an API that looks like `left_infix(just('-'), 0, |op, [lhs, rhs]| Expr::Sub(Box::new(lhs), Box::new(rhs)))`. That made using the current Mode API very easy.
https://github.com/zesterer/chumsky/compare/main...stefnotch:pratt-with-op?expand=1#diff-350f258948b06701ad68bdfb8528d5fdc35c29e5d6a4bdbd0c7673eea00a7e3fR365-R366
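One way to read the `|op, [lhs, rhs]|` shape is that prefix, postfix, and infix builders can then share a single signature that differs only in the array length. The sketch below shows that idea with const generics in plain Rust. It is an assumption about the design, not the code in the linked diff, and `Builder`, `apply`, `negate`, and `subtract` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Neg(Box<Expr>),
    Sub(Box<Expr>, Box<Expr>),
}

// Hypothetical: one builder shape for every arity; `N` is the operand count.
type Builder<Op, E, const N: usize> = fn(Op, [E; N]) -> E;

// A single code path can call unary and binary builders alike.
fn apply<Op, E, const N: usize>(build: Builder<Op, E, N>, op: Op, operands: [E; N]) -> E {
    build(op, operands)
}

fn negate() -> Expr {
    // Unary operator: a one-element array.
    let build: Builder<char, Expr, 1> = |_op, [rhs]| Expr::Neg(Box::new(rhs));
    apply(build, '-', [Expr::Num(5)])
}

fn subtract() -> Expr {
    // Binary operator: a two-element array, destructured in the closure.
    let build: Builder<char, Expr, 2> = |_op, [lhs, rhs]| Expr::Sub(Box::new(lhs), Box::new(rhs));
    apply(build, '-', [Expr::Num(5), Expr::Num(2)])
}
```

The trade-off is readability at the call site: `|lhs, op, rhs|` mirrors the source order, while the array form keeps the operator in the same position for every kind of operator.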
Did you have any luck with a proper implementation so far, or are some parts trickier than expected? Is there anything I could do to help?
In my little experiments with extending the Pratt parsing implementation, I never managed to figure out what exactly `Expr` in `impl<'a, P, Expr, I, O, E> ParserSealed<'a, I, PrattOpOutput<PrefixBuilder<Expr>>, E>` was used for, which sadly limited my understanding of that bit of code.
When using the new pratt parsing feature, how would I create a syntax tree which includes spans?
For example, how does one get spans that include the operators, and how does one get the spans of the operators themselves? https://github.com/zesterer/chumsky/blob/01b96cd643e4b47b36ee76baa57294d2153965f7/src/pratt.rs#L709-L719
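To make the question concrete, this is a sketch of the tree shape being asked for, written in plain Rust without chumsky. It assumes the builder is handed the operator together with its span, which is exactly the capability requested in this issue; `Span`, `Spanned`, and `build_binary` are hypothetical names.

```rust
use std::ops::Range;

type Span = Range<usize>;

#[derive(Debug, PartialEq)]
struct Spanned<T> {
    node: T,
    span: Span,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Binary {
        op: Spanned<char>,
        lhs: Box<Spanned<Expr>>,
        rhs: Box<Spanned<Expr>>,
    },
}

// The whole expression runs from the start of the left operand to the end
// of the right operand, so its span includes the operator. The operator
// keeps its own span inside the node.
fn build_binary(op: Spanned<char>, lhs: Spanned<Expr>, rhs: Spanned<Expr>) -> Spanned<Expr> {
    let span = lhs.span.start..rhs.span.end;
    Spanned {
        node: Expr::Binary { op, lhs: Box::new(lhs), rhs: Box::new(rhs) },
        span,
    }
}
```

For `1 + 2`, with the operands at `0..1` and `4..5` and the operator at `2..3`, the resulting node has span `0..5` and still records `2..3` for the `+`.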