Switch to Chumsky for parsing

clarkmcc commented 1 year ago

Working on a chumsky parser. Has some benefits:

Easier to read and modify than LALRPOP grammar
Much better error reporting and syntax assistance

hardbyte commented 1 year ago

👀 I've been doing something similar while learning Rust in https://github.com/hardbyte/common-expression-language if you want to collaborate/steal.

clarkmcc commented 1 year ago

@hardbyte Ooo very nice! I hit a wall on the chumsky parser and never got around to it again. I will definitely be doing a bit of stealing, or if you're interested, push a PR!

hardbyte commented 1 year ago

Cool I'm very keen to combine efforts. Keep in mind I'm new at rust though!

First I had a go at getting your branch going - I wired up your chumsky parser to your CLI but evaluating an expression seemed to quickly take up all my ram.

Then I noticed that you were using the pre-release of Chumsky 1.0.0 so I thought a good first step would be porting my parser to the same version so I can hopefully copy some of the implementation across. There were a few api changes between 0.9.2 and 1.0.0-alpha.4 and I'd implemented my parser on char rather than &str.

I've updated my code with the glaring exception that my .collect() calls are no longer compiling... I've pushed what I have to https://github.com/hardbyte/common-expression-language/pull/5 if you felt like taking a look.

Another option might be to use the latest stable version of Chumsky with my parser (adapted for your AST).

clarkmcc commented 1 year ago

That was the wall I hit, I think there's a left fold somewhere in the code for recursive expressions that shouldn't be used. I briefly discussed it with the chumsky author but I didn't get too far.

Yeah that final option seems like a good route to take. How much of the parsing is currently supported right now? I think I noticed at least one to-do item last I checked?

One other thing, were you able to correctly handle order of operations in chumsky? That was one other thing I couldn't get quite right.

hardbyte commented 1 year ago

Hmm hopefully together we can solve it - ~I think I've run into the same issue with an infinity recursive definition now~. I've started porting across to your AST in https://github.com/hardbyte/common-expression-language/pull/6 and think I've gotten over the infinite bug.

Most of the grammar is supported in my parser - haven't done ~in relation~, FieldInits, ~conditionals~ or ternaries. I'm keen to try out the pratt parser that will be in the next official release of chumsky but figured a basic version that follows the grammar would be a good starting point.

clarkmcc commented 1 year ago

@hardbyte fantastic! I want to take a look at this but am a little preoccupied with https://github.com/clarkmcc/cel-rust/pull/11. Will get back to this as soon as I can.

hardbyte commented 1 year ago

Fair enough!

I'll open an issue in this repo to track the next steps as it might not make sense to work from this branch to integrate.

hardbyte commented 1 year ago

Oh, did you intend to disable issues in this repo?

In any case, I've merged in those changes to my main branch and finished everything in the parser except FieldInits. I think the next steps are:

a small refactor PR in this repository to hide larlpop internals and expose a parser function and parser errors from your cel-parser crate?
a bigger PR to replace this one with the work I've done in my repo

How does that sound?

clarkmcc commented 1 year ago

Huh, weird not sure how issues got disabled. Sorry about that! I've re-enabled them. Yes, this makes good sense to me. At some point I'd also like to benchmark LALRPOP vs Chumsky. If the performance differences are notable enough, we might keep both parsers available behind feature flags where the advantage to Chumsky is better parsing error messages. But we can cross that bridge once we have parsing results. I can work up a benchmark for the parser in the mean time.

clarkmcc commented 1 year ago

On a totally unrelated note, I read through your readme and found the reference to the cel-go unit tests. I love the idea of testing against those same tests. I will definitely be looking at that in the near future!

clarkmcc commented 1 year ago

Alright, I've just released the latest project I was working on and now have some free cycles to look at this again. Where are we at on this, and what makes the most sense for me to work on?

hardbyte commented 1 year ago

Awesome, I've written up everything in this issue https://github.com/clarkmcc/cel-rust/issues/14

The short version is I think my chumsky based parser is now slightly ahead feature-wise to the existing lalrpop one, although I'm sure there will be corner cases to hunt down. I'll open a PR to replace this one - adding support for the chumsky parser rather than switching it out entirely. Once I can run the tests I'll post an update with any differences found.

Do you want to take on the benchmarking and/or testing with the cel-go data?

clarkmcc commented 1 year ago

Looking forward to reviewing! Yes, I can definitely take benchmark and tests.

clarkmcc / cel-rust

Switch to Chumsky for parsing #2