Closed andreamazza89 closed 2 years ago
(the library I'm currently using does have an expression parsing module, which probably solves the problem, but it could be fun to try do it myself as a learning exercise
Ended up implementing a Pratt Parser, which was fun, and is working so far.
https://github.com/andreamazza89/jq-exercises/blob/trunk/src/Utils/Parsing.purs#L62
The issue
As I started adding the comma operator (
<exp> , <exp>
), I ran into an issue: operator precedence.My current understanding is that this issue exists when a language has more than one infix operator with different precedence. For example, if we take the expression
because
,
has higher precedence than|
, it should be read like. | (. , .)
rather than(. | .) , .
.The way I have implemented the parser so far makes it unable to deal with this, so I need a solution.
Alternatives
So far, I've found two main avenues to tackle this:
Make the problem go away - parenthesise before parsing
Given that I'll need to support parentheses anyway, I could add a step before parsing in which the source string is parenthesised, which makes the issue go away. I found this page with an algorithm for parenthesising expressions.
Introduce some sort of bottom-up parsing
The Wikipedia page on this topic mentions a few algorithms in this space:
Next steps
All else being equal, I am a big fan of solving problems by making them go away, so I am initially leaning towards the parenthesising solution.
However, I fear there might be drawbacks with this one. Also, it might make for a terrible time debugging/developing with all the extra parentheses.
I am quite interested in the Pratt parser, so: