lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License
4.91k stars 416 forks source link

Optimize Grammar for FOL with Equality #1215

Closed vivekjoshy closed 1 year ago

vivekjoshy commented 1 year ago

How do I modify this grammar so it matches parenthesis that are further away?

wff: compound_wff
?compound_wff: quantifier | biconditional | atomic_wff
?quantifier: (quantifier_symbol LEFT_CURLY_BRACE variable RIGHT_CURLY_BRACE SPACE?)* bracketted_wff
?biconditional: conditional SPACE (biconditional_symbol SPACE conditional)*
?conditional: disjunction SPACE (conditional_symbol SPACE disjunction)*
?disjunction: conjunction SPACE (disjunction_symbol SPACE conjunction)*
?conjunction: negation SPACE (conjunction_symbol SPACE negation)*
?negation: negation_symbol* SPACE bracketted_wff
?bracketted_wff: LEFT_PARENTHESIS SPACE? compound_wff SPACE? RIGHT_PARENTHESIS SPACE?
?atomic_wff: predicate_name [LEFT_PARENTHESIS term (COMMA term)* RIGHT_PARENTHESIS]                          -> predicate
          | term equal_to term

term: function_name LEFT_PARENTHESIS term (COMMA SPACE? term)* RIGHT_PARENTHESIS                             -> function
    | name
    | variable

SPACE: /\s+/
COMMA: ","
equal_to: "="
LEFT_PARENTHESIS: "("
RIGHT_PARENTHESIS: ")"
LEFT_CURLY_BRACE: "{"
RIGHT_CURLY_BRACE: "}"
quantifier_symbol: universal_quantifier_symbol | existential_quantifier_symbol
universal_quantifier_symbol: "\\forall"
existential_quantifier_symbol: "\\exists"
name: /[a-t]/ | /[a-t]_[1-9]\d*/
variable: /[u-z]/ | /[u-z]_[1-9]\d*/
predicate_name: /[A-HJ-Z]/ | /[A-HJ-Z]_[1-9]\d*/
function_name: /[a-z]/ | /[a-z]_[1-9]\d*/
negation_symbol: "\\neg"
conjunction_symbol: "\\wedge"
disjunction_symbol: "\\vee"
exclusive_disjunction_symbol: "\\veebar"
conditional_symbol: "\\rightarrow" | "\\Rightarrow" | "\\Longrightarrow" | "\\implies"
biconditional_symbol: "\\leftrightarrow" | "\\iff"

I'm trying to parse this using my grammar: $$\exists{x} \forall{y} (P(f(x, y)) \vee \forall{z}(V(z) \iff \neg R(a) \wedge B(a)))$$ Which in LaTeX is:

\exists{x} \forall{y} (P(f(x, y)) \vee \forall{z}(V(z) \iff \neg R(a) \wedge B(a)))

I followed the calculator example and modified my original grammar to add operator precedence which resulted in this. But it's no longer accepting the input string.

I'm getting this error:

lark.exceptions.UnexpectedCharacters: No terminal matches '\' in the current parser context, at line 1 col 35

\exists{x} \forall{y} (P(f(x, y)) \vee \forall{z}(V(z) \iff \neg R(a) \wed
                                  ^
Expected one of: 
    * RIGHT_PARENTHESIS

Ideally I want to force requiring parenthesis wherever possible except in the case of in front of a negated atomic_wff without parenthesis. This is to make sure only one parse tree is produced even on explicit ambiguity setting. How do I resolve this issue?

This is my first time using lark. Any help appreciated!

vivekjoshy commented 1 year ago

Solved here: https://stackoverflow.com/a/74632957/5586359