pest-parser / pest

The Elegant Parser
https://pest.rs
Apache License 2.0
4.63k stars 260 forks source link

How to avoid `left-recursive` in pest-script? #423

Closed UkonnRa closed 2 years ago

UkonnRa commented 5 years ago

Say a piece of code:

#[derive(Debug)]
pub enum Expr {
    Number(i32),
    Op(Box<Expr>, Opcode, Box<Expr>),
}

#[derive(Debug)]
pub enum Opcode {
    And, Or
}

let code = "(123,234);(234,(345,(1,2),23)),789";

I want to parse code as the result:

Op(Op(Op(Number(123), And, Number(234)), Or, Op(Number(234), And, Op(Op(Number(345), And, Op(Number(1), And, Number(2))), And, Number(23)))), And, Number(789))

I can use lalrpop-script like this:

use std::str::FromStr;
use crate::{Expr, Opcode};

grammar;

pub Expr: Box<Expr> = {
    Expr ExprOp Term => Box::new(Expr::Op(<>)),
    Term,
};

Term: Box<Expr> = {
    Num => Box::new(Expr::Number(<>)),
    "(" <Expr> ")"
};

Num: i32 = {
    r"[0-9]+" => i32::from_str(<>).unwrap()
};

ExprOp: Opcode = {
    ";" => Opcode::Or,
    "," => Opcode::And,
};

But the same code written in pest-script:

expr = { (expr ~ operator ~ term) | term }
term = { num | ("(" ~ expr ~ ")") }
operator = { "," | ";" }
num = { ASCII_DIGIT+ }

Will throw an error:

rule expr is left-recursive (expr -> expr); pest::prec_climber might be useful in this case

I have found the all-hand-written calculator in pest/test, so must I written like that if I want to deal with the left-recursive problem? Any better idea?

xacrimon commented 5 years ago

That's a core problem with the parser Pest generates with can only parse LL grammars. A left recursive parser should be Earley or GLR/GLL.

dragostis commented 5 years ago

@xacrimon, that is not very accurate. pest uses parsing expression grammars.

xacrimon commented 5 years ago

Ah, my fault. I am quite new to parsers.