seanyoung / lrpeg

Left Recursive PEG for rust
MIT License

Idea: Support `node_tag` and `branch_tag` #5

Closed (oovm closed this issue 2 years ago)

oovm commented 2 years ago

Add two fields to identify nodes

#[derive(Clone, Debug)]
pub struct Node<'i> {
    pub rule: Rule,
    pub start: usize,
    pub end: usize,
    pub node_tag: Option<&'i str>,   // The lifetime is the same as the input text
    pub branch_tag: Option<&'i str>, // The lifetime is the same as the input text
    pub children: Vec<Node<'i>>,
    pub alternative: Option<u16>,
}

Basically like this:

expr <-
    "(" expr ")"            #Priority
  / lhs=expr "*" rhs=expr   #Mul
  / lhs=expr "/" rhs=expr   #Div
  / lhs=expr "+" rhs=expr   #Add
  / lhs=expr "-" rhs=expr   #Sub
  / num                     #Atom
  ;
num <- re#[0-9]+#;

PEG.js has this feature
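
To illustrate how the two tags might be consumed, here is a minimal sketch of an evaluator that dispatches on `branch_tag` and looks up operands by `node_tag`. The `Node` below is a trimmed-down, hypothetical copy of the proposed struct (no `rule` or `alternative`), and `atom`, `binary`, and `eval` are illustrative helpers, not lrpeg API. It assumes string literals such as "(" produce no child nodes.

```rust
/// Trimmed-down, hypothetical copy of the proposed node.
#[derive(Clone, Debug)]
pub struct Node<'i> {
    pub start: usize,
    pub end: usize,
    pub node_tag: Option<&'i str>,
    pub branch_tag: Option<&'i str>,
    pub children: Vec<Node<'i>>,
}

/// Hypothetical helper: a leaf produced by the `#Atom` branch.
pub fn atom<'i>(start: usize, end: usize) -> Node<'i> {
    Node { start, end, node_tag: None, branch_tag: Some("Atom"), children: vec![] }
}

/// Hypothetical helper: a binary branch whose operands are tagged `lhs` and `rhs`.
pub fn binary<'i>(branch: &'i str, mut lhs: Node<'i>, mut rhs: Node<'i>) -> Node<'i> {
    lhs.node_tag = Some("lhs");
    rhs.node_tag = Some("rhs");
    Node {
        start: lhs.start,
        end: rhs.end,
        node_tag: None,
        branch_tag: Some(branch),
        children: vec![lhs, rhs],
    }
}

/// Evaluates a tree by matching on the branch tag instead of an alternative index.
/// Returns `None` on an unknown tag, a missing operand, overflow, or division by zero.
pub fn eval(node: &Node<'_>, input: &str) -> Option<i64> {
    // Find the child carrying the given node tag and evaluate it.
    let operand = |tag: &str| {
        node.children
            .iter()
            .find(|child| child.node_tag == Some(tag))
            .and_then(|child| eval(child, input))
    };
    match node.branch_tag? {
        "Priority" => eval(node.children.first()?, input),
        "Mul" => operand("lhs")?.checked_mul(operand("rhs")?),
        "Div" => operand("lhs")?.checked_div(operand("rhs")?),
        "Add" => operand("lhs")?.checked_add(operand("rhs")?),
        "Sub" => operand("lhs")?.checked_sub(operand("rhs")?),
        "Atom" => input.get(node.start..node.end)?.parse().ok(),
        _ => None,
    }
}
```

With tags, reordering the alternatives in the grammar would not silently change which arm of the `match` handles which branch.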

seanyoung commented 2 years ago

This is a great idea, thank you!

So I get that you wish to label a node. What is the branch_tag for?

I think the lifetime can be 'static, because we can simply use a constant &'static str for this.

oovm commented 2 years ago

I have an actual usage example here

Consider the grammar

expr <-
     "(" expr ")"     #Priority
    / expr "<-" expr  #Mark
    // ...others

I marked the branch_tag here

https://github.com/ygg-lang/yggdrasil-rs/blob/82cfeb8db1c96d42d4d006e7d19ca010f77942c8/projects/ygg-bootstrap/src/cst/parse.rs#L187-L198

A macro is used here, and it looks like this after expansion:

#[inline]
pub fn expr(s: RuleState) -> RuleResult {
    let s = match s.rule(Rule::BRANCH, self::__aux_expr_priority) {
        Ok(o) => return o.tag_branch("Priority"),
        Err(e) => e,
    };
    let s = match s.rule(Rule::BRANCH, self::__aux_expr_mark) {
        Ok(o) => return o.tag_branch("Mark"),
        Err(e) => e,
    };
    // ...others
    return Err(s);
}
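
For readers without the yggdrasil types at hand, the same ordered-choice pattern can be sketched in a self-contained form. Everything here (`State`, `Matched`, `RuleResult`, `literal`, `keyword`) is a hypothetical stand-in, not the real `RuleState` API; only the control flow mirrors the expansion above: the first alternative that matches returns with its tag, and a failure hands the untouched state to the next alternative.

```rust
/// Hypothetical parser state: the input and the current byte offset.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct State<'i> {
    pub input: &'i str,
    pub pos: usize,
}

/// Hypothetical successful match, optionally tagged with the branch that produced it.
#[derive(Debug, PartialEq)]
pub struct Matched<'i> {
    pub state: State<'i>,
    pub branch_tag: Option<&'static str>,
}

/// On failure the unchanged state is handed back so the next alternative can try.
pub type RuleResult<'i> = Result<Matched<'i>, State<'i>>;

impl<'i> Matched<'i> {
    /// Records which alternative matched.
    pub fn tag_branch(mut self, tag: &'static str) -> RuleResult<'i> {
        self.branch_tag = Some(tag);
        Ok(self)
    }
}

/// Matches a literal at the current position.
fn literal<'i>(s: State<'i>, text: &str) -> RuleResult<'i> {
    if s.input[s.pos..].starts_with(text) {
        Ok(Matched { state: State { pos: s.pos + text.len(), ..s }, branch_tag: None })
    } else {
        Err(s)
    }
}

/// Ordered choice: `"true" #True / "false" #False`.
pub fn keyword<'i>(s: State<'i>) -> RuleResult<'i> {
    let s = match literal(s, "true") {
        Ok(o) => return o.tag_branch("True"),
        Err(e) => e,
    };
    let s = match literal(s, "false") {
        Ok(o) => return o.tag_branch("False"),
        Err(e) => e,
    };
    Err(s)
}
```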

Finally, the branch_tag is handled here:

https://github.com/ygg-lang/yggdrasil-rs/blob/82cfeb8db1c96d42d4d006e7d19ca010f77942c8/projects/ygg-bootstrap/src/ast/parse/mod.rs#L62-L91

oovm commented 2 years ago

You are right, it should be Option<&'static str>
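
A small sketch of why 'static is convenient here: since the tags name grammar labels rather than slices of the input, a node that stores only offsets and 'static tags needs no lifetime parameter and can outlive the input it was parsed from. The `Node`, `parse_number`, and `parse_temporary` below are hypothetical and much simpler than anything lrpeg generates.

```rust
/// Hypothetical node: offsets plus a tag that comes from the grammar, not the input.
#[derive(Clone, Debug, PartialEq)]
pub struct Node {
    pub start: usize,
    pub end: usize,
    pub branch_tag: Option<&'static str>,
}

/// Tag names can live in constants; an elided lifetime in a `const` is `'static`.
const ATOM: &str = "Atom";

/// Matches a run of leading ASCII digits, standing in for the `num` rule.
pub fn parse_number(input: &str) -> Option<Node> {
    let end = input.bytes().take_while(u8::is_ascii_digit).count();
    (end > 0).then(|| Node { start: 0, end, branch_tag: Some(ATOM) })
}

/// The input is dropped when this function returns, yet the node stays valid
/// because it borrows nothing from the input.
pub fn parse_temporary() -> Option<Node> {
    let input = String::from("42abc");
    parse_number(&input)
}
```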

seanyoung commented 2 years ago

:+1: Right, that is pretty nice.

I was thinking the start/end in the node could be replaced with a `&str` (with the same lifetime as the input string). I suspect this would use the same amount of memory, but with a much better developer experience.
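
The memory claim can be checked with a quick sketch: a `&str` is a pointer plus a length, which is the same two machine words as a pair of `usize` offsets. `OffsetNode`, `SliceNode`, `sizes`, and `offset_of` are hypothetical names for illustration. The last function shows that the start offset is still recoverable from the slice, assuming the slice really was taken from `input`.

```rust
use std::mem::size_of;

/// Current shape: two offsets into the input.
pub struct OffsetNode {
    pub start: usize,
    pub end: usize,
}

/// Suggested shape: the matched text itself, borrowed from the input.
pub struct SliceNode<'i> {
    pub text: &'i str,
}

/// Sizes in bytes of the two shapes, in that order.
pub fn sizes() -> (usize, usize) {
    (size_of::<OffsetNode>(), size_of::<SliceNode<'static>>())
}

/// Recovers the start offset of `slice` within `input`.
/// Only meaningful when `slice` was borrowed from `input`.
pub fn offset_of(input: &str, slice: &str) -> usize {
    slice.as_ptr() as usize - input.as_ptr() as usize
}
```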

seanyoung commented 2 years ago

I've given this some thought. I think using special comments is not a great way of doing this. Also, we want to be able to mark expressions as "create nodes for this". With that in mind I've taken some inspiration from lalrpop and I've come up with this syntax:

start <- (<foo> / bar) EOI;

foo <-
     add:/ <left:foo> "+" <right:num>
     sub:/ <left:foo> "-" <right:num>
     num;

bar <- "NO";

num <- re#\d+#;

So the idea is:

seanyoung commented 2 years ago

I've pushed the changes that add labels for nodes (`< .. >`) and labels for alternatives (`:/`).

At the moment, all nodes are still being created; creating only the labelled ones is the next step.