cenotelie / hime

Apache License 2.0
27 stars 4 forks source link

WIP: Add SPPF Visitor #96

Open stevefan1999-personal opened 11 months ago

stevefan1999-personal commented 11 months ago

This PR implements part of the idea expressed here: https://github.com/cenotelie/hime/issues/88#issuecomment-1761391207

I have another obscure idea I can't express clearly (I'm not sure if that's novel or not) that if we want to implement semantic checker to multiple "trees"/"branches" of the SPPF at the same time, at the very least, the data structure state holding the information about the current AST walker must be immutable -- which means data is cloned between each forest transition. Then we will need to use persistent data structure/Copy-On-Write mechanism/transactional memory to manage the state data. While I'm pretty sure functional programming people may have enjoyed writing parsers this way because their way to handle data structures are immutable by design, so this is trivial for all kinds of program, but this idea came back to the time when I was still using rust-peg (kevinmehall/rust-peg#301)

TODO:

  1. Figure out how to emit the divergent visitors on different versions
    1. This can be done using a vector, but I think iterator would be okay too
  2. Figure out how to do tree pruning
    1. In case if semantic checker determined this tree is invalid, so that we can do an early exit for this tree anyway (feels like branch and bound/backtrack to me)
stevefan1999-personal commented 11 months ago

Seems like there is a bit of a trouble on the cloning part. I'm not sure why the visitors are not following...

ASTWalker { test: Vector { root: Leaf(["function_definition"]), bits: 5, length: 1 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement"]), bits: 5, length: 2 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list"]), bits: 5, length: 3 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list"]), bits: 5, length: 4 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list"]), bits: 5, length: 5 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item"]), bits: 5, length: 6 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item", "declaration"]), bits: 5, length: 7 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item"]), bits: 5, length: 6 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item", "statement"]), bits: 5, length: 7 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item", "statement", "expression_statement"]), bits: 5, length: 8 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item", "statement", "expression_statement", "expression"]), bits: 5, length: 9 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item"]), bits: 5, length: 6 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item", "declaration"]), bits: 5, length: 7 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item"]), bits: 5, length: 6 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item", "statement"]), bits: 5, length: 7 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item", "statement", "expression_statement"]), bits: 5, length: 8 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item", "statement", "expression_statement", "expression"]), bits: 5, length: 9 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item"]), bits: 5, length: 6 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item", "statement"]), bits: 5, length: 7 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item", "statement", "jump_statement"]), bits: 5, length: 8 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "block_item_list", "block_item_list", "block_item_list", "block_item", "statement", "jump_statement", "expression"]), bits: 5, length: 9 } }
#[derive(Clone, Debug)]
struct ASTWalker {
    test: VectorSync<&'static str>
}

impl c::AstVisitor for ASTWalker {
    fn on_variable_declaration(&mut self, node: &AstNode) {
        self.test = self.test.push_back("declaration");
        println!("{:?}", self);
    }

    fn on_variable_expression(&mut self, node: &AstNode) {
        self.test = self.test.push_back("expression");
        println!("{:?}", self);
    }

    fn on_variable_block_item(&mut self, node: &AstNode) {
        self.test = self.test.push_back("block_item");
        println!("{:?}", self);
    }

    fn on_variable_compound_statement(&mut self, node: &AstNode) {
        self.test = self.test.push_back("compound_statement");
        println!("{:?}", self);
    }

    fn on_variable_block_item_list(&mut self, node: &AstNode) {
        self.test = self.test.push_back("block_item_list");
        println!("{:?}", self);
    }

    fn on_variable_function_definition(&mut self, node: &AstNode) {
        self.test = self.test.push_back("function_definition");
        println!("{:?}", self);
    }

    fn on_variable_statement(&mut self, node: &AstNode) {
        self.test = self.test.push_back("statement");
        println!("{:?}", self);
    }

    fn on_variable_expression_statement(&mut self, node: &AstNode) {
        self.test = self.test.push_back("expression_statement");
        println!("{:?}", self);
    }

    fn on_variable_jump_statement(&mut self, _node: &AstNode) {
        self.test = self.test.push_back("jump_statement");
        println!("{:?}", self);
    }

}

impl c::SppfVisitor for ASTWalker {
    fn on_variable_declaration(&mut self, node: &SppfNodeVersion) {
        self.test = self.test.push_back("declaration");
        println!("{:?}", self);
    }

    fn on_variable_expression(&mut self, node: &SppfNodeVersion) {
        self.test = self.test.push_back("expression");
        println!("{:?}", self);
    }

    fn on_variable_block_item(&mut self, node: &SppfNodeVersion) {
        self.test = self.test.push_back("block_item");
        println!("{:?}", self);
    }

    fn on_variable_compound_statement(&mut self, node: &SppfNodeVersion) {
        self.test = self.test.push_back("compound_statement");
        println!("{:?}", self);
    }

    fn on_variable_block_item_list(&mut self, node: &SppfNodeVersion) {
        self.test = self.test.push_back("block_item_list");
        println!("{:?}", self);
    }

    fn on_variable_function_definition(&mut self, node: &SppfNodeVersion) {
        self.test = self.test.push_back("function_definition");
        println!("{:?}", self);
    }

    fn on_variable_statement(&mut self, node: &SppfNodeVersion) {
        self.test = self.test.push_back("statement");
        println!("{:?}", self);
    }

    fn on_variable_expression_statement(&mut self, node: &SppfNodeVersion) {
        self.test = self.test.push_back("expression_statement");
        println!("{:?}", self);
    }

    fn on_variable_jump_statement(&mut self, _node: &SppfNodeVersion) {
        self.test = self.test.push_back("jump_statement");
        println!("{:?}", self);
    }

}

fn main() {
    let result = c::parse_str_to_sppf(
        r#"

int main() {
    T * a;
    U * b;
    return a * b;
}
    "#);

    let mut walker: Box<dyn SppfVisitor> = Box::new(ASTWalker { test: Vector::new_sync() });
    c::visit_sppf(&result, &mut walker);
}

I'm using the C grammar here: https://github.com/cenotelie/hime/issues/88#issue-1911081492

I'm also using rpds for VectorSync (a persistent data structure that supports O(1) clone)

stevefan1999-personal commented 11 months ago

Seems like the backlink to the next AST node on divergent version was not present.

I also find it curious that I can get an expected result using the optimized grammar here: https://github.com/cenotelie/hime/issues/88#issuecomment-1759716003

ASTWalker { test: Vector { root: Leaf(["function_definition"]), bits: 5, length: 1 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement"]), bits: 5, length: 2 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "declaration"]), bits: 5, length: 3 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "declaration", "declaration"]), bits: 5, length: 4 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "declaration", "declaration", "statement"]), bits: 5, length: 5 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "declaration", "declaration", "statement", "jump_statement"]), bits: 5, length: 6 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "declaration", "declaration", "statement", "jump_statement", "expression"]), bits: 5, length: 7 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement"]), bits: 5, length: 2 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement"]), bits: 5, length: 3 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement"]), bits: 5, length: 4 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement", "expression"]), bits: 5, length: 5 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement", "expression", "declaration"]), bits: 5, length: 6 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement", "expression", "declaration", "statement"]), bits: 5, length: 7 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement", "expression", "declaration", "statement", "jump_statement"]), bits: 5, length: 8 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement", "expression", "declaration", "statement", "jump_statement", "expression"]), bits: 5, length: 9 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement"]), bits: 5, length: 2 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "declaration"]), bits: 5, length: 3 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "declaration", "statement"]), bits: 5, length: 4 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "declaration", "statement", "expression_statement"]), bits: 5, length: 5 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "declaration", "statement", "expression_statement", "expression"]), bits: 5, length: 6 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "declaration", "statement", "expression_statement", "expression", "statement"]), bits: 5, length: 7 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "declaration", "statement", "expression_statement", "expression", "statement", "jump_statement"]), bits: 5, length: 8 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "declaration", "statement", "expression_statement", "expression", "statement", "jump_statement", "expression"]), bits: 5, length: 9 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement"]), bits: 5, length: 2 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement"]), bits: 5, length: 3 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement"]), bits: 5, length: 4 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement", "expression"]), bits: 5, length: 5 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement", "expression", "statement"]), bits: 5, length: 6 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement", "expression", "statement", "expression_statement"]), bits: 5, length: 7 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement", "expression", "statement", "expression_statement", "expression"]), bits: 5, length: 8 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement", "expression", "statement", "expression_statement", "expression", "statement"]), bits: 5, length: 9 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement", "expression", "statement", "expression_statement", "expression", "statement", "jump_statement"]), bits: 5, length: 10 } }
ASTWalker { test: Vector { root: Leaf(["function_definition", "compound_statement", "statement", "expression_statement", "expression", "statement", "expression_statement", "expression", "statement", "jump_statement", "expression"]), bits: 5, length: 11 } }