Herb-AI / HerbGrammar.jl

Grammars for Herb.jl
https://herb-ai.github.io/
MIT License
0 stars 2 forks source link

`expr2rulenode` / `string2rulenode` #78

Open ReubenJ opened 4 months ago

ReubenJ commented 4 months ago

Given an Expr and an AbstractGrammar, I'd like to get the corresponding RuleNode.

For this grammar

grammar_robots = @csgrammar begin
           Start = Sequence                   #1

           Sequence = Operation                #2
           Sequence = (Operation; Sequence)    #3
           Operation = Transformation          #4
           Operation = ControlStatement        #5

           Transformation = moveRight() | moveDown() | moveLeft() | moveUp() | drop() | grab()     #6
           ControlStatement = IF(Condition, Sequence, Sequence)        #12
           ControlStatement = WHILE(Condition, Sequence)               #13

           Condition = atTop() | atBottom() | atLeft() | atRight() | notAtTop() | notAtBottom() | notAtLeft() | notAtRight()      #14
       end

A partial implementation might look like

function expr2rulenode(expr::Expr, grammar::AbstractGrammar)
    if expr.head == :call
        # get the rule index
        rule = findfirst(==(expr), grammar.rules)
        if isnothing(rule)
            error("Rule not found in the grammar")
        end
        # create a rulenode
        rulenode = RuleNode(rule, [])
        return rulenode
    elseif expr.head == :block
        rulenode = RuleNode(3, [expr2rulenode(e, grammar) for e in expr.args])
    else
        error("Only call and block expressions are supported")
    end
end

resulting in the following functionality:

julia> ex = Base.remove_linenums!(:(moveUp(); (moveRight(); grab())))
quote
    moveUp()
    begin
        moveRight()
        grab()
    end
end
julia> expr2rulenode(ex, grammar_robots)
3{9,3{6,11}}

This would be quite useful when testing, for example.

sebdumancic commented 4 months ago

yes, this would be very useful in general! For example, defining a sketch.

I really like that you focused on expression to RuleNode rather than text to RuleNode. We were postponing the addition of this functionality because starting from text is a really tough problem, though we might use some existing libraries for parsing. However, starting from Julia expressions is a much better choice.

ReubenJ commented 4 months ago

How about:

function string2rulenode(str::String, grammar::AbstractGrammar)
    expr = Base.remove_linenums!(Meta.parse(str))
    return expr2rulenode(expr, grammar)
end

works on this single test case at least:

julia> string2rulenode("(moveUp(); (moveRight(); grab()))", grammar_robots)
3{9,3{6,11}}
sebdumancic commented 4 months ago

ag, maybe it is that simple if we rely on Julia parser :)

ReubenJ commented 4 months ago

Which I think is the only parser that makes sense for now, given that we require the rule to be a valid Julia Expr.

janvandermeulen commented 4 months ago

This only seems to support the last child to have children. Something like this throws an error: 4{4{1,3}3}. image

ReubenJ commented 4 months ago

Can you post exactly what you're passing to the function? I think you might be missing a comma.

janvandermeulen commented 4 months ago

Into string2rulenode I am passing: string:

4{4{1,3}3}

and grammar:

1: Number = 1
2: Number = 2
3: Number = x
4: Number = Number + Number
5: Number = Number * Number

This is the copy pasted solution from the getting_started.jl file so it should be correct. image

janvandermeulen commented 4 months ago

This is the stacktrace of the string2rulenode function. It seems to be going wrong in the built-in julia functionality. Especially the Meta.parse(str) functionality.

# Error @ none:1:9
4{4{1,3}3}
#       ╙ ── Expected `}`
Stacktrace:
 [1] #parse#3
   @ ./meta.jl:244 [inlined]
 [2] parse
   @ ./meta.jl:236 [inlined]
 [3] parse(str::String31; filename::String, raise::Bool, depwarn::Bool)
   @ Base.Meta ./meta.jl:278
 [4] parse
   @ ./meta.jl:276 [inlined]
 [5] string2rulenode(str::String31, grammar::ContextSensitiveGrammar)
   @ Main ~/Study/IDMP/learning_clingo/benchmark/benchmark.jl:18
 [6] create_solutions(grammar::ContextSensitiveGrammar)
   @ Main ~/Study/IDMP/learning_clingo/benchmark/benchmark.jl:13
 [7] top-level scope
   @ ~/Study/IDMP/learning_clingo/benchmark/benchmark.jl:85

This @ Main ~/Study/IDMP/learning_clingo/benchmark/benchmark.jl:18 is the line:

       expr = Base.remove_linenums!(Meta.parse(str))
ReubenJ commented 4 months ago

Oh, then I misunderstood what you were asking for. This will take the string/expression representation and return the RuleNode, not the string representation of a RuleNode (like 4{4{1,3}3}).

janvandermeulen commented 4 months ago

Ah okay, I also can't seem to get the functionality to work with a function which would be very useful as well. E.g. calling the string2rulenode function with the following input throws an error.

g = @cfgrammar begin
    Number = |(1:2)
    Number = x
    Number = Number + Number 
    Number = Number * Number
end
string2rulenode("2x+1", g)

With the following stacktrace:

ERROR: LoadError: Rule: nothingnot found in the grammar
Stacktrace:
 [1] expr2rulenode(expr::Expr, grammar::ContextSensitiveGrammar)
   @ Main ~/Study/IDMP/learning_clingo/benchmark/benchmark.jl:28
 [2] string2rulenode(str::String15, grammar::ContextSensitiveGrammar)
   @ Main ~/Study/IDMP/learning_clingo/benchmark/benchmark.jl:20
 [3] create_solutions(grammar::ContextSensitiveGrammar)
   @ Main ~/Study/IDMP/learning_clingo/benchmark/benchmark.jl:13
 [4] top-level scope
   @ ~/Study/IDMP/learning_clingo/benchmark/benchmark.jl:86
in expression starting at /home/jan/Study/IDMP/learning_clingo/benchmark/benchmark.jl:86
ReubenJ commented 4 months ago

I think the correct input, according to your grammar, would be the string "2 * x + 1"

janvandermeulen commented 4 months ago

Ah, that's right. But, even after simplifying the input, it still doesn't seem to work. It still gives the same LoadError.


g = @cfgrammar begin
    Number = |(1:2)
    Number = x
    Number = Number + Number 
    Number = Number * Number
end
string2rulenode("1+1", g)```
ReubenJ commented 4 months ago

Yes, you're right. Like I said, I'd only tested it with some very specific examples. It looks like it only supports the example I tested so far. Feel free to suggest changes, but this isn't something we'll include right this moment, anyway.

janvandermeulen commented 4 months ago

Since we needed it to create a benchmark, I have created this monster method which does it. I can't say it is 100% bug-free, but I tested it on about 20 different instances and those all worked. Feel free to test it out, and let me know if you find a bug.

# Rulenode constructors
# 1: RuleNode(ind::Int, _val::Any) - _val is optional for cache optimisation and this is used for terminal nodes
# 2: RuleNode(ind::Int, children::Vector{AbstractRuleNode})
function string_2_rulenode(ast::String,debug::Bool=false) :: RuleNode
    """
    # Arguments
    - `ast::String`: AST to be converted to RuleNode.
    - `debug::Bool`: Whether to print debug statements.
    # Result 
    - `rulenode::RuleNode`: the AST converted to RuleNode such that it can be used as input to the grammar_optimiser. 
    """
    (num, i) = parse_integer(ast, 1, debug)
    if is_terminal(ast, i)
        return RuleNode(num)
    end
    # If the AST is not a single node, we need to parse the children 
    println("Calling recursive method for children: " * ast[i + 2:length(ast)-1])
    println("With rootnode: " * string(num))
    return RuleNode(num, parse_all_children(ast, 3, length(ast) - 1, debug))
end

function parse_all_children(ast::String, start_index::Int, end_index::Int, debug::Bool) :: Vector{RuleNode}
    """
    Given an index of a rule node, this function parses all it's children and returns them as a vector of RuleNodes.
    # Arguments
    - `ast::String`: AST to be converted to RuleNode.
    - `start_index::Int`: Index of the opening brace.
    - `end_index::Int`: Index of the closing brace.
    - `debug::Bool`: Optional, whether to print debug statements.
    # Result
    - `children::Vector{RuleNode}`: Vector of RuleNodes.
    """
    children::Vector{RuleNode} = []
    i = start_index
    while i <= end_index
        char = ast[i]
        if debug
            println("Debug for iteration i: " * string(i) * " char: " * string(char))
            println("char is digit: " * string(isdigit(char)))
            println("char is terminal: " * string(is_terminal(ast, i)))
        end
        if isdigit(char)
            (num, i) = parse_integer(ast, i, debug)
            if is_terminal(ast, i)
                if debug 
                    println("Adding terminal node: " * string(parse(Int, char)))
                end
                push!(children, RuleNode(num))
            else
                next_brace = find_closing_brace(ast, i + 2)
                if debug
                    println("Non-terminal node: " * string(char))
                    println("Next brace: " * string(next_brace))
                    println("Going into recursive method with ast: " * ast[i + 2:next_brace-1])
                    println("=====RECURSIVE METHOD=====")
                end
                rec_children = parse_all_children(ast, i + 2, next_brace - 1, debug)
                if debug
                    println("Adding children" * string(rec_children) * " To parent: " * string(char))
                    # println("Size of children: " * string(length(rec_children)))
                end
                push!(children, RuleNode(parse(Int, char), rec_children))
                i = next_brace
            end
        end
        i = i + 1
        if debug
            println("======NEXT ITERATION=====")
        end
    end
    if debug 
        println("Found the follwowing children: " * string(children))
        println("===== LEAVING PARSE_ALL_CHILDREN METHOD =====")
    end
    return children
end

function is_terminal(ast::String, index::Int) :: Bool
    """
    Given an AST and the index of the node, this function returns whether the node is a terminal node.
    # Arguments
    - `ast::String`: AST to be converted to RuleNode.
    - `index::Int`: Index of the node.
    # Result
    - `is_terminal::Bool`: Whether the node is a terminal node.
    """
    if index == length(ast)
        return true
    end
    return string(ast[index + 1]) != "{"
end

function find_closing_brace(ast::String, start_index::Int) :: Int
    """
    Given an AST and the index of the opening brace, this function returns the index of the closing brace.
    # Arguments
    - `ast::String`: AST to be converted to RuleNode.
    - `start_index::Int`: Index of the opening brace.
    # Result
    - `index::Int`: Index of the closing brace.
    """
    open_bracket_count::Int = 1
    index = start_index
    while open_bracket_count > 0
        index += 1
        if string(ast[index]) == "{"
            open_bracket_count += 1
        elseif string(ast[index]) == "}"
            open_bracket_count -= 1
        end
    end
    return index
end 

function parse_integer(ast::String, start_index::Int, debug::Bool):: Tuple{Int, Int}
    """
    Given an AST and the start_index of the integer, this function parses the integer and returns the index of the last digit.
    # Arguments
    - `ast::String`: AST to be converted to RuleNode.
    - `start_index::Int`: Index of the opening brace.
    - `debug::Bool`: whether to print debug statements.
    # Result
    - (Int, Int): Tuple of the parsed integer and the index of the last digit.
    """
    i = start_index
    number = ""
    while isdigit(ast[i])
        number = number * string(ast[i])
        i += 1
    end
    if debug
        println("Parsed number: " * number) 
    end
    return (parse(Int, number), i - 1)
end

Some test code:

g = @cfgrammar begin
    Number = |(1:2)
    Number = x
    Number = Number + Number 
    Number = Number * Number
end
tree = string_2_rulenode("5{5{5{2,2}4{1,4{2,2}}}3}", false)
println("Children of the tree are: "* string(tree.children))
println("Tree is: " * string(tree))
println(rulenode2expr(tree, g))
IssaHanou commented 2 weeks ago

Relates to the RuleNode programs in the old psb2 branch

sebdumancic commented 5 days ago

some options to build upon:

ReubenJ commented 4 days ago

+1 for ParserCombinator. The lack of updates on the project is a little worrisome, and documentation isn't the best, but I just used it to build a little parser for AEON files for the Biodivine boolean network benchmark.