lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License

2 different function definitions for a new language #1290

Closed: zdanl closed this issue 1 year ago

zdanl commented 1 year ago

Hello,

I consulted ChatGPT (GPT-4) before asking here, but it could not correct my Lark grammar accurately. Consider these rules, which are meant to support the following test of subroutines/functions in a new language:

_FUNC_DECL: "sub"
function_sig_1: _FUNC_DECL NAME "(" [parameters] ")" [":" type]
function_sig_2: type _FUNC_DECL NAME "(" [parameters] ")"
function_def: function_sig_1 body | function_sig_2 body

And this is the unit test:

sub main() : i64 {
  return 0;
}

i64 sub main() {
   return 0;
}

Could someone let me know what I am doing wrong? The Lark exception is:

lark.exceptions.VisitError: Error trying to process rule "function_def":

cannot unpack non-iterable Tree object

Thx.

MegaIng commented 1 year ago

This is an error in your Transformer/Visitor, which you haven't shown.

zdanl commented 1 year ago

> This is an error in your Transformer/Visitor, which you haven't shown.

Thanks. This should be the relevant part. Apologies, I didn't write the Transformer myself, and I assumed that type would be handled properly regardless of its position or of how many Lark rules use it.

    def function_sig(self, node):
        fname, fparams, fret = node
        return (fname, fparams, fret)

    def function_def(self, node):
        sig, body = node
        target, arglist, _type = sig

        return self.init_node(
            FunctionDef,
            target,
            type=_type,
            arglist=arglist,
            target=target,
            body=body
        )

    def parameters(self, nodes):
        return nodes

    def parameter(self, node):
        return node

    def type(self, node):
        return node[0]

erezsh commented 1 year ago

This is again not the whole transformer. But I'm guessing it is defined with @v_args(tree=True), in which case the nodes arrive as Tree instances. If you remove it, it should work (but who knows)

zdanl commented 1 year ago

> This is again not the whole transformer. But I'm guessing it is defined with @v_args(tree=True), in which case the nodes arrive as Tree instances. If you remove it, it should work (but who knows)

Sorry for wasting your time.

import sys
from lark import Transformer, Tree, Token
from lyra.ast.nodes import *

transformer_module = sys.modules[__name__]

class LyraTransformer(Transformer):

    def module(self, node):
        return node

    def struct_def(self, node):
        target, *members = node
        return self.init_node(
            StructDef,
            target,
            target=target,
            members=members
        )

    def struct_member(self, node):
        return node

    def function_sig(self, node):
        fname, fparams, fret = node
        return (fname, fparams, fret)

    def function_def(self, node):
        sig, body = node
        target, arglist, _type = sig

        return self.init_node(
            FunctionDef,
            target,
            type=_type,
            arglist=arglist,
            target=target,
            body=body
        )

    def parameters(self, nodes):
        return nodes

    def parameter(self, node):
        return node

    def type(self, node):
        return node[0]

    def if_stmt(self, node):
        else_block = None
        node.reverse()

        for block in node:
            if len(block if block else []) == 2:
                # means it has a condition i.e if or elif block
                cond, body = block
                else_block = self.init_node(
                    IfBlock, cond, expr=cond, body=body, orelse=else_block
                )
            elif block:
                else_block = block[0]

        return else_block

    def while_stmt(self, node):
        cond, body = node
        return self.init_node(WhileBlock, cond, expr=cond, body=body)

    def until_stmt(self, node):
        cond, body = node
        return self.init_node(UntilBlock, cond, expr=cond, body=body)

    def continue_stmt(self, node):
        return self.init_node(ContinueStatement, node[0])

    def break_stmt(self, node):
        return self.init_node(BreakStatement, node[0])

    def return_stmt(self, node):
        return self.init_node(ReturnStatement, node[0], expr=node[1])

    def pass_stmt(self, node):
        return self.init_node(PassStatement, node[0])

    def default_exec(self, node):
        return node

    def cond_exec(self, node):
        return node

    def variable(self, node):
        return node

    def declaration(self, node):
        (target, _type), value = node
        return self.init_node(
            Declaration,
            target,
            target=target,
            type=_type,
            value=value
        )

    def body(self, node):
        return list(filter(None, node))

    def stmt(self, node):
        return node[0]

    def call(self, node):
        target, arglist = node
        return self.init_node(
            Call,
            target,
            target=target,
            arglist=arglist
        )

    def arguments(self, node):
        return node

    def kwarg(self, node):
        return node

    def get_item(self, node):
        target, index = node
        return self.init_node(
            GetItem,
            node=target,
            target=target,
            index=index
        )

    def get_attr(self, node):
        target, attr = node
        return self.init_node(
            GetAttribute,
            target,
            target=target,
            attr=attr
        )

    def assign(self, node):
        target, val = node
        return self.init_node(
            Assign,
            target,
            target=target,
            value=val
        )

    def expr(self, node):
        return node[0]

    def get_var(self, node):
        return node[0]

    def setup_list(self, node):
        return node[0]

    def list(self, node):
        node = node or []

        if len(node) < 1:
            return

        return self.init_node(List, node[0], arglist=node)

    def number(self, node):
        n = node[0]
        return self.init_node(Number, n, value=n.value, type=n.type)

    def string(self, node):
        n = node[0]
        return self.init_node(String, n, value=n.value, type=n.type)

    def NAME(self, node):
        return self.init_node(Name, node, value=node.value)

    # ------------------------------------------
    #    Boolean operators
    # ------------------------------------------

    def _or(self, node):
        op = "||"
        lhs, rhs = node
        return self.init_node(BooleanOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def _and(self, node):
        op = "&&"
        lhs, rhs = node
        return self.init_node(BooleanOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def _not(self, node):
        op = "!"
        lhs = node[0]  # unary operator: use the single operand, not the child list
        return self.init_node(BooleanOp, lhs, op=op, lhs=lhs, rhs=None)

    def lt(self, node):
        op = "<"
        lhs, rhs = node
        return self.init_node(CompareOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def gt(self, node):
        op = ">"
        lhs, rhs = node
        return self.init_node(CompareOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def eq(self, node):
        op = "=="
        lhs, rhs = node
        return self.init_node(CompareOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def ne(self, node):
        op = "!="
        lhs, rhs = node
        return self.init_node(CompareOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def _in(self, node):
        op = "in"
        lhs, rhs = node
        return self.init_node(CompareOp, lhs, op=op, lhs=lhs, rhs=rhs)

    # ------------------------------------------
    #    Binary operators
    # ------------------------------------------

    def add(self, node):
        op = "+"
        if not node:
            return op
        lhs, rhs = node
        return self.init_node(BinaryOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def sub(self, node):
        op = "-"
        if not node:
            return op
        lhs, rhs = node
        return self.init_node(BinaryOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def mul(self, node):
        op = "*"
        if not node:
            return op
        lhs, rhs = node
        return self.init_node(BinaryOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def div(self, node):
        op = "/"
        if not node:
            return op
        lhs, rhs = node
        return self.init_node(BinaryOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def mod(self, node):
        op = "%"
        if not node:
            return op
        lhs, rhs = node
        return self.init_node(BinaryOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def pow(self, node):
        op = "**"
        if not node:
            return op
        lhs, rhs = node
        return self.init_node(BinaryOp, lhs, op=op, lhs=lhs, rhs=rhs)

    # ------------------------------------------
    #    Bitwise operators
    # ------------------------------------------

    def bitor(self, node):
        op = "|"
        if not node:
            return op
        lhs, rhs = node
        return self.init_node(BinaryOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def bitxor(self, node):
        op = "^"
        if not node:
            return op
        lhs, rhs = node
        return self.init_node(BinaryOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def bitand(self, node):
        op = "&"
        if not node:
            return op
        lhs, rhs = node
        return self.init_node(BinaryOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def shr(self, node):
        op = ">>"
        if not node:
            return op
        lhs, rhs = node
        return self.init_node(BinaryOp, lhs, op=op, lhs=lhs, rhs=rhs)

    def shl(self, node):
        op = "<<"
        if not node:
            return op
        lhs, rhs = node
        return self.init_node(BinaryOp, lhs, op=op, lhs=lhs, rhs=rhs)

    # ------------------------------------------
    #    Unary operators
    # ------------------------------------------

    def uadd(self, node):
        op = "+"
        lhs = node[0]
        return self.init_node(UnaryOp, lhs, op=op, lhs=lhs, rhs=None)

    def usub(self, node):
        op = "-"
        lhs = node[0]
        return self.init_node(UnaryOp, lhs, op=op, lhs=lhs, rhs=None)

    def invert(self, node):
        op = "~"
        lhs = node[0]
        return self.init_node(UnaryOp, lhs, op=op, lhs=lhs, rhs=None)

    def ptr(self, node):
        op = "*"
        lhs = node[0]
        return self.init_node(UnaryOp, lhs, op=op, lhs=lhs, rhs=None)

    def deref(self, node):
        op = "&"
        lhs = node[0]
        return self.init_node(UnaryOp, lhs, op=op, lhs=lhs, rhs=None)

    def init_node(self, node_type, node, **kwargs):
        kwargs.update(dict(
                line=node.line,
                column=node.column,
                end_column=node.end_column,
        ))
        return node_type(**kwargs)

erezsh commented 1 year ago

Okay, here's the problem. function_def assumes that function_sig has already run and transformed the trees. However, that doesn't happen, because the rule has been renamed to function_sig_1/2, so no callback matches it and the signature arrives as an untransformed Tree.

Either fix it in the transformer (function_sig_1 = function_sig etc.) or in the grammar, by doing something like function_sig: _function_sig_1 | _function_sig_2. (Note the _ at the beginning of the rule names, which makes them inline.)

zdanl commented 1 year ago

Thanks for pointing out the Lark-specific _ syntax.

I did

_FUNC_DECL: "sub"
function_sig_1: _FUNC_DECL NAME "(" [parameters] ")" ":" type
function_sig_2: type _FUNC_DECL NAME "(" [parameters] ")"
function_sig: _function_sig_1| _function_sig_2
function_def: function_sig body | _FUNC_DECL NAME "(" [parameters] ")" body

But it's telling me _function_sig_2 is not defined.

Any idea?

Happy to send you some Ethereum for your time.

erezsh commented 1 year ago

The rule definitions also need to start with _, it's part of the name.

erezsh commented 1 year ago

Hmm, actually, since the sig1 and sig2 rules have different signatures (the type comes last in one and first in the other), it looks like you'll have to fix it in the transformer. Sorry for misleading you! I don't think there's an easy way to avoid changing the transformer.