vsbenas / parser-gen

A parser generator in Lua using PEG syntax.
MIT License
49 stars 9 forks source link

The AST generated by lua-parser seems to have some issue with operators associativity and priority #2

Closed andremm closed 7 years ago

andremm commented 7 years ago

I was testing lua-parser.lua example and found the following issue with the generated AST: while evaluating the expression 2^3+1 I got 16 as the result when in fact it should be 9.

I used the following dummy code to reproduce this behavior:

local m = require 'lua-parser'

local t = m.parse("return 2^3+1")

local function eval (exp)
  if exp.rule == "exp" then
    if exp[1] then
      local exp1 = eval(exp[1])
      if exp[2] then
         local exp2 = eval(exp[2][2])
         if exp[2][1][1] == '+' then
           return exp1 + exp2
         elseif exp[2][1][1] == '*' then
           return exp1 * exp2
         elseif exp[2][1][1] == '^' then
           return exp1 ^ exp2
         end
      else
        return exp1
      end
    end
  elseif exp.rule == "expTokens" then
    return eval(exp[1])
  elseif exp.rule == "number" then
    return eval(exp[1])
  elseif exp.rule == "INT" then
    return tonumber(exp[1])
  end
end

-- throws away the return statement and only evaluates its expression
print(eval(t[1][1][2][1]))

Am I missing something in the AST or is it a bug?

vsbenas commented 7 years ago

You can print the generated AST with

local peg = require "peg-parser"
peg.print_t(t)
-- the AST for your example is:
rule='chunk',
{
         rule='block',
         {
                  rule='retstat',
                  'return',
                  {
                           rule='explist',
                           {
                                    rule='exp',
                                    {
                                             rule='expTokens',
                                             {
                                                      rule='number',
                                                      {
                                                               rule='INT',
                                                               '2',
                                                      },
                                             },
                                    },
                                    {
                                             rule='expOps',
                                             {
                                                      rule='operatorPower',
                                                      '^',
                                             },
                                             {
                                                      rule='exp',
                                                      {
                                                               rule='expTokens',
                                                               {
                                                                        rule='number',
                                                                        {
                                                                                 rule='INT',
                                                                                 '3',
                                                                        },
                                                               },
                                                      },
                                                      {
                                                               rule='expOps',
                                                               {
                                                                        rule='operatorAddSub',
                                                                        '+',
                                                               },
                                                               {
                                                                        rule='exp',
                                                                        {
                                                                                 rule='expTokens',
                                                                                 {
                                                                                          rule='number',
                                                                                          {
                                                                                                   rule='INT',
                                                                                                   '1',
                                                                                          },
                                                                                 },
                                                                        },
                                                               },
                                                      },
                                             },
                                    },
                           },
                  },
         },
},

Yes, you are right, the operator prioritization is not good. Also, I think some operators should associate to the right (like the power operator), but the lua-parser associates everything to the left. I will try to fix this when I have some time, if you manage to do it please make a pull request.

vsbenas commented 7 years ago

Okay, I have fixed the issue. I had to add new rules so you might need to tweak your eval code. Now the AST is more bulky, but the prioritization should be fixed, also associations can now be done by the user. The new AST for your code:

rule='chunk',
{
  rule='block',
  {
    rule='retstat',
    'return',
    {
      rule='explist',
      {
        rule='exp',
        {
          rule='expOR',
          {
            rule='expAND',
            {
              rule='expREL',
              {
                rule='expBIT',
                {
                  rule='expCAT',
                  {
                    rule='expADD',
                    {
                      rule='expMUL',
                      {
                        rule='expUNA',
                        {
                          rule='expPOW',
                          {
                            rule='expTokens',
                            {
                              rule='number',
                              {
                                rule='INT',
                                '2',
                              },
                            },
                          },
                          {
                            rule='operatorPower',
                            '^',
                          },
                          {
                            rule='expUNA',
                            {
                              rule='expPOW',
                              {
                                rule='expTokens',
                                {
                                  rule='number',
                                  {
                                    rule='INT',
                                    '3',
                                  },
                                },
                              },
                            },
                          },
                        },
                      },
                    },
                    {
                      rule='operatorAddSub',
                      '+',
                    },
                    {
                      rule='expMUL',
                      {
                        rule='expUNA',
                        {
                          rule='expPOW',
                          {
                            rule='expTokens',
                            {
                              rule='number',
                              {
                                rule='INT',
                                '1',
                              },
                            },
                          },
                        },
                      },
                    },
                  },
                },
              },
            },
          },
        },
      },
    },
  },
},

Actually I'm working on reducing the AST size, should be finished today.

andremm commented 7 years ago

Okay, I have fixed the issue. I had to add new rules so you might need to tweak your eval code.

Cool! It looks like it is working. Here is my dummy eval after your fix:

local m = require 'lua-parser'

local t = m.parse("return 2^3+1")

local function eval (exp)
  if string.find(exp.rule, "^exp") then
    local r = eval(exp[1])
    for i=2,#exp,2 do
      if exp[i][1] == '+' then
        r = r + eval(exp[i+1])
      elseif exp[i][1] == '*' then
        r = r * eval(exp[i+1])
      elseif exp[i][1] == '^' then
        r = r ^ eval(exp[i+1])
      end
    end
    return r
  elseif exp.rule == "expTokens" then
    return eval(exp[1])
  elseif exp.rule == "number" then
    return eval(exp[1])
  elseif exp.rule == "INT" then
    return tonumber(exp[1])
  end
end

-- throws away the return statement and only evaluates its expression
print(eval(t[1][1][2][1]))

Actually I'm working on reducing the AST size, should be finished today.

I believe that you can fold the expression nodes while you generate the AST, and while folding maybe you can reduce the size of the expression nodes inside the generated AST.