MagLev / maglev

GemStone Maglev Ruby Repository
http://maglev.github.io
517 stars 41 forks source link

Parser doesn't pop cmdarg_stack when () on method call are implicit #133

Open MarkusQ opened 12 years ago

MarkusQ commented 12 years ago

In gammar.y, rParenLexPop is used to pop level off ps->cmdarg_stack, but this does not get called in all the cases it should, allowing levels to accumulate on the stack.

This prevents several gems from compiling, and may be silently causing other issue due to improper parsing.

The simplest example I've come up with is:

p [0]
p [1]
p [2]
p [3]
p [4]
p [5]
p [6]
p [7]
p [8]
p [9]
p [10]
p [11]
p [12]
p [13]
p [14]
p [15]
p [16]
p [17]
p [18]
p [19]
p [20]
p [21]
p [22]
p [23]
p [24]
p [25]
p [26]
p [27]
p [28]
p [29]
p [30]
p [31]
p [32]
p [33]
p [34]
p [35]
p [36]
p [37]
p [38]
p [39]
p [40]
p [41]
p [42]
p [43]
p [44]
p [45]
p [46]
p [47]
p [48]
p [49]
p [50]

which produces the error:

> ruby array_args.rb
array_args.rb:48, cmdarg_stack_overflow
array_args.rb:49, cmdarg_stack_overflow
array_args.rb:49, cmdarg_stack_overflow
array_args.rb:50, cmdarg_stack_overflow
array_args.rb:50, cmdarg_stack_overflow
array_args.rb:51, cmdarg_stack_overflow
array_args.rb:51, cmdarg_stack_overflow
ERROR 2702 , a RubyParseError occurred (error 2702), array_args.rb:48: cmdarg_stack_overflow
unexpected EOF at line 51 (SyntaxError)
topaz 1> exit

I suspect (but this is only a guess) that the correct fix would be to

  open_args       : call_args
                | tLPAREN_ARG  {vps->lex_state = EXPR_ENDARG;} ')'
                    {
                      yTrace(vps, "open_args: tLPAREN_ARG");
                      rParenLexPop(vps);

with something like:

  open_args       : call_args
                    {
                      rParenLexPop(vps);
                    }
                | tLPAREN_ARG  {vps->lex_state = EXPR_ENDARG;} ')'
                    {
                      yTrace(vps, "open_args: tLPAREN_ARG");
                      rParenLexPop(vps);

around line 1850 of grammar.y

MarkusQ commented 12 years ago

Forgot to state that this was tests under maglev 1.0.0 (ruby 1.8.7) (2011-10-31 rev 1.0.0-27184)[Linux x86_64] installed with rvm on Linux 3.2.0-24-generic #37-Ubuntu SMP Wed Apr 25 08:43:22 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux and my spelunking in search of the cause was done on master at fd5a1532db2dcb25694962a079d317e7e5607bc5.

zenspider commented 9 years ago

Unfortunately, that fix doesn't work. I think problem is rParenLexPop being called at the parser level vs the pushes being done at the lexer level. In this case, we've got p [].

The sequence SHOULD be something like:

but it is interleaved:

This is because of the way that lalr grammars pull from the lexical stream and then choose the correct grammatical production to match it to. Turning on ytrace + debugCmdArg shows the interleave happening.

I'm going to double check to make sure that this doesn't happen in MRI. It looks like it might, but their code isn't checking for overflow.

zenspider commented 9 years ago

I think my analysis is wrong because of the overly clever code in BitStack using the same word to record both the stack and the depth.

zenspider commented 9 years ago

I still think this is the problem. The stack value is correct, but the depth is not. It needs to be popped AFTER the restore in order to have the correct value. Interleaving is the problem. I don't know how to fix this with the current architecture, other than, possibly, to remove the depth bitfield entirely.