haoxiang47 / ply

Automatically exported from code.google.com/p/ply

PLY 2.5 fails to shift a token #12

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. python cxx.py bug.cpp
2. python cxx.py -v bug.cpp

What is the expected output? What do you see instead?
1. No output
2. Show the PLY trace, and terminate normally.

What version of the product are you using? On what operating system?

PLY 2.5, under Cygwin.

Please provide any additional information below.

I submitted a note about this to the ply-hack discussion, and I have
subsequently refined the C++ parser here.  I've tested the parser on a
variety of C++ test examples.  At this point, the remaining parsing issue
appears to lie with the table that PLY generates to parse nested ids, like
A::B::C.

In the bug.cpp example, the parser generates an error in state 124, when
processing a SCOPE token (which represents '::').  But state 124 is
summarized as:

state 124

    (5) id_scope -> id . SCOPE
    (8) nested_id -> id .

    SCOPE           shift and go to state 401

So, this state should shift the SCOPE token, but it doesn't.

Perhaps a clue to the issue with PLY is that there are two other
'equivalent' states in the table:

state 280

    (8) nested_id -> id .
    (5) id_scope -> id . SCOPE

state 641

    (5) id_scope -> id . SCOPE

Perhaps PLY is getting confused...???

Original issue reported on code.google.com by whart222 on 16 Sep 2008 at 7:00

GoogleCodeExporter commented 8 years ago
Given that this grammar has 15 shift/reduce conflicts, this is very likely
a problem with the grammar and not a problem with PLY.  It's not going to
be easy to track down, but your task at this point is to investigate those
shift/reduce conflicts and to rewrite parts of the grammar to try to
eliminate them.
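
As a starting point for that investigation, the conflict reports in
parser.out can be pulled out with a few lines of Python.  This is only a
minimal sketch: the sample text is made up for illustration, and the exact
wording of the conflict lines is an assumption that may differ between PLY
versions, so adjust the pattern to whatever your parser.out contains.

```python
import re

# Made-up sample text. The line format is an ASSUMPTION about parser.out
# and may differ between PLY versions; the states and tokens are invented.
SAMPLE = """\
WARNING: shift/reduce conflict for ELSE in state 10 resolved as shift
WARNING: shift/reduce conflict for LPAREN in state 87 resolved as shift
state 10
"""

# The "for TOKEN" part is optional so that a report naming only the
# state still matches.
CONFLICT_RE = re.compile(
    r"shift/reduce conflict (?:for (?P<token>\S+) )?"
    r"in state (?P<state>\d+) resolved as (?P<how>\w+)"
)


def find_conflicts(text):
    """Return a list of (state, token, resolution) for each conflict line."""
    found = []
    for m in CONFLICT_RE.finditer(text):
        found.append((int(m.group("state")), m.group("token"), m.group("how")))
    return found


conflicts = find_conflicts(SAMPLE)
for state, token, how in conflicts:
    print(f"state {state}: conflict on {token}, resolved as {how}")
```

Listing the affected states this way shows which grammar rules to look at
first; it does not say how to rewrite them.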

Note: I've written a C++ header parser for the SWIG project and happen to
know that it's notoriously difficult to do so.  That grammar still has 5
shift/reduce conflicts.

The only way I'd classify this as a PLY bug is if the *exact same* grammar
used here works with another LALR(1) parser generator (yacc, bison, etc.).

Original comment by dbeaz...@gmail.com on 16 Sep 2008 at 10:56

GoogleCodeExporter commented 8 years ago
I've finished fixing the bugs in my parser.  As Dave noted, the
shift-reduce issues were the source of the bug in my grammar.

However, the reason I submitted this issue was not because my grammar was
failing, but because I was having difficulty tracking down the issue.  My
issue with the state machine generated by Ply is that it can have states
that look 'good', but which generate parse errors.

Perhaps this is a well-known issue with LALR parsers, but I was
unsuccessful googling for discussions of this issue.  Thus, this may be a
'usability' issue for Ply???

Original comment by whart222 on 22 Sep 2008 at 5:44

GoogleCodeExporter commented 8 years ago
The difficulty of tracking down parsing problems, especially shift-reduce
conflicts, is a well-known problem with LALR(1) parsers and is a "feature"
PLY shares with yacc, bison, and other LALR parser generators.  PLY
provides the same diagnostic information as those tools, and parser.out
contains all of the information that went into constructing the state
tables.  So, I'm not exactly sure what I would add to that.
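
For anyone who wants to query that information rather than read it by eye,
here is a rough sketch of loading the state listings from parser.out into a
dictionary.  The sample is states 124 and 641 as quoted in the original
report; the `parse_states` helper is hypothetical (not part of PLY), and the
"reduce using rule" action wording is an assumption, since only a shift
action appears in the quoted excerpt.

```python
import re

# States 124 and 641 exactly as quoted in the original report.
SAMPLE = """\
state 124

    (5) id_scope -> id . SCOPE
    (8) nested_id -> id .

    SCOPE           shift and go to state 401

state 641

    (5) id_scope -> id . SCOPE
"""

STATE_RE = re.compile(r"^state (\d+)\s*$")
ITEM_RE = re.compile(r"^\s+\((\d+)\) (.+?)\s*$")
# The shift wording comes from the report; the reduce wording is an
# ASSUMPTION about what parser.out prints.
ACTION_RE = re.compile(
    r"^\s+(\S+)\s+(shift and go to state \d+|reduce using rule \d+.*?)\s*$"
)


def parse_states(text):
    """Hypothetical helper: map state number -> its items and actions."""
    states = {}
    current = None
    for line in text.splitlines():
        m = STATE_RE.match(line)
        if m:
            current = states.setdefault(
                int(m.group(1)), {"items": [], "actions": {}}
            )
            continue
        if current is None:
            continue
        m = ITEM_RE.match(line)
        if m:
            current["items"].append(m.group(2))
            continue
        m = ACTION_RE.match(line)
        if m:
            current["actions"].setdefault(m.group(1), []).append(m.group(2))
    return states


states = parse_states(SAMPLE)
for number in sorted(states):
    print(number, sorted(states[number]["actions"]))
```

With the tables in this form, a question like "which tokens have an action
in state 124?" becomes a dictionary lookup.  A token with no entry in a
state's actions is one the parser will report as a syntax error there.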

Of course, people are usually willing to put up with LALR magic because
the resulting parsers usually run really fast and have other nice
properties.

Original comment by dbeaz...@gmail.com on 22 Sep 2008 at 11:20