portante / pycscope

Cscope database generator for Python source code
GNU General Public License v2.0
141 stars 29 forks source link

in_trailer and spb_lvl protocol broken by intermingling of brackets #8

Closed fspeech closed 12 years ago

fspeech commented 12 years ago

The new broken pattern is now this:

x.get([], "123", c=[1,2]).text = "ok"

However this issue could affect both LHS and RHS. It just does not cause assertion errors on the RHS.

It looks like now isTrailerFuncCall is correctly setting in_trailer. However it then got turned off prematurely by the RSQB terminal (of the first argument). Further comma in the argument list then caused the error. Since the argumentlist can be arbitrarily complex this protocol of turning on/off in_trailer won't work without more context? Basically in_trailer is set to the spb_lvl of the entering bracket and reset when a leaving bracket is at the same level. This can't handle the intermingling of brackets.

Using these printing codes inside walkCst should make the problem easy to see:

            print(ctx.spb_lvl)
            print(ctx.in_trailer)
            print("%5d%s%s" % (lineno, " " * indent, nodeNames[cst[0]]))

The effect on RHS is probably currently non-existent. However is it possible that once in_trailer can be correctly set some RHS symbols can be recognized better? I do notice that sometimes cscope doesn't recognize what appear to me valid symbols but I haven't really documented these irregular encounters.

portante commented 12 years ago

Yes, assignment processing has to be fixed in a different way, as this tracking mechanism is inept.

And if you do find a case where symbols are not recognized, please open another issue so that I can fix that.

fspeech commented 12 years ago

For now this can be fixed by making in_trailer a tuple of (matching_bracket, bracketing_level). When exiting trailer check both matching bracket and spb_lvl:

diff --git a/pycscope.py b/pycscope.py
index a3df94a..687aa8d 100755
--- a/pycscope.py
+++ b/pycscope.py
@@ -785,6 +785,6 @@ def processNonTerminal(ctx, cst):
             # Suspend COMMA processing in trailers
-            ctx.in_trailer = ctx.spb_lvl[token.LPAR]
+            ctx.in_trailer = (token.RPAR, ctx.spb_lvl[token.LPAR])
         elif isNamedTrailerArrayRef(cst, l):
             # Suspend COMMA processing in trailers
-            ctx.in_trailer = ctx.spb_lvl[token.LSQB]
+            ctx.in_trailer = (token.RSQB, ctx.spb_lvl[token.LSQB])
         for i in range(1, l - 1):
@@ -795,3 +795,3 @@ def processNonTerminal(ctx, cst):
                 # Suspend COMMA processing in trailers
-                ctx.in_trailer = ctx.spb_lvl[token.LPAR]
+                ctx.in_trailer = (token.RPAR, ctx.spb_lvl[token.LPAR])

@@ -928,3 +928,3 @@ def processTerminal(ctx, cst):
         ctx.spb_lvl[lmap[cst[0]]] -= 1
-        if ctx.spb_lvl[lmap[cst[0]]] == ctx.in_trailer:
+        if (cst[0], ctx.spb_lvl[lmap[cst[0]]]) == ctx.in_trailer:
             # We have left the trailer

This appears to have fixed this test case.

BTW, if spbl_lvl is indexed on the matching right bracket then lmap can go away and the code will be a bit simpler to read.

portante commented 12 years ago

It would seem this hides another problem. The above test case causes the problem because the first symbol encountered, x, is not marked as an assignment symbol before the in_trailer state is setup. Once in_trailer is setup, the assignment marking is suppressed, such that when we later encounter a set of parentheses, brackets or braces, we incorrectly reset the state before processing the comma.

Your patch handles this case, but the first symbol is not marked for assignment.

I am working on what I think is a better way to handle this without the need to keep track of so much state with lots of exceptions.

fspeech commented 12 years ago

Sounds good! What just occurred to me is that fixing the recognition of isTrailerFuncCall was a major step forward as that has a real effect on symbol recognition w.r.t. Mark.FUNC_CALL. That should solve some of my chance encounters with the inability to jump to symbols.

portante commented 12 years ago

Okay, as of commit 0c1f633 this should be working correctly. Please re-open if you find it does not work correctly.