rubberduck-vba / Rubberduck

Every programmer needs a rubberduck. COM add-in for the VBA & VB6 IDE (VBE).
https://rubberduckvba.com
GNU General Public License v3.0

Bang notation appears to be broken in the SLL parse #4405

Open comintern opened 6 years ago

comintern commented 6 years ago

I decided to track down some of the output spam from the parser:

[10/2/2018 8:56:39 PM Warning] line 4:9 extraneous input 'SomeIdentifier' expecting {'=', WS, LINE_CONTINUATION}
[10/2/2018 8:56:39 PM Warning] line 4:9 extraneous input 'SomeIdentifier' expecting {'=', WS, LINE_CONTINUATION}

It turns out to be this code in the MemberNotOnInterface_DoesNotReturnResult_BangNotation test:

Sub Foo()
    Dim dict As Dictionary
    Set dict = New Dictionary
    dict!SomeIdentifier = 42
End Sub

Parsing that code in the VBE with logging results in this:

2018-10-03 18:46:05.5888;WARN-2.2.6849.38216;Rubberduck.Parsing.VBA.Parsing.TokenStreamParserBase;SLL mode failed while parsing the CodePaneCode version of module Sheet1 at symbol SomeIdentifier at L4C10. Retrying using LL.;
2018-10-03 18:46:05.5888;DEBUG-2.2.6849.38216;Rubberduck.Parsing.VBA.Parsing.TokenStreamParserBase;Rubberduck.Parsing.Symbols.ParsingExceptions.MainParseSyntaxErrorException: extraneous input 'SomeIdentifier' expecting {'=', WS, LINE_CONTINUATION}
   at Rubberduck.Parsing.Symbols.ParsingExceptions.MainParseExceptionErrorListener.SyntaxError(IRecognizer recognizer, IToken offendingSymbol, Int32 line, Int32 charPositionInLine, String msg, RecognitionException e) in C:\Rubberduck\Rubberduck.Parsing\Symbols\ParsingExceptions\MainParseExceptionErrorListener.cs:line 17
   at Antlr4.Runtime.ProxyErrorListener`1.SyntaxError(IRecognizer recognizer, Symbol offendingSymbol, Int32 line, Int32 charPositionInLine, String msg, RecognitionException e)
   at Antlr4.Runtime.Parser.NotifyErrorListeners(IToken offendingToken, String msg, RecognitionException e)
   at Antlr4.Runtime.DefaultErrorStrategy.SingleTokenDeletion(Parser recognizer)
   at Antlr4.Runtime.DefaultErrorStrategy.Sync(Parser recognizer)
   at Rubberduck.Parsing.Grammar.VBAParser.letStmt() in C:\Rubberduck\Rubberduck.Parsing\obj\Debug\VBAParser.cs:line 10147
   at Rubberduck.Parsing.Grammar.VBAParser.mainBlockStmt() in C:\Rubberduck\Rubberduck.Parsing\obj\Debug\VBAParser.cs:line 3053
   at Rubberduck.Parsing.Grammar.VBAParser.blockStmt() in C:\Rubberduck\Rubberduck.Parsing\obj\Debug\VBAParser.cs:line 2779
   at Rubberduck.Parsing.Grammar.VBAParser.block() in C:\Rubberduck\Rubberduck.Parsing\obj\Debug\VBAParser.cs:line 2620
   at Rubberduck.Parsing.Grammar.VBAParser.subStmt() in C:\Rubberduck\Rubberduck.Parsing\obj\Debug\VBAParser.cs:line 13210
   at Rubberduck.Parsing.Grammar.VBAParser.moduleBodyElement() in C:\Rubberduck\Rubberduck.Parsing\obj\Debug\VBAParser.cs:line 2556
   at Rubberduck.Parsing.Grammar.VBAParser.moduleBody() in C:\Rubberduck\Rubberduck.Parsing\obj\Debug\VBAParser.cs:line 2449
   at Rubberduck.Parsing.Grammar.VBAParser.module() in C:\Rubberduck\Rubberduck.Parsing\obj\Debug\VBAParser.cs:line 446
   at Rubberduck.Parsing.Grammar.VBAParser.startRule() in C:\Rubberduck\Rubberduck.Parsing\obj\Debug\VBAParser.cs:line 334
   at Rubberduck.Parsing.VBA.Parsing.VBATokenStreamParser.Parse(ITokenStream tokenStream, PredictionMode predictionMode, IParserErrorListener errorListener) in C:\Rubberduck\Rubberduck.Parsing\VBA\Parsing\VBATokenStreamParser.cs:line 21
   at Rubberduck.Parsing.VBA.Parsing.TokenStreamParserBase.ParseSll(String moduleName, ITokenStream tokenStream, CodeKind codeKind) in C:\Rubberduck\Rubberduck.Parsing\VBA\Parsing\TokenStreamParserBase.cs:line 86
   at Rubberduck.Parsing.VBA.Parsing.TokenStreamParserBase.ParseWithFallBack(String moduleName, CommonTokenStream tokenStream, CodeKind codeKind) in C:\Rubberduck\Rubberduck.Parsing\VBA\Parsing\TokenStreamParserBase.cs:line 47
Token: SomeIdentifier at L4C10
Kind of parsed code: CodePaneCode
Component: Sheet1 (code pane version)

MDoerner commented 6 years ago

Thanks for actually opening an issue for this long-known problem. The SLL parser really does not like bang notation. (The LL parser seems to handle it just fine.)

Unfortunately, getting it to work in the SLL parser is rather complicated, particularly because bang notation is part of the left-recursive lExpression rule.

I think what happens is that the parser tries to parse the bang notation recursively, which means it has to parse the start as another lExpression. Unfortunately, the SLL parser is not context-aware, so it matches the start, including the exclamation mark, as an identifier, which leaves it no options one rule invocation up.
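
To make the suspected failure mode concrete, here is a toy sketch in Python. It is not ANTLR's SLL or LL algorithm and it does not use Rubberduck's grammar. It only assumes that the fast pass commits to reading `dict!` as an identifier carrying a `!` type hint, which is one possible reading of the explanation above. All function names (`tokenize`, `parse_greedy`, `parse_with_lookahead`, `parse_with_fallback`) are made up for the illustration.

```python
import re

# Names, integer literals, and the two punctuation marks the example needs.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[!=])")


def tokenize(line):
    """Split a statement such as 'dict!SomeIdentifier = 42' into tokens."""
    tokens, pos = [], 0
    line = line.rstrip()
    while pos < len(line):
        match = TOKEN.match(line, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at column {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def expect_assignment(tokens, pos, target):
    """Require '= value' at pos and build the assignment node."""
    if pos + 1 >= len(tokens) or tokens[pos] != "=":
        found = tokens[pos] if pos < len(tokens) else "<EOF>"
        raise SyntaxError(f"extraneous input '{found}' expecting '='")
    return ("let", target, tokens[pos + 1])


def parse_greedy(tokens):
    """Context-unaware pass: a '!' after a name is always a type hint."""
    target, pos = tokens[0], 1
    if pos < len(tokens) and tokens[pos] == "!":
        target += "!"  # commits to 'dict!' as a typed identifier
        pos += 1
    return expect_assignment(tokens, pos, target)


def parse_with_lookahead(tokens):
    """Context-aware pass: looks past the '!' before deciding what it is."""
    target, pos = tokens[0], 1
    if pos < len(tokens) and tokens[pos] == "!":
        follows = tokens[pos + 1] if pos + 1 < len(tokens) else None
        if follows is not None and follows != "=":
            # name ! member: bang access rather than a type hint
            target = ("bang", target, follows)
            pos += 2
        else:
            target += "!"
            pos += 1
    return expect_assignment(tokens, pos, target)


def parse_with_fallback(line, log):
    """Try the fast pass first; on a syntax error, log it and retry."""
    tokens = tokenize(line)
    try:
        return parse_greedy(tokens)
    except SyntaxError as error:
        log.append(f"fast pass failed: {error}. Retrying with lookahead.")
        return parse_with_lookahead(tokens)
```

In this toy, `parse_with_fallback("dict!SomeIdentifier = 42", log)` appends one warning to `log` (the fast pass stops at `SomeIdentifier` because it expected `=`) and then returns `("let", ("bang", "dict", "SomeIdentifier"), "42")` from the retry. A plain `total = 42` goes through the fast pass without logging anything.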

What concerns me here is why we get console output in the tests if the SLL parse fails.

MDoerner commented 4 years ago

Since the linked PR, this should only be an issue for bang notation on a foreign identifier.