kaby76 / Antlr4BuildTasks

Third-party build tool for 'Official' Antlr4 tool and runtime parsers using .Net. Drop-in replacement for 'Antlr4cs' Antlr4 tool and build rules.
MIT License

Stack overflow in error listener #2

Open kaby76 opened 4 years ago

kaby76 commented 4 years ago

I've been putting this off, but there is a bug in the error reporter code I wrote for the Antlr program template. I'm checking in the code to reproduce it.

kaby76 commented 4 years ago

In order to compute the lookahead sets that the parser expects, the error listener calls back into the Antlr runtime to get the lookahead. If the next token is invalid, the lexer reports that error to the same listener, which asks for the lookahead again, and round and round she goes.

pl1.dll!pl1.ErrorListener<int>.SyntaxError(System.IO.TextWriter output, Antlr4.Runtime.IRecognizer recognizer, int offendingSymbol, int line, int col, string msg, Antlr4.Runtime.RecognitionException e) Line 37
    at C:\Users\kenne\Documents\AntlrExamples\pl1\ErrorListener.cs(37)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.ProxyErrorListener<int>.SyntaxError(System.IO.TextWriter output, Antlr4.Runtime.IRecognizer recognizer, int offendingSymbol, int line, int charPositionInLine, string msg, Antlr4.Runtime.RecognitionException e) Line 43
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\ProxyErrorListener.cs(43)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.Lexer.NotifyListeners(Antlr4.Runtime.LexerNoViableAltException e) Line 561
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\Lexer.cs(561)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.Lexer.NextToken() Line 178
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\Lexer.cs(178)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.BufferedTokenStream.Fetch(int n) Line 240
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\BufferedTokenStream.cs(240)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.BufferedTokenStream.Sync(int i) Line 220
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\BufferedTokenStream.cs(220)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.CommonTokenStream.LT(int k) Line 145
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\CommonTokenStream.cs(145)
pl1.dll!pl1.LASets.Compute(Antlr4.Runtime.Parser parser, Antlr4.Runtime.CommonTokenStream token_stream, int line, int col) Line 61
    at C:\Users\kenne\Documents\AntlrExamples\pl1\LASets.cs(61)
pl1.dll!pl1.ErrorListener<int>.SyntaxError(System.IO.TextWriter output, Antlr4.Runtime.IRecognizer recognizer, int offendingSymbol, int line, int col, string msg, Antlr4.Runtime.RecognitionException e) Line 38
    at C:\Users\kenne\Documents\AntlrExamples\pl1\ErrorListener.cs(38)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.ProxyErrorListener<int>.SyntaxError(System.IO.TextWriter output, Antlr4.Runtime.IRecognizer recognizer, int offendingSymbol, int line, int charPositionInLine, string msg, Antlr4.Runtime.RecognitionException e) Line 43
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\ProxyErrorListener.cs(43)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.Lexer.NotifyListeners(Antlr4.Runtime.LexerNoViableAltException e) Line 561
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\Lexer.cs(561)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.Lexer.NextToken() Line 178
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\Lexer.cs(178)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.BufferedTokenStream.Fetch(int n) Line 240
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\BufferedTokenStream.cs(240)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.BufferedTokenStream.Sync(int i) Line 220
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\BufferedTokenStream.cs(220)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.CommonTokenStream.LT(int k) Line 145
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\CommonTokenStream.cs(145)
pl1.dll!pl1.LASets.Compute(Antlr4.Runtime.Parser parser, Antlr4.Runtime.CommonTokenStream token_stream, int line, int col) Line 61
    at C:\Users\kenne\Documents\AntlrExamples\pl1\LASets.cs(61)
pl1.dll!pl1.ErrorListener<int>.SyntaxError(System.IO.TextWriter output, Antlr4.Runtime.IRecognizer recognizer, int offendingSymbol, int line, int col, string msg, Antlr4.Runtime.RecognitionException e) Line 38
    at C:\Users\kenne\Documents\AntlrExamples\pl1\ErrorListener.cs(38)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.ProxyErrorListener<int>.SyntaxError(System.IO.TextWriter output, Antlr4.Runtime.IRecognizer recognizer, int offendingSymbol, int line, int charPositionInLine, string msg, Antlr4.Runtime.RecognitionException e) Line 43
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\ProxyErrorListener.cs(43)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.Lexer.NotifyListeners(Antlr4.Runtime.LexerNoViableAltException e) Line 561
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\Lexer.cs(561)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.Lexer.NextToken() Line 178
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\Lexer.cs(178)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.BufferedTokenStream.Fetch(int n) Line 240
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\BufferedTokenStream.cs(240)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.BufferedTokenStream.Sync(int i) Line 220
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\BufferedTokenStream.cs(220)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.CommonTokenStream.LT(int k) Line 145
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\CommonTokenStream.cs(145)
pl1.dll!pl1.LASets.Compute(Antlr4.Runtime.Parser parser, Antlr4.Runtime.CommonTokenStream token_stream, int line, int col) Line 61
    at C:\Users\kenne\Documents\AntlrExamples\pl1\LASets.cs(61)
pl1.dll!pl1.ErrorListener<int>.SyntaxError(System.IO.TextWriter output, Antlr4.Runtime.IRecognizer recognizer, int offendingSymbol, int line, int col, string msg, Antlr4.Runtime.RecognitionException e) Line 38
    at C:\Users\kenne\Documents\AntlrExamples\pl1\ErrorListener.cs(38)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.ProxyErrorListener<int>.SyntaxError(System.IO.TextWriter output, Antlr4.Runtime.IRecognizer recognizer, int offendingSymbol, int line, int charPositionInLine, string msg, Antlr4.Runtime.RecognitionException e) Line 43
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\ProxyErrorListener.cs(43)

...
pl1.dll!pl1.LASets.Compute(Antlr4.Runtime.Parser parser, Antlr4.Runtime.CommonTokenStream token_stream, int line, int col) Line 61
    at C:\Users\kenne\Documents\AntlrExamples\pl1\LASets.cs(61)
pl1.dll!pl1.ErrorListener<int>.SyntaxError(System.IO.TextWriter output, Antlr4.Runtime.IRecognizer recognizer, int offendingSymbol, int line, int col, string msg, Antlr4.Runtime.RecognitionException e) Line 38
    at C:\Users\kenne\Documents\AntlrExamples\pl1\ErrorListener.cs(38)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.ProxyErrorListener<int>.SyntaxError(System.IO.TextWriter output, Antlr4.Runtime.IRecognizer recognizer, int offendingSymbol, int line, int charPositionInLine, string msg, Antlr4.Runtime.RecognitionException e) Line 43
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\ProxyErrorListener.cs(43)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.Lexer.NotifyListeners(Antlr4.Runtime.LexerNoViableAltException e) Line 561
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\Lexer.cs(561)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.Lexer.NextToken() Line 178
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\Lexer.cs(178)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.BufferedTokenStream.Fetch(int n) Line 240
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\BufferedTokenStream.cs(240)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.BufferedTokenStream.Sync(int i) Line 220
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\BufferedTokenStream.cs(220)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.BufferedTokenStream.Consume() Line 191
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\BufferedTokenStream.cs(191)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.Parser.Consume() Line 734
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\Parser.cs(734)
Antlr4.Runtime.Standard.dll!Antlr4.Runtime.Parser.Match(int ttype) Line 245
    at C:\Users\kenne\Documents\antlr4\runtime\CSharp\runtime\CSharp\Antlr4.Runtime\Parser.cs(245)
pl1.dll!pl1_parserParser.pl1stmtlist(int _p) Line 749
    at C:\Users\kenne\Documents\AntlrExamples\pl1\obj\Debug\netcoreapp3.1\pl1_parserParser.cs(749)
pl1.dll!pl1_parserParser.pl1pgm() Line 661
    at C:\Users\kenne\Documents\AntlrExamples\pl1\obj\Debug\netcoreapp3.1\pl1_parserParser.cs(661)
pl1.dll!pl1.Program.Main(string[] args) Line 48
    at C:\Users\kenne\Documents\AntlrExamples\pl1\Program.cs(48)

kaby76 commented 4 years ago

Correcting this problem is easy. But there is another problem: it looks like I'm not getting any information about possible parses for an input with an error (e.g., add "asdf" after "PRTHDG1: PROCEDURE;", giving "PRTHDG1: PROCEDURE; asdf PUT FILE ...").
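One way the recursion in the stack trace might be broken is a re-entrancy guard in the listener. The sketch below is not the actual fix and does not use the Antlr runtime: `FakeTokenSource` and `GuardedListener` are hypothetical stand-ins, with `Lookahead()` playing the role of the `LT()` call that ends up reporting a lexer error back to the same listener.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a token stream whose lookahead can itself
// report an error, the way the lexer does in the trace above.
class FakeTokenSource
{
    public Action<string> OnError = _ => { };

    public string Lookahead()
    {
        // An invalid character is reported to the listener before
        // any token is handed back to the caller.
        OnError("token recognition error");
        return "<invalid>";
    }
}

// Hypothetical listener with a re-entrancy flag.
class GuardedListener
{
    private readonly FakeTokenSource _tokens;
    private bool _inSyntaxError;
    public readonly List<string> Messages = new List<string>();

    public GuardedListener(FakeTokenSource tokens)
    {
        _tokens = tokens;
        _tokens.OnError = SyntaxError;
    }

    public void SyntaxError(string msg)
    {
        if (_inSyntaxError)
        {
            // Already computing lookahead for an earlier error:
            // record the message, but do not ask for lookahead again.
            Messages.Add(msg + " (nested, lookahead skipped)");
            return;
        }
        _inSyntaxError = true;
        try
        {
            string la = _tokens.Lookahead(); // may call back into SyntaxError
            Messages.Add(msg + ", lookahead: " + la);
        }
        finally
        {
            _inSyntaxError = false;
        }
    }
}

class GuardDemo
{
    static void Main()
    {
        var listener = new GuardedListener(new FakeTokenSource());
        listener.SyntaxError("mismatched input");
        foreach (var m in listener.Messages) Console.WriteLine(m);
    }
}
```

With the guard in place the nested report is recorded once and the outer call completes, instead of each error triggering another lookahead computation.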

kaby76 commented 4 years ago

For sure, something is wrong with the interpreter, because it faults very early on with no viable paths through the ATN. Attachments: xx.txt, x.txt

kaby76 commented 4 years ago

The problem is clearly the code in EnterState() to test whether this state needs to be revisited.

        if (_visited.ContainsKey(new Pair<ATNState, int>(state, token_index)))
        {
            if (_log_parse)
            {
                System.Console.Error.WriteLine(
                    new String(' ', indent * 2)
                    + "already visited.");
            }
            return null;
        }

The code notices it's visiting state 284 (varnamequal) with input PSAM1. The first time it visited this state, it was in the context of a varnameref. The second time, it is in the context of prestmtlist. Since it entered this state with the same input index, it assumes that this is just a loop within the ATN. But it's not.

The problem with this code is that the context differs between the first visit to the state and the second. State and token_index alone are insufficient as the key; it should be the entire path to this point.

I'll hold off on a solution until I've thought about this a little more, after a good hike tomorrow (Welch-Dickey).
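For illustration only, the "entire path" idea could be sketched as a visited set whose key also carries the chain of rule invocations. This is an assumption about one possible shape, not the eventual solution: the `Key` helper, the token index 5, and the rule stacks shown are hypothetical, and only the state number and rule names come from the description above.

```csharp
using System;
using System.Collections.Generic;

static class VisitedSketch
{
    // Hypothetical key: ATN state number, token index, and the rule
    // invocation chain (outermost first) that led to the state.
    public static (int State, int TokenIndex, string Context) Key(
        int state, int tokenIndex, params string[] ruleStack)
        => (state, tokenIndex, string.Join("/", ruleStack));

    static void Main()
    {
        var visited = new HashSet<(int State, int TokenIndex, string Context)>();

        // Same state and token index, reached from two different rules:
        // both are new entries, so neither is pruned.
        bool first = visited.Add(Key(284, 5, "pl1pgm", "varnameref"));
        bool second = visited.Add(Key(284, 5, "pl1pgm", "prestmtlist"));

        // A genuine repeat (same state, index, and context) is rejected.
        bool repeat = visited.Add(Key(284, 5, "pl1pgm", "varnameref"));

        Console.WriteLine($"{first} {second} {repeat}"); // True True False
    }
}
```

The trade-off is a larger key and more distinct entries, in exchange for not mistaking a second context for a loop within the ATN.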

kaby76 commented 4 years ago

I'm still getting this problem, e.g., asn.g in an Antlr3 parse, line 960.