antlr / grammars-v4

Grammars written for ANTLR v4; expectation that the grammars are free of actions.
MIT License

php grammar: Java target crashes on tests when printing trees #3017

Closed (kaby76 closed this issue 1 year ago)

kaby76 commented 1 year ago

The php grammar crashes on the Java target when trying to print out parse trees (trgen -t Java; cd Generated-Java; make; make test):

Exception in thread "main" java.lang.NullPointerException
        at org.antlr.v4.runtime.misc.Utils.escapeWhitespace(Utils.java:63)
        at org.antlr.v4.runtime.tree.Trees.toStringTree(Trees.java:49)
        at org.antlr.v4.runtime.tree.Trees.toStringTree(Trees.java:58)
        at org.antlr.v4.runtime.tree.Trees.toStringTree(Trees.java:58)
        at org.antlr.v4.runtime.tree.Trees.toStringTree(Trees.java:58)
        at org.antlr.v4.runtime.tree.Trees.toStringTree(Trees.java:58)
        at org.antlr.v4.runtime.tree.Trees.toStringTree(Trees.java:58)
        at org.antlr.v4.runtime.tree.Trees.toStringTree(Trees.java:42)
        at org.antlr.v4.runtime.RuleContext.toStringTree(RuleContext.java:189)
        at Test.DoParse(Test.java:206)
        at Test.ParseFilename(Test.java:135)
        at Test.main(Test.java:107)

The code crashes because a token is created with no text, whereas in other targets (Python3, JavaScript) text is assigned. It is dubious that the Antlr API allows one to create a CommonToken that is basically inconsistent (null text field, unknown values for start and stop, etc.). Either that should not be possible, or the toStringTree() API should NOT crash on null pointers!
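To make the failure mode concrete, here is a minimal stand-alone sketch of what the trace suggests is happening. It does not use the ANTLR runtime: NullTextSketch, FakeToken, and nodeText are hypothetical names invented for illustration, and escapeWhitespace here is only modeled on the runtime method named in the stack trace, assuming that method walks the characters of the text without a null check. The null-safe variant shows one way a tree printer could substitute a placeholder instead of crashing.

```java
// Stand-alone model of the crash; it does NOT depend on the ANTLR runtime.
// All names here are hypothetical stand-ins, not ANTLR classes.
final class NullTextSketch {

    // Stand-in for a token; text may be null, as when a token is
    // created with a type but its text is never set.
    static final class FakeToken {
        final int type;
        final String text;

        FakeToken(int type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Escapes tab, newline, and carriage return by walking the characters.
    // Assumption: like the method in the trace, it has no null check,
    // so a null argument throws NullPointerException.
    static String escapeWhitespace(String s) {
        StringBuilder buf = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '\t') buf.append("\\t");
            else if (c == '\n') buf.append("\\n");
            else if (c == '\r') buf.append("\\r");
            else buf.append(c);
        }
        return buf.toString();
    }

    // Null-safe variant: falls back to a placeholder built from the
    // token type when the token carries no text.
    static String nodeText(FakeToken t) {
        String text = (t.text != null) ? t.text : "<type " + t.type + ">";
        return escapeWhitespace(text);
    }

    public static void main(String[] args) {
        FakeToken ok = new FakeToken(1, "echo\n");
        FakeToken broken = new FakeToken(2, null);

        System.out.println(nodeText(ok));
        System.out.println(nodeText(broken));

        try {
            escapeWhitespace(broken.text); // the unguarded path
        } catch (NullPointerException e) {
            System.out.println("unguarded call threw NullPointerException");
        }
    }
}
```

On the grammar side, the equivalent fix would be for the Java support code to give the token its text at the point where it is created, as the Python3 and JavaScript targets are described as doing.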

Note, the CSharp target does not create a new token; it assigns the type on the existing token without setting text. There's also a Python/ directory, but there is no such thing as a "Python" target--only Python2 or Python3--and the code diverges ever so slightly between Python/ and Python3/, too.

It's a hot steamy mess everywhere you look.

RossPatterson commented 1 year ago

This is ANTLR bug #2677, originally opened on 2019-10-26. I ran into it during my work to improve parse-tree- and errors-file testing, and verified that it was still a bug as of ANTLR4 4.11.1.