antlr / antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
http://antlr.org
BSD 3-Clause "New" or "Revised" License

Reference to Parser from ParseTree #1838

Open RedTailedHawk opened 7 years ago

RedTailedHawk commented 7 years ago

Would it make sense to add a reference to the Parser object from the ParseTree interface? I find myself having to pass the parser object into some of my visitors because it's not immediately available when visiting a parse tree.

I seem to recall that it was available in the Python version of ANTLR; i.e. I was able to do ctx.parser.

Thanks.

RedTailedHawk commented 7 years ago

Looks like I can get the full text (with whitespace) without having to use the parser. I was previously doing this:

String expression = parser.getTokenStream().getText(ctx.getSourceInterval());

But this requires that I have access to the parser, which means I have to pass it around.

Instead, I wrote a function that does something similar without having to use the parser:

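// Returns the original source text covered by a rule context, whitespace included.
public static String getFullText(ParseTree parseTree) {
    ParserRuleContext ctx = (ParserRuleContext)parseTree;

    // Guard against contexts with no children or without a start/stop token.
    if (ctx.children == null || ctx.start == null || ctx.stop == null)
        return "";

    Token startToken = ctx.start;
    Token stopToken = ctx.stop;

    // These are character offsets into the input, not token indexes.
    Interval interval = new Interval(startToken.getStartIndex(), stopToken.getStopIndex());
    return startToken.getInputStream().getText(interval);
}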

Seems to be doing what I need.

Could a similar method be added to ParseTree? Then I could just do this:

String expression = ctx.getFullText();

Thanks! :-)
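
For illustration, here is a minimal, self-contained sketch of why slicing the character stream (as getFullText does above) keeps whitespace, while concatenating token text does not. It uses plain Java with no ANTLR dependency. `Tok` and `WhitespaceSketch` are hypothetical stand-ins, not ANTLR types; the `start` and `stop` fields are assumed to behave like the inclusive character offsets reported by `Token.getStartIndex()` and `Token.getStopIndex()`.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a lexer token: text plus inclusive character offsets.
final class Tok {
    final String text;
    final int start; // offset of the first character in the source
    final int stop;  // offset of the last character in the source (inclusive)

    Tok(String text, int start, int stop) {
        this.text = text;
        this.start = start;
        this.stop = stop;
    }
}

final class WhitespaceSketch {
    // Concatenates token text only, so whitespace skipped by the lexer is lost.
    static String joinedText(List<Tok> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Tok t : tokens) {
            sb.append(t.text);
        }
        return sb.toString();
    }

    // Slices the original source from the first token's start to the last token's stop.
    static String fullText(String source, List<Tok> tokens) {
        if (tokens.isEmpty()) {
            return "";
        }
        Tok first = tokens.get(0);
        Tok last = tokens.get(tokens.size() - 1);
        return source.substring(first.start, last.stop + 1);
    }

    public static void main(String[] args) {
        String source = "x  +  42";
        // The whitespace between tokens was skipped, so only three tokens remain.
        List<Tok> tokens = Arrays.asList(
                new Tok("x", 0, 0), new Tok("+", 3, 3), new Tok("42", 6, 7));
        System.out.println(joinedText(tokens));       // x+42
        System.out.println(fullText(source, tokens)); // x  +  42
    }
}
```

The sketch only needs the first and last token plus the original source, which is why the function above can work from ctx.start and ctx.stop without a parser reference.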

ericvergnaud commented 7 years ago

Have you tried $parser?

RedTailedHawk commented 7 years ago

What is $parser?

ericvergnaud commented 7 years ago

A pseudo-variable you can use in the grammar. Not sure it helps with listeners and visitors, though.

sharwell commented 7 years ago

@RedTailedHawk This reference would add to the already very large memory overhead of the parse tree. Requiring that the instance be tracked separately was a big win for many use cases.

RedTailedHawk commented 7 years ago

It's "just" a reference though, no? i.e. a few bytes in each node?

Would it be any better if I requested a reference to the token stream instead of the parser? The token stream is what I really need in most cases, and currently the only way to get that is to pass the parser (or the token stream itself) into my visitors. Would be much easier if I could get at the token stream from the parse tree node. But I'm guessing this would incur the same overhead as the parser reference.

Thanks.

P.S. I submitted Pull Request #1986 for the ParserRuleContext.getFullText() method, as well as ParserRuleContext.getTokenStream().

RedTailedHawk commented 7 years ago

I also added ParserRuleContext.getChildren() and two other method overloads for this.

But I'm getting an error on the AppVeyor build: lambda expressions are not supported in -source 1.7.

I thought ANTLR supported Java 8.

https://stackoverflow.com/questions/27765391/antlr4-maven-and-java-1-8

RedTailedHawk commented 7 years ago

@sharwell Pull request #1986 stores the token stream in the parse tree, but only in the root node. That should help prevent bloating the parse tree.

Please review when you get a chance, thanks.
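
A rough sketch of the root-only idea described above, in plain Java with no ANTLR dependency. This is not the code from the pull request: `SketchNode` and `RootStreamSketch` are hypothetical names, a `String` stands in for the token stream, and the lookup-by-walking-up-parents strategy is an assumption about how such a design could work.

```java
// Hypothetical stand-in for a parse-tree node; not ANTLR's ParserRuleContext.
final class SketchNode {
    final SketchNode parent;   // null for the root
    final String tokenStream;  // stand-in for a token stream; non-null only on the root

    SketchNode(SketchNode parent, String tokenStream) {
        this.parent = parent;
        this.tokenStream = tokenStream;
    }

    // Walks up the parent chain and returns the stream held by the root.
    String getTokenStream() {
        SketchNode node = this;
        while (node.parent != null) {
            node = node.parent;
        }
        return node.tokenStream;
    }
}

final class RootStreamSketch {
    public static void main(String[] args) {
        SketchNode root = new SketchNode(null, "tokens");
        SketchNode child = new SketchNode(root, null);
        SketchNode leaf = new SketchNode(child, null);
        System.out.println(leaf.getTokenStream()); // tokens
    }
}
```

In this sketch the lookup cost grows with the depth of the node, and the `tokenStream` field is still declared on every node even though only the root sets it.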

sharwell commented 7 years ago

> Pull request #1986 stores the token stream in the parse tree, but only in the root node.

The reference takes up the same amount of space whether it's null or set to a value.

@parrt was working on a branch previously which provides a more flexible long-term solution. Rather than store the token stream, it actually tracks tokens from other channels as part of the TerminalNode. See antlr/antlr4#1667.
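
To put the point about reference size in numbers, here is a back-of-the-envelope sketch. The figures are assumptions, not measurements of ANTLR: a reference field is taken to be 4 bytes (typical for a 64-bit HotSpot JVM with compressed pointers) or 8 bytes (without), and object alignment and padding are ignored, so real growth per node may differ. `ReferenceOverheadSketch` is a hypothetical name.

```java
final class ReferenceOverheadSketch {
    // Extra bytes across the whole tree from adding one reference field to every node.
    // The field occupies its slot whether it holds null or a real reference.
    static long extraBytes(long nodeCount, int bytesPerReference) {
        return nodeCount * bytesPerReference;
    }

    public static void main(String[] args) {
        long nodes = 1_000_000L; // hypothetical tree size
        System.out.println(extraBytes(nodes, 4)); // 4000000
        System.out.println(extraBytes(nodes, 8)); // 8000000
    }
}
```

Under these assumptions, a tree of a million nodes would carry roughly 4 to 8 MB for the extra field, regardless of how many nodes actually set it.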

RedTailedHawk commented 7 years ago

Oh I see. :-(