antlr / antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
http://antlr.org
BSD 3-Clause "New" or "Revised" License

[Go Target] Memory Leak in ParserATNSimulator #2088

Open ereyes01 opened 6 years ago

ereyes01 commented 6 years ago

I was in the process of porting the PHP grammar to Go when I came across a memory leak during parsing. The Go memory profiler showed me the following:

[image: heap profile graph from the Go memory profiler]

The memory profile seems to suggest some kind of infinite recursion between closureWork <--> closureCheckingStopState.

EDIT: updated image to use the svg from the repository linked to below that reproduces this issue.

ereyes01 commented 6 years ago

Hi folks, I keep running into this issue with ANTLR4 in my project (written in Go), which does some simple static analysis of several programming languages. It has been a significant obstacle to adopting ANTLR fully.

I now have a relatively simple case that reproduces this issue that I have shared here: https://github.com/ereyes01/antlr-php-example

^^ This repository contains the full code, instructions to reproduce, and a full heap profile SVG image you can look at if you do not wish to run your system out of memory.

It involves the PHP grammar, which I have ported to Go (mainly, I just translated the grammar actions to Go), and a reproducing test: a single line of PHP code that nests several parser rules.

I think the parser rule that is most nested in the test is this one (due to the long chain of . operators): https://github.com/ereyes01/antlr-php-example/blob/master/parser/PhpParser.g4#L498

Is there something in the grammar that is exacerbating this problem? Or is this a bug in the Go target code? I owe a beer or 3 to whoever helps me out... thanks!

KvanTTT commented 6 years ago

Hi!

I parsed your file deepConcatanation.php with the optimized C# runtime without any problem (less than 100 MB and less than 0.2 sec).

I think this issue is related to deep recursive rules. See discussion #1398 and the Java patch https://github.com/antlr/antlr4/pull/1404.

I'm not sure this optimization has been implemented in the Go runtime, so I think this is a bug in the runtime.

ereyes01 commented 6 years ago

Thanks @KvanTTT, that was an interesting read. I frequently encounter the deep recursive rules mentioned in the #1398 discussion in a variety of grammars from the grammars-v4 repository, so this sounds like a likely suspect in this case.

sunkin351 commented 5 years ago

I have some serious memory problems using the Java 8 grammar with the C# runtime... I'm using the latest NuGet package for .NET Core. The more Java source files I parse in one batch, the larger the memory leak becomes. It's come to over 8 GB, and I had to shut it down because it was nowhere near done.

sunkin351 commented 5 years ago

I'm using NuGet package version 4.7.2, if you need to know.

KvanTTT commented 5 years ago

Try using the java grammar instead. It's much faster and also supports Java 8. Also, don't forget to clear the cache.

siddhesh503 commented 4 years ago

@KvanTTT I am also hitting this issue with deep recursive rules. I tried porting the Java code from https://github.com/antlr/antlr4/pull/1404 to Go, but the problem remains.