[CPP] Cache keeps growing and is never reset while parsing multiple files; it behaves like a memory leak

antlr / antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

http://antlr.org

BSD 3-Clause "New" or "Revised" License


[CPP] Cache keeps growing and is never reset while parsing multiple files; it behaves like a memory leak #2182

Open garzon opened 6 years ago

garzon commented 6 years ago

Reference: https://github.com/antlr/antlr4/issues/499 At first I thought it was a memory leak, but I eventually found that the "cache" is never released and soon grows to hundreds of megabytes (it keeps references to a lot of ATNConfig and DFAState objects) after parsing some files.

I came up with a way to work around that. I put the following in the grammar files for both the lexer and the parser:

@lexer::members {
void cleanCache() {
    _interpreter->clearDFA();
    _sharedContextCache.clear();
}
}

and call this method every time before I parse a file.
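
To illustrate the effect in isolation, here is a minimal, self-contained sketch. It is a toy model and not ANTLR code: `ToyLexer`, `tokenize`, and `cacheSize` are hypothetical names, and the "cache" is just a static set of strings standing in for the shared static state that the generated recognizers keep. The point is only that a static cache outlives every instance, so it grows across files until something clears it explicitly.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Toy stand-in for a generated recognizer. NOT the ANTLR runtime:
// all names here are hypothetical and chosen for illustration only.
class ToyLexer {
public:
    // Pretend to tokenize: every distinct space-separated word is
    // remembered in the cache shared by all ToyLexer instances.
    void tokenize(const std::string& text) {
        std::string word;
        for (char c : text) {
            if (c == ' ') {
                remember(word);
                word.clear();
            } else {
                word += c;
            }
        }
        remember(word);
    }

    // Counterpart of the cleanCache() workaround: drop the shared state.
    void cleanCache() { sharedCache().clear(); }

    // Number of entries currently held by the shared cache.
    static std::size_t cacheSize() { return sharedCache().size(); }

private:
    // Function-local static: one cache for the whole process, which
    // survives the destruction of individual ToyLexer objects.
    static std::unordered_set<std::string>& sharedCache() {
        static std::unordered_set<std::string> cache;
        return cache;
    }

    static void remember(const std::string& word) {
        if (!word.empty()) {
            sharedCache().insert(word);
        }
    }
};
```

Creating a fresh `ToyLexer` per file does not shrink the cache, because the cache belongs to the class and not to the instance; only an explicit `cleanCache()` call does. The sketch ignores thread safety, which a real shared cache would have to handle.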

Maybe a method like this should be provided by the runtime library, since the cache members are both static and private, and I cannot find any documentation about this problem.

mike-lischke commented 6 years ago

This is not related to the C++ target alone; it is a general problem. A cache is usually used to avoid doing things multiple times by reusing results from a previous run, which makes parsing fast. Clearing the cache counteracts that and will therefore lower parse performance.

What needs to be done IMO is to review the approach in general and find a balance between speed and memory consumption.
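
One very crude way to picture such a balance is a cache with a size cap. The sketch below is purely illustrative and does not reflect how the ANTLR runtime works or any planned change to it: `BoundedCache`, `lookup`, and `expensiveCompute` are hypothetical names, and the "drop everything when full" eviction policy is an assumption chosen for brevity.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical memoizing cache with a size cap. Not part of ANTLR.
class BoundedCache {
public:
    explicit BoundedCache(std::size_t maxEntries) : maxEntries_(maxEntries) {}

    // Returns the value for key, computing and storing it on a miss.
    std::size_t lookup(const std::string& key) {
        auto it = entries_.find(key);
        if (it != entries_.end()) {
            ++hits_;  // fast path: result reused from an earlier call
            return it->second;
        }
        ++misses_;
        if (entries_.size() >= maxEntries_) {
            // Crude eviction: once the cap is reached, drop everything.
            // This bounds memory but forces later recomputation.
            entries_.clear();
        }
        std::size_t value = expensiveCompute(key);
        entries_.emplace(key, value);
        return value;
    }

    std::size_t size() const { return entries_.size(); }
    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

private:
    // Placeholder for the costly work the cache is meant to avoid.
    static std::size_t expensiveCompute(const std::string& key) {
        return key.size();
    }

    std::size_t maxEntries_;
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
    std::unordered_map<std::string, std::size_t> entries_;
};
```

A larger cap means more hits and more memory; a smaller cap means the opposite. The hit and miss counters make that trade-off measurable for a given workload.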

chund commented 6 years ago

Anyway, it would be great to have such functions, at least in debug builds, to check for memory leaks in one's own code (the ANTLR caches produce a lot of noise). Is it sufficient to clear the caches mentioned above, or are there more?

chund commented 6 years ago

I have added the following to my .g4 file, and called it after parsing. But it does not yet seem to clear all cached data:

// custom extension of lexer object
@lexer::members {
void cleanCache() {
    std::lock_guard<std::mutex> lck(_mutex);
    _interpreter->clearDFA();
    _sharedContextCache.clear();
}
}

// custom extension of parser object
@parser::members {
void cleanCache() {
    std::lock_guard<std::mutex> lck(_mutex);
    _interpreter->clearDFA();
    _sharedContextCache.clear();
}
}