yhirose / cpp-peglib

A single file C++ header-only PEG (Parsing Expression Grammars) library
MIT License
900 stars 112 forks source link

Memory leaks #60

Closed vehlwn closed 5 years ago

vehlwn commented 5 years ago

I was playing with this version of your parser https://github.com/yhirose/cpp-peglib/blob/57f866c6ca77f5a5afe37f72942d5526c45d7e87/peglib.h and accidentally found unlimited memory consumption. Consider this example:

#include <cpp-peglib/peglib.h>
#include <iostream>
#include <cstdlib>

using namespace peg;
using namespace std;

int main(int , char** ) try
{
    do  
    {
        function<long (const Ast&)> eval = [&](const Ast& ast) {
            if (ast.name == "NUMBER") {
                return stol(ast.token);
            } else {
                const auto& nodes = ast.nodes;
                auto result = eval(*nodes[0]);
                for (auto i = 1u; i < nodes.size(); i += 2) {
                    auto num = eval(*nodes[i + 1]);
                    auto ope = nodes[i]->token[0];
                    switch (ope) {
                        case '+': result += num; break;
                        case '-': result -= num; break;
                        case '*': result *= num; break;
                        case '/': result /= num; break;
                    }
                }
                return result;
            }
        };

        parser parser(R"(
            EXPRESSION       <-  TERM (TERM_OPERATOR TERM)*
            TERM             <-  FACTOR (FACTOR_OPERATOR FACTOR)*
            FACTOR           <-  NUMBER / '(' EXPRESSION ')'
            TERM_OPERATOR    <-  < [-+] >
            FACTOR_OPERATOR  <-  < [/*] >
            NUMBER           <-  < [0-9]+ >
            %whitespace      <-  [ \t\r\n]*
        )");

        parser.enable_ast();
        parser.enable_packrat_parsing();

        auto expr = " 2+2*2 ";
        shared_ptr<Ast> ast;
        if (parser.parse(expr, ast)) {
            ast = AstOptimizer(true).optimize(ast);
            //cout << ast_to_s(ast);
            //cout << expr << " = " << eval(*ast) << endl;
        }
    }
    while(0);

    return 0;
}
catch(const std::exception &ex)
{
    std::cerr << "Error: " << ex.what() << "\n";
    return 1;
}

(Built with g++ (i686-posix-dwarf-rev0, Built by MinGW-W64 project) 8.1.0, Win7, 'g++ -std=c++17 -Wall -Wextra -Wpedantic -g -O0 -fno-inline -fno-omit-frame-pointer -ggdb -isystemK:/1/0/source/cpp-peglib main.cpp -o a.exe'.) If I make infinite loop while(1), process memory in a few minutes grows up to 200 MB and even more. Program with while(0) requires less than 1 MB. There is full output of 'drmemory -- a.exe' https://pastebin.com/raw/5qkLW8Zj. Its important parts:

Error #1: LEAK 172 direct bytes 0x020fbd20-0x020fbdcc + 540 indirect bytes
peglib.h:3269 _ZZN3peg6parser10enable_astINS_7AstBaseINS_9EmptyTypeEEEEERS0_vENKUlRKNS_14SemanticValuesEE_clES8_ 

Error #3: LEAK 8 direct bytes 0x021051d0-0x021051d8 + 344 indirect bytes
peglib.h:2942 peg::AstBase<>::AstBase   

peglib.h:3269 in lambda: https://github.com/yhirose/cpp-peglib/blob/57f866c6ca77f5a5afe37f72942d5526c45d7e87/peglib.h#L3269

peglib.h:2942 in c-tor: https://github.com/yhirose/cpp-peglib/blob/57f866c6ca77f5a5afe37f72942d5526c45d7e87/peglib.h#L2942 Am I doing something wrong or is it error in your library? Could you fix it?

yhirose commented 5 years ago

@onodu, thanks for the bug report. I believe that the leak is caused by peglib.h, you are totally fine.

I doubt that the line ast = AstOptimizer(true).optimize(ast); creates the leak. Could you comment out the line and run the test program again to see if the leak still happens? Thanks for you help!

vehlwn commented 5 years ago

I removed line ast = AstOptimizer(true).optimize(ast); and it still leaks in peg::AstBase::AstBase at peglib.h:2942. There is drmemory output https://pastebin.com/raw/XWsD5ABy.

PS: drmemory also warns about possible leaks ("memory that is reachable only via pointers to the middle of the allocation, rather than the head") in peg::Action::operator() at peglib.h:617. Is it normal?

yhirose commented 5 years ago

@onodu, thanks for the additional test. I'll install the drmemory and try to debug.

hortegilles commented 5 years ago

i have also the same kind on memory leak , it seems relative to the rule on token

yhirose commented 5 years ago

Sorry that I couldn't have time to work on it, since I was so busy...

At least I found so far that the following lines are causing the memory leak.

peglib.h:3077 in AstOptimizer::optimize: https://github.com/yhirose/cpp-peglib/blob/57f866c6ca77f5a5afe37f72942d5526c45d7e87/peglib.h#L3077

peglib.h:3275 in parser::enable_ast: https://github.com/yhirose/cpp-peglib/blob/57f866c6ca77f5a5afe37f72942d5526c45d7e87/peglib.h#L3275

I'll let you know when I fix this problem. Thanks for your patience.

yhirose commented 5 years ago

@onodu, @hortegilles, I fixed it, and confirmed that the leak no longer happens on my machine.

It was due to the cyclic dependency in a AST tree. I changed to use std::weak_ptr for the parent member variable.

Could you try with the latest code on your machine when you have time? Thanks!

vehlwn commented 5 years ago

Yes. My example has stopped leaking with the new version. drmemory shows no LEAKs. That's great. Thanks.

yhirose commented 5 years ago

@onodu, glad to hear that it's now fixed. Thanks for the critical bug report!