BNFC / bnfc

BNF Converter
http://bnfc.digitalgrammars.com/
586 stars 165 forks source link

Memory leak in C parser #347

Closed wangjia184 closed 3 years ago

wangjia184 commented 3 years ago

BNFC version 2.9.1

Here is the code generated by BNFC

/* Entrypoint: parse Proc from string. */
Proc psProc(const char *str)
{
  YY_BUFFER_STATE buf;
  mylang__init_lexer(0);
  buf = mylang__scan_string(str);
  int result = yyparse();
  mylang__delete_buffer(buf);
  if (result)
  { /* Failure */
    return 0;
  }
  else
  { /* Success */
    return YY_RESULT_Proc_;
  }
}

I successfully call this method, and then called free() on the returned pointer. but valgrind says there is 16KB memory leaked.

Error Leaked 16.1 kiB
Info at malloc
     at mylang_alloc (Lexer.c:2509)
     at mylang__create_buffer (Lexer.c:2047)
     at mylang_restart (Lexer.c:1987)
     at mylang__init_lexer (mylang.l:124)
     at psProc (mylang.y:197)
     at rho_runtime::interpreter::compiler::builder::build_ast (src/interpreter/compiler/builder.rs:15)
     at rho_runtime::interpreter::compiler::test (src/interpreter/compiler/mod.rs:22)
     at rho_runtime::main (src/main.rs:10)
     at core::ops::function::FnOnce::call_once (function.rs:227)
     at std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:125)
     at std::rt::lang_start::{{closure}} (rt.rs:66)
Summary Leaked 16.1 kiB total

The leak is from mylang__create_buffer (yy_create_buffer) line 2047 below.

/** Allocate and initialize an input buffer state.
 * @param file A readable stream.
 * @param size The character buffer size in bytes. When in doubt, use @c YY_BUF_SIZE.
 * 
 * @return the allocated buffer state.
 */
YY_BUFFER_STATE yy_create_buffer  (FILE * file, int  size )
{
    YY_BUFFER_STATE b;
    b = (YY_BUFFER_STATE) yyalloc( sizeof( struct yy_buffer_state )  );  // <--- this line leaks
    if ( ! b )
        YY_FATAL_ERROR( "out of dynamic memory in yy_create_buffer()" );

    b->yy_buf_size = size;
    b->yy_ch_buf = (char *) yyalloc( (yy_size_t) (b->yy_buf_size + 2)  );
    if ( ! b->yy_ch_buf )
        YY_FATAL_ERROR( "out of dynamic memory in yy_create_buffer()" );
    b->yy_is_our_buffer = 1;
    yy_init_buffer( b, file );
    return b;
}

It is strange that psProc() did call mylang__delete_buffer(yy_delete_buffer) .

Is this a bug? Or I used it incorrectly?

wangjia184 commented 3 years ago

I avoid this issue by using the other edition pProc which accepts FILE* as parameter.

Before that, I need convert the C-string into a FILE* by fmemopen(), there is no memory leak any longer.

andreasabel commented 3 years ago

I successfully call this method, and then called free() on the returned pointer.

Maybe you didn't mean this literally, but a simple free(p) won't usually do the job, you need to recursively deallocate the subtrees of p, I suppose.

It is strange that psProc() did call mylang__delete_buffer(yy_delete_buffer) .

This seems to be the correct use pattern, see also the definition of SExpression *getAST(const char *expr) at https://en.wikipedia.org/wiki/GNU_Bison.

(Maybe there is a better reference for using the buffer than Wikipedia, if you happen to find a good tutorial, please link it here.)

wangjia184 commented 3 years ago

Maybe you didn't mean this literally, but a simple free(p) won't usually do the job, you need to recursively deallocate the subtrees of p, I suppose.

Yep, I did that. In my initial test, the AST tree only contains a single root node, And it still says 16KB memory leaked.

The real code traverses AST tree and free all of them. source code is here, C parser is called by Rust via FFI, it is the same how we call it from C. The source code is here: https://github.com/wangjia184/rholang-rust/blob/main/rho-runtime/src/interpreter/compiler/normalize/mod.rs

Let me put it simple

So I convert a char*str to a FILE* file by using fmemopen() to solve it : https://github.com/wangjia184/rholang-rust/blob/main/rho-runtime/src/interpreter/compiler/builder.rs

andreasabel commented 3 years ago

Thanks for getting back to me @wangjia184!
I could confirm the leak. (Could be related to https://westes.github.io/flex/manual/Memory-leak-_002d-16386-bytes-allocated-by-malloc_002e.html#Memory-leak-_002d-16386-bytes-allocated-by-malloc_002e.) However, just calling yylex_destroy as in https://www.javaer101.com/en/article/37968820.html did not do the trick.

The problem is that the generated init_lexer (FILE* inp) calls yyrestart(inp) even if inp is null. (See https://github.com/BNFC/bnfc/commit/73b10de91720d90d98370b7e4ac3a25aa9848a43#r48434527.) yyrestart allocates a buffer if there is none yet. But then we are not reading from a file but allocate the buffer ourselves with yy_scan_string. This seems to just overwrite the existing buffer structure without freeing it first.

Not calling yyrestart if we are not parsing from a file seems to fix the leak.

andreasabel commented 3 years ago

Note to myself: Here is a modified driver StringCalc for the Calc.cf example that parses from a string:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "Parser.h"
#include "Printer.h"
#include "Absyn.h"

void usage(void) {
  printf("usage: Call with one of the following argument combinations:\n");
  printf("\t--help\t\tDisplay this help message.\n");
  printf("\t(text)\t\tParse text verbosely.\n");
  printf("\t-s (text)\tSilent mode. Parse text silently.\n");
}

int main(int argc, char ** argv)
{
  Exp parse_tree;
  int quiet = 0;
  char *text = NULL;

  if (argc > 1) {
    if (strcmp(argv[1], "-s") == 0) {
      quiet = 1;
      if (argc > 2) {
        text = argv[2];
      }
    } else {
      text = argv[1];
    }
  }

  if (!text) {
      text = "1 + 2 * 3";
      /* usage(); */
      /* exit(1); */
  }

  parse_tree = psExp(text);
  if (parse_tree)
  {
    printf("\nParse Successful!\n");
    if (!quiet) {
      printf("\n[Abstract Syntax]\n");
      printf("%s\n\n", showExp(parse_tree));
      printf("[Linearized Tree]\n");
      printf("%s\n\n", printExp(parse_tree));
    }
    return 0;
  }
  return 1;
}

This is the valgrind instruction to show the leak:

valgrind --leak-check=full --show-leak-kinds=definite ./StringCalc 

TODO: add tests that check for memory leaks in the testsuite.

andreasabel commented 3 years ago

@wangjia184: The release of 2.9.2 will probably be in a few months only, so if you rely on this fix, please install the master branch. Thanks for reporting!

Upstream issue (flex): https://github.com/westes/flex/issues/479

andreasabel commented 3 years ago

There are still some space leaks, e.g. in tests

andreasabel commented 3 years ago

Well, dunno, the memory leaks show only (and non-deterministically) when running the testsuite, not when running the tests individually. So, it is not clear whether these are leaks indeed.