TrustInSoft / tis-interpreter

An interpreter for finding subtle bugs in programs written in standard C
565 stars 28 forks source link

malloc() returns null mode #107

Open kroeckx opened 8 years ago

kroeckx commented 8 years ago

A recent commit said: "prepare malloc-returns-null mode"

Can you describe that a little more?

One of the problems in checking openssl right now is that the error path for malloc() failures never gets checked, while we can check various of the other error cases.

pascal-cuoq commented 8 years ago

As of the latest commit, you should be able to use this mode. Just add the option --malloc-returns-null to an existing tis-interpreter.sh command-line.

You should only add it once the interpreter is already set up and working well in its default “malloc always succeeds” mode. Even so, the new malloc results can cause some more of the target code to be explored, and you may realize that a function is called that wasn't necessary before.

One thing that may happen is that the interpreter will just use up time and memory, and will after a while look like it is only doing the same thing hundreds of times. This is not something that can easily be avoided at the moment, it just means that the test's structure is a bad fit for this mode. Let me illustrate on one example:

#include <stdio.h>
#include <stdlib.h>

int counter;

int test_1(void) {
  int *p = malloc(sizeof(int));
  if (!p) return -1;
  *p = 1;
  printf("%d", *p);
  free(p);
  counter++;
  return 0; // all went well
}

int test_2(void) {
  float *p = malloc(sizeof(float));
  if (!p) return -1;
  *p = 2.0f;
  printf("%f", *p);
  free(p);
  counter++;
  return 0; // all went well
}

int test_3(void) {
  char *p = malloc(sizeof(char));
  if (!p) return -1;
  *p = 'a';
  printf("%c", *p);
  free(p);
  counter++;
  return 0; // all went well
}

int main()
{
  test_1();
  test_2();
  test_3();
}

Analyzing this program in “malloc always succeeds”:

$ tis-interpreter.sh t.c   
[value] Analyzing a complete application starting at main
[value] Computing initial state
[value] Initial state computed

1

2.000000

a
[value] done for function main

Analyzing this program in “malloc returns NULL” mode:

$ tis-interpreter.sh t.c --malloc-returns-null
[value] Analyzing a complete application starting at main
[value] Computing initial state
[value] Initial state computed

1

2.000000

2.000000

a

a

a
[value] done for function main

As you can see, the interpreter gets more and more work to do, because the later tests are invoked for more and more possible past sequences of allocations successes and failures. Some OpenSSL tests are exactly like this: they call functions that can fail, and the functions are properly written to clean up after themselves and return an error code, but the tests do not check these error codes and continue nonetheless. These tests will not do well in tis-interpreter.sh --malloc-returns-null. Just do not run them until a better solution is found.

One solution is to “improve” the tests. Here, that would be:

$ cat t.c
…
int main()
{
  if (test_1() < 0) return 1;
  if (test_2() < 0) return 2;
  if (test_3() < 0) return 3;
}

$ tis-interpreter.sh t.c --malloc-returns-null
[value] Analyzing a complete application starting at main
[value] Computing initial state
[value] Initial state computed

1

2.000000

a
[value] done for function main

About interactions between “malloc returns NULL” mode and the filesystem emulation: it happens to work rather well. Allocation failures in the filesystem code are usually correctly translated to failures of the call that was made, giving the opportunity to test what happens in the target code when these calls fail. I only tested this a little, though, so it may not be perfect yet.

pascal-cuoq commented 8 years ago

I should add that unfortunately, you will not have the satisfaction of finding many new bugs this way, because I already used this mode while it was half-working and reported the bugs it found. This was one example.

Something that should be usable soon next is the detection of memory leaks in conjunction with --malloc-returns-null, and there should be some low-hanging fruit there (memory leaks that only happen when malloc returns NULL some of the time would be difficult to identify any other way, I think).

pascal-cuoq commented 8 years ago

The memory leak detection that I alluded to in my previous comment is ready for testing.

Consider the program below:

#include <stdlib.h>

/* if you need the test to be compilable by C compilers,
   test for TIS_INTERPRETER. */
#ifdef TIS_INTERPRETER
#include <tis-interpreter.h>
#endif

void f(void) {
  char *a = malloc(3);
  char *p = malloc(4);
  if (!p) goto fail;
  char *q = malloc(5);
  if (!q) goto fail;

  /* work work */

  free(q);
  free(p);
 fail:
  free(a);
}

int main(void) {
  f();
#ifdef TIS_INTERPRETER
  tis_show_allocated();
#endif
}

If you compile and execute with Valgrind, Valgrind will tell you that it does not leak memory.

Executing normally with tis-interpreter.sh leads to the same result: the program is fine when all its allocations succeed:

$ tis-interpreter.sh memleak.c 
[value] Analyzing a complete application starting at main
[value] Computing initial state
[value] Initial state computed
memleak.c:27:[value] remaining allocated variables:
[value] done for function main

The nice feature here is that the new function tis_show_allocated() is compatible with the --malloc-returns-null mode (for those tests for which --malloc-returns-null uses reasonable time and memory):

$ tis-interpreter.sh memleak.c --malloc-returns-null 
[value] Analyzing a complete application starting at main
[value] Computing initial state
[value] Initial state computed
memleak.c:27:[value] remaining allocated variables:
memleak.c:27:[value] remaining allocated variables:
          __malloc_f_l11
memleak.c:27:[value] remaining allocated variables:
          __malloc_f_l11_0
memleak.c:27:[value] remaining allocated variables:
[value] done for function main

This is telling you that for some interlacings of malloc successes and failures, there is a memory leak in the program memleak.c! The memory block that failed to be freed was allocated in function f, at line 11 in the file where f is defined.

The program can be fixed as shown:

#include <stdlib.h>

/* if you need the test to be compilable by C compilers,
   test for TIS_INTERPRETER. */
#ifdef TIS_INTERPRETER
#include <tis-interpreter.h>
#endif

void f(void) {
  char *a = malloc(3);
  char *p = malloc(4);
  if (!p) goto fail;
  char *q = malloc(5);
  if (!q) goto fail2;

  /* work work */

  free(q);
 fail2:
  free(p);
 fail:
  free(a);
}

int main(void) {
  f();
#ifdef TIS_INTERPRETER
  tis_show_allocated();
#endif
}

Once fixed, no execution path remains where memory leaks exist:

$ tis-interpreter.sh memleak.c --malloc-returns-null
[value] Analyzing a complete application starting at main
[value] Computing initial state
[value] Initial state computed
memleak.c:28:[value] remaining allocated variables:
memleak.c:28:[value] remaining allocated variables:
[value] done for function main

I decided to use a function tis_show_allocated() so that it's possible to test for memory leaks between two points of the program (just insert a call at each point, and compare the lists of allocated blocks). One thing I particularly dislike about Valgrind is how it encourages developers, just before exiting, to free every allocated block that was alive for the entire execution of the program, making binary code larger and slower for everyone.

Still, I can also make tis-interpreter display the remaining allocated variables at the end of the program, Valgrind-style, if that is more convenient (no intervention in the source code).

pascal-cuoq commented 8 years ago

Here are some additional options that may be useful in conjunction with tis_show_allocated():

Sometimes allocations are made through some generic allocation functions, which makes the name of the fake variable created by malloc uninformative:

$ cat memleak.c
#include <stdlib.h>

/* if you need the test to be compilable by C compilers,
   test for TIS_INTERPRETER. */
#ifdef TIS_INTERPRETER
#include <tis-interpreter.h>
#endif

void *generic_allocation_function(size_t n) {
  return malloc(n);
}

void f(void) {
  char *a = generic_allocation_function(3);
  char *p = generic_allocation_function(4);
  if (!p) goto fail;
  char *q = generic_allocation_function(5);
  if (!q) goto fail;

  /* work work */

  free(q);
  free(p);
 fail:
  free(a);
}

int main(void) {
  f();
#ifdef TIS_INTERPRETER
  tis_show_allocated();
#endif
}

$ tis-interpreter.sh memleak.c --malloc-returns-null  
…
memleak.c:31:[value] remaining allocated variables:
memleak.c:31:[value] remaining allocated variables:
          __malloc_generic_allocation_function_l10_0
memleak.c:31:[value] remaining allocated variables:
          __malloc_generic_allocation_function_l10_1
memleak.c:31:[value] remaining allocated variables:

All allocations are traced back to line 10, inside the generic allocation function, which for a large program would not help in identifying the leak. It is possible to avoid the function generic_allocation_function being identified as the allocation site, but instead its caller, by using the option -val-malloc-functions:

$ tis-interpreter.sh memleak.c --malloc-returns-null -val-malloc-functions generic_allocation_function                      
[value] Analyzing a complete application starting at main
[value] Computing initial state
[value] Initial state computed
memleak.c:31:[value] remaining allocated variables:
memleak.c:31:[value] remaining allocated variables:
          __malloc_f_l15
memleak.c:31:[value] remaining allocated variables:
          __malloc_f_l15_0
memleak.c:31:[value] remaining allocated variables:
[value] done for function main

With this option, again we get the information that the leaking memory block was allocated at line 15 in function f.

Another useful option is -val-show-allocations, which shows the entire call-stack that is current when a new allocation takes place. When the name of a memory block is shown as leaked, it can be looked up in order to understand the circumstances in which it was allocated:

$ tis-interpreter.sh memleak.c --malloc-returns-null -val-show-allocations
[value] Analyzing a complete application starting at main
[value] Computing initial state
[value] Initial state computed
memleak.c:10:[value] allocating variable __malloc_generic_allocation_function_l10 of type char [3]
        stack: malloc :: memleak.c:10 <-
               generic_allocation_function :: memleak.c:14 <-
               f :: memleak.c:29 <-
               main
memleak.c:10:[value] allocating variable __malloc_generic_allocation_function_l10_0 of type char [4]
        stack: malloc :: memleak.c:10 <-
               generic_allocation_function :: memleak.c:15 <-
               f :: memleak.c:29 <-
               main
memleak.c:10:[value] allocating variable __malloc_generic_allocation_function_l10_1 of type char [4]
        stack: malloc :: memleak.c:10 <-
               generic_allocation_function :: memleak.c:15 <-
               f :: memleak.c:29 <-
               main
memleak.c:10:[value] allocating variable __malloc_generic_allocation_function_l10_2 of type char [5]
        stack: malloc :: memleak.c:10 <-
               generic_allocation_function :: memleak.c:17 <-
               f :: memleak.c:29 <-
               main
memleak.c:10:[value] allocating variable __malloc_generic_allocation_function_l10_3 of type char [5]
        stack: malloc :: memleak.c:10 <-
               generic_allocation_function :: memleak.c:17 <-
               f :: memleak.c:29 <-
               main
memleak.c:31:[value] remaining allocated variables:
          __malloc_generic_allocation_function_l10_1
memleak.c:31:[value] remaining allocated variables:
          __malloc_generic_allocation_function_l10_0
memleak.c:31:[value] remaining allocated variables:
memleak.c:31:[value] remaining allocated variables:
[value] done for function main