xr0-org / xr0

The Xr0 Verifier for C
https://xr0.dev
Apache License 2.0
173 stars 4 forks source link

leaks not caught where there is a parameter reference #48

Closed akiarie closed 3 months ago

akiarie commented 4 months ago

Xr0 currently accepts code like

#include <stdlib.h>

leak(void *p) ~ [ p = malloc(1); ]
{
        p = malloc(1);
}

because the reference checking treats parameters like the return value. It should only accept cases where there is a double pointer, so that the caller would have a reference, like

#include <stdlib.h>

notleak(void **p) ~ [
        setup: p = .clump(1);
        *p = malloc(1);
]{
        *p = malloc(1);
}
akiarie commented 4 months ago

superficial fixing of this messes up setups with mallocs, as in this test

akiarie commented 4 months ago

I think in order to fix this we have to distinguish between caller resources (those created in setup) and local ones. Both clump and heap allocations done in setup are caller resources in that they set requirements for what the caller must be able to access. A malloc that takes place in setup can't be the cause of a leak. So the leak checking should focus only on the locally-allocated regions.

The two places a reference to an allocated region can be placed are in caller memory (clump or heap) or within the return value. So the checking for leaks should loop through all the locally-allocated blocks on the heap and check that they are addressed in one of these places.

akiarie commented 4 months ago

This is not a leak but is regarded as one because we aren't recognising that the reference to the clump could still be accessible to the caller.

#include <stdlib.h>

struct composite {
    void *p;
};

notleak(struct composite *p) ~ [
        setup: p = .clump(1);
        p->p = malloc(1);
]{
        p->p = malloc(1);
    p = NULL;
}
akiarie commented 4 months ago

To summarise, the basic idea with checking for leaks is that we track locally-allocated regions on the heap and insist that they remain accessible. A region is accessible if and only if we can find a path to it from the return value, the clump or the caller-heap (regions of the heap allocated in setup).

So the main changes are:

  1. The caller-heap regions should be ignored during leak checking
  2. Parameters should similarly be ignored together with everything on the stack
  3. All the blocks in the caller's memory (the clump and the caller-heap) should be checked for references, just like the return value. If there is a reference somewhere in the caller's memory then it's not a leak because the caller can "see" this reference.

(3.) may be the most subtle here. I think it's generally true though, even in cases where the side-effect of calling a function may be to create a leak, e.g.:

#include <stdlib.h>

struct composite {
    void *p;
};

allocsub(struct composite *p) ~ [
        setup: p = .clump(1);
        p->p = malloc(1);
]{
        p->p = malloc(1);
}

f()
{
       struct composite *c;

       c = malloc(sizeof(struct composite));
       c->p = malloc(1);
       allocsub(c); /* leaks region addressed by c->p */
       free(c->p);
       free(c);
}

because there could be another reference within the caller to c->p prior to the call to allocsub.