llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.02k stars 11.96k forks source link

clang static analyzer, false positive core.NullDereference #59531

Open nbriggs opened 1 year ago

nbriggs commented 1 year ago

scan-build reports a false positive for a NULL dereference in analyzing (attached) badanalysis.c scan-build is using clang-15 on FreeBSD:

$ /usr/local/llvm15/bin/clang --version     
clang version 15.0.6
Target: i386-portbld-freebsd13.1
Thread model: posix
InstalledDir: /usr/local/llvm15/bin
% scan-build clang -c -o /var/tmp/badanalysis.o badanalysis.c
scan-build: Using '/usr/local/llvm15/bin/clang-15' for static analysis
badanalysis.c:18:17: warning: Access to field 'next' results in a dereference of a null pointer (loaded from variable 'pcp') [core.NullDereference]
      pcp->next = cp->next;
      ~~~       ^
1 warning generated.
scan-build: Analysis run complete.
scan-build: 1 bug found.

The analysis seems to be foiled by the procedure call in if (rightone(cp)), if this is replaced by, for example if (cp->v == v) it does not report this NULL dereference.

badanalysis.c.gz

llvmbot commented 1 year ago

@llvm/issue-subscribers-clang-static-analyzer

balazs-benics-sonarsource commented 1 year ago

Yup, rightone() might look like this:

int rightone(struct cursor *c) {
  static struct cursor mycursor;
  cursorlist = &mycursor; // mutates the global!
  return 1;
}

Since the analyzer cannot see the body of that function, it will assume the worst, so we invalidate all assumptions of global variables. This is the intended behavior for such cases. In this case, it leads to a FP, which is very sad. We actually want to apply some heuristics to get rid of such cases, but it's a difficult problem. One can read more about the problem and the proposed solution on the mailing list.

Until we fix/mitigate this I would recommend avoiding the use of global variables.

nbriggs commented 1 year ago

Shouldn't the static declaration of cursorlist let the analyzer deduce, in this case, that a separately compiled function can't be modifying it as you suggested?

balazs-benics-sonarsource commented 1 year ago

For the record, here is the code from the attached archive: https://godbolt.org/z/es14Me1Ms

struct cursor {
    int v;
    cursor *next;
};

extern cursor *cursorlist;
int rightone(cursor *c);

cursor *showfail(int v) {
    cursor *pcp = 0;
    cursor *cp = cursorlist;

    while (cp != 0) {
        if (rightone(cp)) {
            if (cp == cursorlist || pcp == cursorlist) {
                return cp;
            }
            pcp->next = cp->next; // pcp is assumed to be null
            cp->next = cursorlist;
            cursorlist = cp;
            return cp;
        }
        pcp = cp;
        cp = cp->next;
    }
    return 0;
}

Shouldn't the static declaration of cursorlist let the analyzer deduce, in this case, that a separately compiled function can't be modifying it as you suggested? That's right for the most cases, where the static global's address doesn't escape from the translation unit.

However, as soon as we have some code like this:

/* exported function */ cursor *escapeAddress() {
    return cursorlist;
}

We no longer can assume that the static global is only accessible from the current translation unit, even though it is static. But I agree, that we should be able to deduce this, and skip invalidating static global variables unless it could have escaped by any means.