llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.98k stars 11.94k forks source link

[analyzer] Crash using `clang_analyzer_explain()` in the `debug.ExprInspection` checker #57270

Open isuckatcs opened 2 years ago

isuckatcs commented 2 years ago

The following snippet causes the Static Analyzer to fail an assertion in debug mode, or crash in release mode.

struct S {
    static int a;
    ~S(){};
};
int S::a = 0;

void foo() {
    S::a = 0;

    int x = 3;
    memset(&x, 1, sizeof(x));

    S *arr = new S[x];
    delete[] arr;

    clang_analyzer_explain(S::a);
}

memset sets the value of S::a to derived_$12{conj_$8{int, LC1, no stmt, #1},a}, and later when clang_analyzer_explain() tries to explain the value, it encounters a nullptr which it dereferences.

conj_$8{int, LC1, no stmt, #1} doesn't contain any statement, so when SValExplainer::VisitSymbolConjured() calls SValExplainer::printStmt(const Stmt *S) it passes a nullptr to it as an argument. The called function assumes it always receives a valid pointer, and it simply calls S->printPretty() without any validation, and this is when the crash happens.

For more information please see godbolt.

llvmbot commented 2 years ago

@llvm/issue-subscribers-clang-static-analyzer

llvmbot commented 2 years ago

@llvm/issue-subscribers-good-first-issue

haoNoQ commented 2 years ago

Interesting, I think it's a bigger problem than just the debug checker. Conjured symbols with no statements are basically indistinguishable from each other, so it sounds like multiple consecutive memsets will invalidate with the same symbol, which is incorrect.

We definitely have a few other situations where such symbols show up (most notably, conservatively evaluated destructors) and that's definitely a problem because we simply don't have a better symbol to represent them (see also https://discourse.llvm.org/t/memory-region-invalidation-tracking-improvements/62432), but in this case there's no reason why CStringChecker woundn't simply pass the call-expression of memset to the symbol. So this one's probably an easy fix.

chaitanyav commented 1 year ago

@haoNoQ @isuckatcs is this issue open to work?.

isuckatcs commented 1 year ago

I guess it is. My desired solution would be to fix SValExplainer so that it can handle conjured symbols with no statement. After that the issue with CStringChecker can also be fixed. However, since conjured symbols with no statement can come from multiple sources, SValExplainer needs to be fixed regardless.

haoNoQ commented 1 year ago

We probably shouldn't aim for conjured symbols with null statement as a permanent solution, because that's not a correct solution; we need a way to properly discriminate between these symbols through, we can't be agglutinating them when they are coming from different sources each of which is a null statement.

At the very least, we should try replacing statement pointer with CFGElementRef; that's a generalized notion of a statement and it comes in handy because it always exists. (The discourse thread I linked above had much better approaches, but it sounds like it's stuck.)

chaitanyav commented 1 year ago

@krishenm94 do you want to work on this issue?

Unique-Usman commented 7 months ago

@chaitanyav @isuckatcs , I am a newbie LLVM compiler, is this something which I can work on ? Thank you.