llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.99k stars 11.95k forks source link

[clang][StaticAnalyzer] Missing used uninit warning after incomplete zeroing out #110166

Open antoniofrighetto opened 1 month ago

antoniofrighetto commented 1 month ago

Consider the following C++ code where the memset doesn't fully zero out vec:

#define THREADS 32
#define BLOCKS 2

int main() {
  float *vec = NULL;

  vec = (float *)malloc(BLOCKS * THREADS * sizeof(float));
  memset(vec, 0, BLOCKS * THREADS);

  for (int i = 0; i < THREADS * BLOCKS; ++i)
    vec[i] += i;

  printf("%f %f .. %f\n", vec[0], vec[1], vec[BLOCKS * THREADS - 1]);
  free(vec);
  return 0;
}

Both clang UninitializedValues and StaticAnalyzer checkers on trunk seemingly miss warning the uninitialized value usage.

Reduced version for UninitializedValues that GCC catches (https://godbolt.org/z/zTn6fE3Yb):

int main() {
    float *vec = NULL;
    vec = (float*)malloc(64);
    printf("%f", vec[0]);
    free(vec);
    return 0;
}
llvmbot commented 1 month ago

@llvm/issue-subscribers-clang-static-analyzer

Author: Antonio Frighetto (antoniofrighetto)

Consider the following C++ code where the memset doesn't fully zero out `vec`: ```c++ #define THREADS 32 #define BLOCKS 2 int main() { float *vec = NULL; vec = (float *)malloc(BLOCKS * THREADS * sizeof(float)); memset(vec, 0, BLOCKS * THREADS); for (int i = 0; i < THREADS * BLOCKS; ++i) vec[i] += i; printf("%f %f .. %f\n", vec[0], vec[1], vec[BLOCKS * THREADS - 1]); free(vec); return 0; } ``` Both clang UninitializedValues and StaticAnalyzer checkers on trunk seemingly miss warning the uninitialized value usage. Reduced version for UninitializedValues that GCC catches (https://godbolt.org/z/zTn6fE3Yb): ```c++ int main() { float *vec = NULL; vec = (float*)malloc(64); printf("%f", vec[0]); free(vec); return 0; } ```
steakhal commented 1 month ago

The store model cant represent memsets (default bindings) of common prefixes. So, you cant have two default bindings of different "lengths" binding different values. Here undef after the malloc, and then zero of a smaller prefix. There is no way around this, unless we make the store smarter. We have plans for it but let's not discuss that now.

The second case is a lot easier, and I'm surprised we don't find it already. We don't really handle floatingpoint numbers, so it may relate to that, but I haven't checked the case on godbolt.

pskrgag commented 1 month ago

@antoniofrighetto I believe, you choose wrong label for this bug. Clang static analyzer finds this bug (https://godbolt.org/z/es793bqzW), but clang's frontend does not.

So it should be clang:diagnostics, I believe

steakhal commented 1 month ago

@antoniofrighetto I believe, you choose wrong label for this bug. Clang static analyzer finds this bug (https://godbolt.org/z/es793bqzW), but clang's frontend does not.

So it should be clang:diagnostics, I believe

Half way. I bet we cant find the firdt example. And its because of our store model. We know about this, and at Sonar, we plan to make improvements there. We already have a really early store prototype that is pretty capable, but needs more time to get on par with the current store. Once we have that, we can start upstreaming. But its probably not this year.

steakhal commented 1 month ago

Relates to #43459

antoniofrighetto commented 1 month ago

The store model cant represent memsets (default bindings) of common prefixes. So, you cant have two default bindings of different "lengths" binding different values. Here undef after the malloc, and then zero of a smaller prefix. There is no way around this, unless we make the store smarter.

Independently of the store model currently not being able to handle granular bindings, I think a possible first approach to deal with this could verify that, when the symbolic region is a heap region and the memory buffer being memset'd coincides (when possible to say), warn uninit uses if the two sizes differ. evalMemset already takes into account this when the size is zero, and CheckLocation checks accesses past size within the destination buffer (when turned on). It doesn't seem excessively hard to extend it to consider such an instance, whilst may be still suboptimal for now (maybe cc/ @haoNoQ).

We don't really handle floatingpoint numbers, so it may relate to that, but I haven't checked the case on godbolt.

Doesn't look related to floating point expressions, we still don't issue a warning for the integer ones, but this is orthogonal to analyzer.

Relates to #43459

This looks only partially related with #43459 to me.