llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.48k stars 11.77k forks source link

[14 regression][clang static analyzer core] Bogus core.uninitialized.Assign #54945

Open LebedevRI opened 2 years ago

LebedevRI commented 2 years ago

https://godbolt.org/z/jb574cz4E

In file included from /home/lebedevri/rawspeed/fuzz/librawspeed/decompressors/HuffmanTable/Solo.cpp:36:
/home/lebedevri/rawspeed/src/librawspeed/decompressors/HuffmanTableTree.h:61:20: warning: Assigned value is garbage or undefined [core.uninitialized.Assign]
      partial.code <<= 1;
                   ^   ~
1 warning generated.
Compiler returned: 0

partial.code can not be garbage or undefined there. It does not really say why it thinks that way.

llvmbot commented 2 years ago

@llvm/issue-subscribers-clang-static-analyzer

haoNoQ commented 2 years ago

This bisects down to 0a17896fe6fd so it looks like some advanced algebra was involved in the proof. It's still on us to explain that algebra to the user when we report the warning.

I'll debug this a bit further to see if the result was correct in the first place.

Also a semi-regular reminder that --analyzer-output text produces explanation, which is basically mandatory for understanding the report. But in this case it doesn't seem to be sufficient either.

haoNoQ commented 2 years ago

So the LHS in our case is

(((derived_$1215 | (conj_$1212 << ((64U - derived_$1213) - 32U))) >> 63U) != 0U

(expression type in the AST is uint16_t, promoted to int before computation) and the RHS is 1 S32b.

And we have constraints:

(derived_$1215 | (conj_$1212 << ((64U - derived_$1213) - 32U))) >> 63U : [0, 0]

So the LHS is simplified to 0 U1b , and then it tries to compute 0 U1b << 1 S32b which yields no result on APSInts:

Process 23916 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x000000010906626b clang`(anonymous namespace)::SimpleSValBuilder::evalBinOpNN(this=0x000000012fbb8c90, state=clang::ento::ProgramStateRef @ 0x00007ff7bfef9e98, op=BO_Shl, lhs=NonLoc @ 0x00007ff7bfef9e00, rhs=NonLoc @ 0x00007ff7bfef9e30, resultTy=QualType @ 0x00007ff7bfef9df8) at SimpleSValBuilder.cpp:493:14
   490
   491          const llvm::APSInt *Result =
   492            BasicVals.evalAPSInt(op, LHSValue, RHSValue);
-> 493          if (!Result)
   494            return UndefinedVal();
   495
   496          return nonloc::ConcreteInt(*Result);
Target 0: (clang) stopped.
(lldb) p Result
(const llvm::APSInt *) $34 = nullptr
(lldb) p LHSValue
(llvm::APSInt) $35 = {
  llvm::APInt = {
    U = {
      VAL = 0
      pVal = 0x0000000000000000
    }
    BitWidth = 1
  }
  IsUnsigned = true
}
(lldb) p RHSValue
(llvm::APSInt) $36 = {
  llvm::APInt = {
    U = {
      VAL = 1
      pVal = 0x0000000000000001
    }
    BitWidth = 32
  }
  IsUnsigned = false
}

Basically it says that bit shift of a 1-bit value by 1 is undefined. Which kind of makes sense from the point of view of BasicValueFactory but makes little sense from the point of view of the original code.

At a glance this looks like a type error to me. The LHS should be S32b, not U1b. Note that the original symbolic expression is a comparison, so it's naturally boolean. But because casts aren't included, we can't really tell that from the original expression alone.

I think one possible solution could be to perform simplification before type promotion. This way if we simplify a type-ambiguous symbolic expression into a compile-time constant we can recover type-correctness. The ideal solution, of course, is to have everything type-correct from the start, but even with a perfect constraint solver we'll still have a lot of explaining to do.