llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.66k stars 11.85k forks source link

False positives related to __builtin_clzll() and std::vector #21276

Open llvmbot opened 10 years ago

llvmbot commented 10 years ago
Bugzilla Link 20902
Version unspecified
OS Linux
Attachments scan-build error report demonstrating the bug, The C++ program (build with "clang test.cc -std=c++11 -lstdc++"). Identical to the "cat test.cc" output in the bug summary.
Reporter LLVM Bugzilla Contributor
CC @belkadan

Extended Description

Running scan-build on this program yeilds a warning about the result of the left-shift expression being undefined, which appears to be spurious, and a note that the 'true' branch of an 'if' is taken, which is definitely invalid. I've attached the scan-build error report. Running the program confirms the obvious: the 'false' branch of the 'if' is taken. The expected output is no bug reports and (optionally) a note that the 'false' branch of the 'if' is taken.

------ Details ------

I created a fresh Ubuntu 14.04.1 install. I then installed the packages clang-3.5 and g++ with apt-get.

wschudy@ubuntu:~$ cat test.cc

include

include

inline int Log2Floor64(unsigned long long n) { return n == 0 ? -1 : 63 ^ __builtin_clzll(n); }

int main(int argc, char** argv) { std::vector xs = {0}; if (xs[0] != 0) { unsigned long long count = 1ull << Log2Floor64(5.0); printf("\"Then\" branch, i.e. xs[0] != 0, with count == %llu\n", count); } else { printf("\"Else\" branch, i.e. xs[0] == 0\n"); } }

wschudy@ubuntu:~$ clang --version Ubuntu clang version 3.5-1ubuntu1 (trunk) (based on LLVM 3.5) Target: x86_64-pc-linux-gnu Thread model: posix wschudy@ubuntu:~$ clang test.cc -std=c++11 -lstdc++ wschudy@ubuntu:~$ ./a.out "Else" branch, i.e. xs[0] == 0 wschudy@ubuntu:~$ scan-build clang test.cc -std=c++11 -lstdc++ scan-build: Using '/usr/bin/clang' for static analysis test.cc:11:37: warning: The result of the '<<' expression is undefined unsigned long long count = 1ull << Log2Floor64(5.0);


1 warning generated.
scan-build: 1 bugs found.
scan-build: Run 'scan-view /tmp/scan-build-2014-09-10-214330-4211-1' to examine bug reports.
llvmbot commented 10 years ago

I just looked at the FAQ and it looks like I misunderstood what the "Note: Taking true branch" message means. I guessed that that message was warning me that scan-build had proved that the false branch was never taken (i.e. dead code). After seeing the example warnings in the FAQ it seems that message is not an independent warning but rather an explanation of under what circumstances it thinks the error about the "<<" being undefined may occur. Therefore never mind what I said about the taking true branch note being wrong. Instead make this bug about the error about the "<<" being undefined, which is a false positive. I believe this false positive is caused by scan-build (apparently) not modeling double to unsigned long long conversion (i.e. 5.0 to 5ull), so it can't rule out Log2Floor64 receiving n=0 and returning -1, which in turn would make the left-shift undefined. My evidence for this is if you replace the problematic line with the following three the error message stays the same but uncommenting the assert makes it go away: unsigned long long five = 5.0; // assert(five>0); //uncommenting this line causes error report to go away unsigned long long count = 1ull << Log2Floor64(five);

Now that I (mostly) understand the bug I have four suggestions. These suggestions are mostly independent so you may want to make additional bugs for one or more of them to keep them separate.

  1. Model double to integral type conversion so this false positive goes away.

  2. Please add a page to the "user manual" menu on the webpage describing how to interpret the error reports, including the fact that the notes are statements of the branches taken in the error report, not a statement that the value of the tests can be determined statically. If a user receives multiple correct error reports it's possible to deduce what the error reports mean but a single error report, especially a false positive one, is ambiguous so documentation would help.

  3. The error message says that the result "is undefined", which can be naturally interpreted as a statement that the analyzer has proven that undefined behavior will definitely occur, which is wrong in this case. If the analyzer is uncertain about whether or not undefined behavior is occurring it should say the result "may be undefined". The same suggestion applies to all error messages: make it clear whether the error message is for a certain error or a possible error.

  4. The error message would be a lot more useful if it were more specific about the problem than just the left-shift being undefined. For example state that scan-build deduced that the right-hand operand was bounded between -1 and 63 inclusive (or whatever scan-build actually deduced), which contains value(s) below the allowed range 0 to 63 inclusive.

belkadan commented 10 years ago

The analyzer isn't actually running your program, so /everything/ it prints is based on its model. Each path note, including those for branches, is meant to explain the analyzer's assumptions at each point---the analyzer /is/ taking that branch, even though the real program can't.

llvmbot commented 10 years ago

Jordan:

  1. AIUI C++ does narrowing conversions automatically in many contexts. (This is a very questionable design decision but we have to live with it.) I'm not a language lawyer but I guess that clang is correct to allow this code. Anyway replacing "5.0" with the explicit cast "static_cast(5.0)" does not change the diagnostics produced so if clang is accepting this code incorrectly that's a separate bug.

  2. It's OK for clang static analyzer to stay silent or give ambiguous warnings because it isn't modeling vector and clzll. It's not OK for it to give diagnostics that confidently state something, i.e. which branch of an 'if' is taken, that is incorrect.

belkadan commented 10 years ago

I'm not even sure how that's compiling, because 5.0 isn't an unsigned long long. But yes, the analyzer does not model std::vector right now, and it probably isn't modeling clzll either. The latter would be pretty easy to fix, though.

llvmbot commented 10 years ago

assigned to @tkremenek