Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

scan-build: Single bug in a header file reported multiple times #16808

Open Quuxplusone opened 11 years ago

Quuxplusone commented 11 years ago
Bugzilla Link PR16809
Status NEW
Importance P normal
Reported by Filip Bártek (filip.bartek-spam@hotmail.com)
Reported on 2013-08-06 12:55:51 -0700
Last modified on 2015-07-13 09:59:21 -0700
Version trunk
Hardware PC Linux
CC filip.bartek-spam@hotmail.com, filip.bartek@hotmail.com, ganna@apple.com, jrose@belkadan.com, llvm-bugs@lists.llvm.org
Fixed by commit(s)
Attachments multireport-fastjet.tar.gz (265041 bytes, application/x-gzip)
Blocks
Blocked by
See also
Created attachment 10993
Scripts that reproduce the behavior; supplementary scripts; results of my test
run

Overview:
scan-build reports (in the HTML report) a single occurence of a bug several
times when run on fastjet v3.0.4.

Steps to Reproduce:
1) Download and extract fastjet v3.0.4:
   http://fastjet.fr/repo/fastjet-3.0.4.tar.gz
   http://fastjet.fr/
2) cd fastjet3.0.4
3) scan-build ./configure --enable-debug
4) scan-build -enable-checker alpha.cplusplus.VirtualCall make
Or you can simply execute the script "report.sh" after extracting the attached
archive. Inspect the script for more detailed steps.

Actual Results:
The bug VirtualCall in file "ClusterSequenceStructure.hh" on line 69 reported 9
times in the HTML report.
See "report-make/2013-08-06-1" in the attached archive.

Expected Results:
The bug VirtualCall in file "ClusterSequenceStructure.hh" on line 69 reported
only once in the HTML report.

Build Date & Platform:
2013-08-06
Linux 2.6.32-358.14.1.el6.x86_64 #1 SMP Wed Jul 17 08:30:19 CEST 2013 x86_64
x86_64 x86_64 GNU/Linux

Additional Information:

(Note that all the following paths are relative to the root directory of the
attached archive "multireport-fastjet.tar.gz".)

The bug is detected by the checker "alpha.cplusplus.VirtualCall".
The bug is detected in the file
"./fastjet3.0.4/include/fastjet/ClusterSequenceStructure.hh", in the definition
of function "ClusterSequenceStructure::ClusterSequenceStructure(const
ClusterSequence *)", on line 69.

The bug occurence is reported 9 times in total.
The bug is detected in a header file that is (indirectly) #included in more
than 9 of the analyzed ("scan-built") files.
The bug is reported 51 times (that is more than 9 times) in the error output of
the respective scan-build call. (Run "./count.sh" to see the number of
occurences in "run.log".)

Note that I tried implanting a non-alpha bug in the same place that the
VirtualCall bug is detected (i.e. file
"./fastjet3.0.4/include/fastjet/ClusterSequenceStructure.hh", definition of
function "ClusterSequenceStructure::ClusterSequenceStructure(const
ClusterSequence *)") and it was reported the same number of times. (Note that I
haven't attached the scripts that reproduce this behavior.)

The attached archive "multireport-fastjet.tar.gz" contains the following:
report.sh : Reproduce the behavior; requires scan-build, clang
setup-clang.sh : Setup clang and scan-build (from trunk); requires: svn, cmake
run.sh : Run setup-clang.sh and report.sh
run-log.sh : Run run.sh with logging to the file "run.log"
run.log : Log from my test run
report-make : HTML report from my test run
count.sh : Count the number of occurences of the bug in "run.log"
Quuxplusone commented 11 years ago

Attached multireport-fastjet.tar.gz (265041 bytes, application/x-gzip): Scripts that reproduce the behavior; supplementary scripts; results of my test run

Quuxplusone commented 11 years ago
This is a result of the analyzer running on a per-translation-unit basis --
there's no information shared across translation units, except for the index
scan-build creates at the end of execution. That means it doesn't actually know
that these issues are "the same".

This is a contentious point: in your case, the errors really are in the header
file, but in many cases the issues are the caller's fault, such as in this
example:

// foo.h
void require(bool *ptr) {
  if (!*ptr)
    abort();
}

// foo.cpp
void testA() {
  require(NULL); // null pointer dereference, caller's fault
}

void testB(bool *ptr) {
  if (ptr) // null check, but backwards!
    return;
  require(ptr); // null pointer dereference, caller's fault
}

There is some support for reporting the issues in the callers rather than in
the header with the hidden analyzer option "report-in-main-source-file=true",
but that won't help your current case. The CmpRuns.py script in the Clang
repository has some support for issue uniquing, but it hasn't been integrated
into scan-build.
Quuxplusone commented 11 years ago

scan-build runs the method ScanFile on each report HTML file. The method (when called repeatedly) removes the files that are redundant by their binary content (or its MD5 hash, to be more precise). (I suppose that this is the process that reduces the number of reports from 51 to 9.) However, the 9 HTML reports in question are not exact copies (although they refer to the same bug). They differ in the spelling they use to describe the path to the source file in question ("ClusterSequenceStructure.hh"). For example:

$ diff report-dc3ace.html report-3ed777.html 4c4 < ./../include/fastjet/ClusterSequenceStructure.hh

./../../include/fastjet/ClusterSequenceStructure.hh

64c64 <

74c74 < File:/afs/cern.ch/user/f/fbartek/sas/scanbuild/fastjet-report/fastjet-3.0.4/src/./../include/fastjet/ClusterSequenceStructure.hh

File:/afs/cern.ch/user/f/fbartek/sas/scanbuild/fastjet-report/fastjet-3.0.4/plugins/SISCone/./../../include/fastjet/ClusterSequenceStructure.hh

It looks like the HTML reports are generated by clang -analyzer-output=html called from scan-build.

I think that a good solution would be cancelling out the "." and ".." parts in the source file paths before they are printed to the HTML reports in clang -analyzer-output=html.

Quuxplusone commented 10 years ago
Would canceling out the "." and ".." unite the reports that are not united in
this case?
As Jordan points out, there are expected cases where we would issue multiple
reports in the same header file.
Quuxplusone commented 9 years ago

Yes, I believe that canceling out the "." and (especially) ".." would prevent the duplicity.