Sarcasm / compdb

The compilation database Swiss army knife
MIT License
292 stars 23 forks source link

C headers treated as C++ #15

Open collinjackson93 opened 5 years ago

collinjackson93 commented 5 years ago

I work in a code base where there is a mixture of C and C++. I noticed that C headers (files that are guarded by extern "C") are being treated like C++ files, which causes tooling like Clang Tidy to apply C++ specific checks to these files. I believe the issue could happen specifically when C headers are included in C++ files.

Sarcasm commented 5 years ago

Not sure but it could be that you have the c++-mode enabled in these headers instead of c-mode.

collinjackson93 commented 5 years ago

Thanks for the quick response! Do you mind elaborating on c-mode and c++-mode? Where would that setting be configured and at what level (build tools, compdb, etc)?

Sarcasm commented 5 years ago

Haha, I'm sorry, I thought this issue was on another project of mine! :smile:

Well, I assume it is not wrong that the header is C++. What does the compile command entry looks like for this file?

Regarding clang-tidy, I think the header file should not be checked, instead the main file should be checked. By "main file", I mean that if the header is foo.h, the main file would be foo.cpp, or a unit test for foo.h if it's only a header, or any other file in the project if this file has no main file. As long as a header is used by a translation unit, and clang-tidy's header-filter is set correctly, then you should not need to check the header file directly.

The header compilation database is more useful to text editors for things like completion I think. But I'm not sure static analysis or refactoring needs it.

collinjackson93 commented 5 years ago

The command looks like this for the header: "command": "/usr/local/bin/clang++ -std=c++11 -Werror -pthread -fPIC -fopenmp -fcolor-diagnostics -O0 -g -DDEBUG -fsanitize=address -march=native -c /home/cjackson/DataStructures/Public/ObjectBuffer.h",

And its associated implementation file looks like this: "command": "/usr/local/bin/clang -std=c11 -Werror -pthread -fPIC -fopenmp -fcolor-diagnostics -O0 -g -DDEBUG -fsanitize=address -fPIC -march=native -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-pedantic -Wno-c99-extensions -Wno-format-non-iso -Wno-covered-switch-default -Wno-switch-enum -Wno-strict-prototypes -Wno-missing-prototypes -Wno-global-constructors -Wno-exit-time-destructors -Wno-vla -Wno-zero-length-array -Wno-undef -Wno-padded -Wno-weak-vtables -Wno-missing-noreturn -Wno-zero-as-null-pointer-constant -Wno-deprecated -Wno-old-style-cast -Wno-reserved-id-macro -Wno-unused-macros -Wno-inconsistent-missing-destructor-override -Wno-reorder -Wno-unused-parameter -Wno-sign-compare -Wno-sign-conversion -Wno-shorten-64-to-32 -Wno-documentation -Wno-documentation-unknown-command -Wno-bad-function-cast -Wno-format-nonliteral -Wno-cast-align -Wno-unreachable-code-break -Wno-unreachable-code-return -Wno-disabled-macro-expansion -Wno-cast-qual -Wno-double-promotion -o DataStructures/CMakeFiles/DataStructures.dir/Private/ObjectBuffer.c.o -c /home/cjackson/DataStructures/Private/ObjectBuffer.c",

And here's the command for one of the CPP files that includes the C header: "command": "/usr/local/bin/clang++ -std=c++11 -Werror -pthread -fPIC -fopenmp -fcolor-diagnostics -O0 -g -DDEBUG -fsanitize=address -march=native -o Infrastructure.Tests/CMakeFiles/Infrastructure.Tests.dir/ObjectBufferTests.cpp.o -c /home/cjackson/Infrastructure.Tests/ObjectBufferTests.cpp",

Based on this one example, it appears that the commands for the CPP file overrode the ones for the header's implementation file. What do you think about giving precedence to the commands that are associated with the implementation file if it can be determined? As a first step, a naive implementation could just look for a file with the same name, but with a different file extension (i.e., .cpp instead of .h)

We actually got benefit from running tidy over headers directly instead of just using the header filter. For one, it found some headers that were missing necessary includes because every implementation file it was included in had those missing includes. Also, we run tidy as part of our CI, but for efficiency, we only run it over files that have changed. When someone changed only a header, we previously weren't able to validate those changes.

Sarcasm commented 5 years ago

What do you think about giving precedence to the commands that are associated with the implementation file if it can be determined? As a first step, a naive implementation could just look for a file with the same name, but with a different file extension (i.e., .cpp instead of .h)

This is already done:

>>> from compdb.complementer.headerdb import score_other_file
>>> score_other_file('/home/cjackson/DataStructures/Public/ObjectBuffer.h', '/home/cjackson/DataStructures/Private/ObjectBuffer.c')
20
>>> score_other_file('/home/cjackson/DataStructures/Public/ObjectBuffer.h', '/home/cjackson/Infrastructure.Tests/ObjectBufferTests.cpp')
10
>>> score_other_file('/home/cjackson/DataStructures/Public/ObjectBuffer.h', '/home/cjackson/DataStructures/Public/ObjectBuffer.c')
70

The last entry is just to show the prevalence of a file in the same directory with just a different extension. So it looks like there is a bug somewhere else. A minimal reproducing example, e.g. a CMake project to clone, could help.

For one, it found some headers that were missing necessary includes because every implementation file it was included in had those missing includes.

A general rule to guard against this is to have the first include in the main file be the header, see https://llvm.org/docs/CodingStandards.html#include-style.

Also, we run tidy as part of our CI, but for efficiency, we only run it over files that have changed. When someone changed only a header, we previously weren't able to validate those changes.

I think it would be more correct to check all the files this header is included-by (directly or transitively), as a change in a header file, might produce clang-tidy warnings in a file using it.