Description of the problem
The current implementation of hash_inputs() does not always detect whether the code to be analyzed has changed. The issue seems to trace back to how #24 modified the hash_inputs() function.
For example, consider the following clang-tidy invocation (using a compilation database):
clang-tidy -p build_dir foo-lib/foo.cpp
Upon parsing the compilation database to extract the compiler arguments, the compiler command line would look like:
hash_inputs will then calculate the hash based on the modification time of the file foo-lib/foo.cpp. If the headers included by that file have changed since the last run, this does not affect the computed hash, and clang-tidy-cache will use the cached result if one is available.
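To make the failure mode concrete, here is a minimal sketch of it. It is not the actual hash_inputs() code: mtime_hash() and demo_stale_hash() are hypothetical names, and the sketch only assumes, as described above, that the hash depends on the source file's modification time and not on the headers it includes.

```python
import hashlib
import os
import tempfile


def mtime_hash(source_path):
    """Hypothetical stand-in: hash only the source path and its mtime."""
    digest = hashlib.sha1()
    digest.update(source_path.encode())
    digest.update(str(os.stat(source_path).st_mtime_ns).encode())
    return digest.hexdigest()


def demo_stale_hash():
    """Return True if the hash is unchanged after editing only the header."""
    with tempfile.TemporaryDirectory() as tmp:
        header = os.path.join(tmp, "foo.h")
        source = os.path.join(tmp, "foo.cpp")
        with open(header, "w") as f:
            f.write("int answer();\n")
        with open(source, "w") as f:
            f.write('#include "foo.h"\n')

        before = mtime_hash(source)
        # Change only the header; foo.cpp itself is left untouched.
        with open(header, "w") as f:
            f.write("long answer();\n")
        after = mtime_hash(source)
        return before == after
```

Under these assumptions demo_stale_hash() reports that the two hashes are equal, so a cache keyed on such a hash would return the old result even though the translation unit has changed.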
Discussion
Prior to #24, the compiler command line was actually executed to collect the output of the preprocessor (which was then filtered using adjust_chunks()). The advantage of this approach is that any change in the translation unit leads to a different hash, and I assume this is the desired behavior.
Note that #24 was motivated by issues observed in VSCode+Ninja, but it is not clear (to me at least) what the real bottleneck was.
Would it be possible to actually revert these changes to avoid the issue?
Thoughts:
#24 introduced a better way to find the location of the .clang-tidy file via a dedicated argument (--directories_with_clang_tidy=), and this should probably be retained. That makes it possible to simplify the parsing of the compiler command output.
During the parsing of the compiler arguments, -c is replaced with -E. Should this be -E -P to automatically remove linemarkers (# lines)? The output would then no longer need to be parsed line by line and could be used directly to update the hash. This might improve performance, which was the original issue in #24.
Using -E -P instead of -E to bypass calling adjust_chunks() on the compiler output also relates to one of the items in #23, as it makes it possible to avoid touching the compiler output.
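The -E -P idea could be sketched roughly as follows. This is only an illustration under assumptions, not the project's code: to_preprocess_args(), hash_stream() and hash_translation_unit() are hypothetical names, SHA-1 is an arbitrary choice of digest, and dropping "-o <file>" is my own addition so that the preprocessor output goes to stdout instead of a file.

```python
import hashlib
import subprocess


def to_preprocess_args(compiler_args):
    """Replace -c with -E -P and drop '-o <file>' (hypothetical helper)."""
    result = []
    skip_next = False
    for arg in compiler_args:
        if skip_next:
            # This is the file name that followed -o; drop it as well.
            skip_next = False
            continue
        if arg == "-c":
            # -E: preprocess only; -P: do not emit linemarkers.
            result.extend(["-E", "-P"])
        elif arg == "-o":
            skip_next = True
        else:
            result.append(arg)
    return result


def hash_stream(stream, chunk_size=65536):
    """Feed a binary stream into the hash in chunks, without line parsing."""
    digest = hashlib.sha1()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def hash_translation_unit(compiler_args):
    """Run the preprocessor and hash its stdout directly (needs a compiler)."""
    proc = subprocess.Popen(to_preprocess_args(compiler_args),
                            stdout=subprocess.PIPE)
    result = hash_stream(proc.stdout)
    proc.wait()
    return result
```

Because the output is hashed as raw chunks, no per-line filtering step is needed. Whether this is actually faster than the previous approach would still have to be measured.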
I can open a PR, but I first wanted to discuss the issue, as this is essentially asking to revert a big part of #24.