hedronvision / bazel-compile-commands-extractor

Goal: Enable awesome tooling for Bazel users of the C language family.

External execroot cache invalidated when generating compile commands #201

Open kgreenek opened 5 months ago

kgreenek commented 5 months ago

Edit: See the comments below, where I dug in a bit deeper and found a potentially broader issue.

Hi! I love this project and have been using it successfully for many years.

Recently I upgraded protobuf, and now I am running into the following issue.

Whenever there is a protobuf C++ target, hedron_compile_commands cannot generate compile_commands.json successfully. It seems to be using the wrong version of protobuf somehow. This is weird because the bazel build succeeds; only the hedron refresh_compile_commands target fails.

I created a minimal example repo with instructions to reproduce the issue here: https://github.com/kgreenek/hedron_pb_bug_repro

I verified that when running the example binary, the protobuf version found at compile time is the expected newer version. So I believe somehow hedron is finding some older headers.

I tried removing all protobuf headers from my system to see if those were being picked up. However, hedron still found the wrong version of protobuf and I saw the same errors.

kgreenek commented 5 months ago

I found a work-around and a clue!

Workaround: After running bazel run //:refresh_compile_commands, you have to run bazel build //... again. This fixes the bazel cache.

It appears running refresh_compile_commands causes some symlinks in the bazel cache to be invalidated somehow.

Some more detail on what I'm seeing:

The printed error occurs because the file google/protobuf/port_def.inc is not being found. I originally thought this might be due to a messed-up include path, but I confirmed that the generated compile_commands.json file has the correct include path to find the port_def.inc file.

The correct include path for that port_def.inc file is: bazel-out/k8-opt/bin/external/com_google_protobuf/src/google/protobuf/_virtual_includes/port_def

The port_def.inc file is part of a cc_library which has a strip_include_prefix argument. See: https://github.com/protocolbuffers/protobuf/blob/main/src/google/protobuf/BUILD.bazel#L252

That is why the include path is a generated _virtual_includes directory with a bunch of symlinks under it.

The symlinks for the _virtual_includes directory are valid after I run bazel build //.... I checked this by running:

ls -l bazel-out/k8-opt/bin/external/com_google_protobuf/src/google/protobuf/_virtual_includes/port_def/google/protobuf/port_def.inc

After I run bazel run //:refresh_compile_commands, I run the ls command again. Then I see that the symlink is broken. It points to a file that no longer exists.

Now I run bazel build //... again. This fixes the cache. After this my editor's autocomplete features work as expected.

So it appears refresh_compile_commands somehow causes the bazel cache to break these symlinks.
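
The single-file ls check above can be widened to a whole directory. This is only a sketch: the function name list_broken_symlinks is made up for illustration, and it assumes a find that supports -exec and a standalone test binary, as on typical Linux and macOS systems.

```shell
# Hypothetical helper: print every symlink under directory $1 whose target
# no longer exists. `find -type l` selects symlinks; `test -e` follows the
# link and fails when the target is missing, so `!` keeps only dangling ones.
list_broken_symlinks() {
  find "$1" -type l ! -exec test -e {} \; -print
}

# Assumed usage (not run here), once after `bazel build //...` and once
# after `bazel run //:refresh_compile_commands`:
#   list_broken_symlinks bazel-out/k8-opt/bin/external/com_google_protobuf
```

If the symptom described above is present, the first run should print nothing and the second should list the links under _virtual_includes.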

kgreenek commented 5 months ago

I dug a bit deeper and found that almost all of the symlinks in the external execroot cache directory are deleted after running bazel run //:refresh_compile_commands. In my case I can see that by checking this directory:

ls /home/kgk/.cache/bazel/_bazel_kgk/ebcadef48e04fd13b9d8d47e7ced60b7/execroot/example/external

Here is what the output looks like before running the hedron command:

bazel_tools      com_google_protobuf      local_config_cc  rules_cc    zlib
com_google_absl  hedron_compile_commands  platforms        utf8_range

And here is the output after:

bazel_tools  hedron_compile_commands  local_config_cc  platforms

I tried this on a larger repository, and sure enough there were hundreds of external dependencies that were missing symlinks after running refresh_compile_commands. It is all fixed by rebuilding.
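
For a larger repository, comparing the two listings by eye gets tedious. A rough sketch of automating the before/after comparison follows; missing_entries and the /tmp file names are made up for illustration, and it assumes `bazel info execution_root` resolves to the same execroot directory shown above.

```shell
# Hypothetical helper: print lines that appear in file $1 but not in file $2.
# grep flags: -F fixed strings, -x whole-line match, -v invert, -f patterns
# from file. `|| true` keeps the exit status zero when nothing is missing.
missing_entries() {
  grep -vxFf "$2" "$1" || true
}

# Assumed usage (not run here):
#   ext="$(bazel info execution_root)/external"
#   ls -1 "$ext" > /tmp/external_before.txt
#   bazel run //:refresh_compile_commands
#   ls -1 "$ext" > /tmp/external_after.txt
#   missing_entries /tmp/external_before.txt /tmp/external_after.txt
```

With the two listings shown above as input, this would print com_google_absl, com_google_protobuf, rules_cc, utf8_range and zlib.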

kgreenek commented 5 months ago

This issue looks related: https://github.com/bazelbuild/bazel/issues/10680

kgreenek commented 5 months ago

I confirmed that if I don't invoke the refresh_compile_commands script with bazel run, then I do not see those errors.

I.E. I do this:

bazel build //...
BUILD_WORKSPACE_DIRECTORY=$PWD ./bazel-bin/refresh_compile_commands

dyng commented 3 months ago

> I confirmed that if I don't invoke the refresh_compile_commands script with bazel run, then I do not see those errors.
>
> I.E. I do this:
>
> bazel build //...
> BUILD_WORKSPACE_DIRECTORY=$PWD ./bazel-bin/refresh_compile_commands

Thank you, you saved my day!

faximan commented 1 week ago

I think this is a dupe of https://github.com/hedronvision/bazel-compile-commands-extractor/issues/140?

My understanding is that when you bazel build //:refresh_compile_commands (which bazel run does before executing the target), the external symlinks under ~/.cache/bazel/_bazel_$USER are pruned to include only the dependencies of that last build, which breaks protobuf and other project dependencies. That would explain why bazel build //... works: it builds both your project and the refresh_compile_commands binary.

This is supposed to be fixed in Bazel 7.1. Unfortunately, my project is stuck on 7.0 for now...
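
If the fix does land in 7.1 as suggested above, a script could decide whether the two-step workaround is still needed by checking the installed version. This is a sketch only: bazel_at_least is a made-up name, and it assumes `bazel --version` prints a line of the form "bazel 7.0.2".

```shell
# Hypothetical helper: succeed if dotted version $1 is at least $2.$3.
# Only the major and minor components are compared.
bazel_at_least() {
  _ver="$1"
  _major="${_ver%%.*}"   # text before the first dot
  _rest="${_ver#*.}"     # text after the first dot
  _minor="${_rest%%.*}"  # text before the next dot
  [ "$_major" -gt "$2" ] && return 0
  [ "$_major" -eq "$2" ] && [ "$_minor" -ge "$3" ]
}

# Assumed usage (not run here):
#   if bazel_at_least "$(bazel --version | awk '{print $2}')" 7 1; then
#     bazel run //:refresh_compile_commands
#   else
#     bazel build //...
#     BUILD_WORKSPACE_DIRECTORY=$PWD ./bazel-bin/refresh_compile_commands
#   fi
```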