hedronvision / bazel-compile-commands-extractor

Goal: Enable awesome tooling for Bazel users of the C language family.

Same compile command repeated many times. #106

Closed SoftwareApe closed 1 year ago

SoftwareApe commented 1 year ago

Version used: abdd06e05c7949721dba4bf1ae465bde16b9d3e1

First of all, thank you for providing such an easy to use tool, which also provides nice debugging output that tells you which generated files are missing.

However I noticed the compile_commands.json produced is huge for my project.

Looking at the compile_commands.json there seems to be an issue.

  1. I can look at the arguments and folder and find the same compile_command repeated 400 times. The only thing that's different on every entry is the "file" part of the compile command.
  2. The "file" is some header, that's maybe part of the translation unit, but not the source file in question.
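
For anyone who wants to check their own compile_commands.json for this pattern, here is a minimal Python sketch. It is not part of the tool. The function names are made up for illustration, and it assumes each entry has either an "arguments" list or a "command" string, as the compilation database format allows.

```python
import json
from collections import Counter


def command_key(entry):
    """Identify an entry by everything except its "file" field."""
    # The compilation database format allows either an "arguments" list
    # or a single "command" string, so handle both.
    args = entry.get("arguments")
    command = tuple(args) if args is not None else entry.get("command", "")
    return (entry.get("directory", ""), command)


def count_repeated_commands(entries):
    """Map each distinct (directory, command) pair to its number of entries."""
    return Counter(command_key(entry) for entry in entries)


def load_entries(path="compile_commands.json"):
    """Read the list of entries from a compilation database file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Calling count_repeated_commands(load_entries()).most_common(5) would list the five most-repeated commands along with how many entries share each one.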

cpsauer commented 1 year ago

Hey, Carl! Thanks for trying the tool, for writing in, and for being kind.

We're outputting one command per header because clangd needs them to robustly avoid a variety of error cases, especially some around directly opening header-only libraries. (More details in https://github.com/clangd/clangd/issues/123 if you're really curious. Some discussion also in this section of the README.)

If you don't need or want the entries for the headers, just pass in exclude_headers = "all", as per this section.

Let me know if that resolves things for you--or if there are more things I should know!

Cheers, Chris

SoftwareApe commented 1 year ago

@cpsauer thank you, I indeed didn't see these in the README. I enabled both exclude_external_sources = True and exclude_headers = "all", and it works for me.

load("@hedron_compile_commands//:refresh_compile_commands.bzl", "refresh_compile_commands")
refresh_compile_commands(
    name = "refresh_compile_commands",
    exclude_external_sources = True,
    exclude_headers = "all",
)

cpsauer commented 1 year ago

Sweet! Carl, for my continued learning: what are you using the compile commands for, if not clangd?

SoftwareApe commented 1 year ago

For SonarQube static analysis, so we only need to analyze internal targets, and only one compile command per translation unit.

cpsauer commented 1 year ago

Got it! Thanks. Lots more people are using this for SonarQube than I'd have guessed when I first wrote it!

Heads up that you might sometimes still get multiple commands for the same file if it's compiled multiple ways, e.g., cross compiled for multiple platforms. If that's a meaningful problem, please do lmk.
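
To see whether that applies to a given project, a rough Python sketch like the following could list the files that appear with more than one distinct command. The function name is hypothetical, and it assumes the entries have already been loaded from compile_commands.json with json.load.

```python
from collections import defaultdict


def files_with_multiple_commands(entries):
    """Return {file: number of distinct commands} for files compiled more than one way."""
    commands_by_file = defaultdict(set)
    for entry in entries:
        # Entries carry either an "arguments" list or a "command" string.
        args = entry.get("arguments")
        command = tuple(args) if args is not None else entry.get("command", "")
        commands_by_file[entry["file"]].add(command)
    # Keep only the files that were seen with at least two different commands.
    return {f: len(c) for f, c in commands_by_file.items() if len(c) > 1}
```

An empty result would mean every file has exactly one distinct command, which is the case a one-analysis-per-translation-unit setup hopes for.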

Are there any other needs for use with SonarQube that I should know about? Or any tricky setup things we should document? I'm curious also how you found this tool!

SoftwareApe commented 1 year ago

@cpsauer I found it by Googling for compile_commands.json and Bazel. I found a few tools and this one seemed to get the most references and be the most maintained.

If there are multiple versions of the same translation unit I think that's fine, since they might cause different static analysis results. I only wanted to avoid having the same analysis executed 400 times.

In terms of setup, it was quite easy to get running with the refresh_all command.

For static analysis one thing that could be interesting is producing only the generated files, which worked with this command for us:

bazel build $(bazel query 'filter(".*\.(?:cpp|hpp|inl|c|h|cc|hh|cxx|hxx)$", kind("generated file", //...:*))')

This generates all the C or C++ file targets without running any actual build on them, which is quite neat, so we don't need to wait for a full build to finish. You currently mention running a build for them to exist, but this is more minimal, so the static analysis can start before the build has finished, speeding up our CI a lot.

SoftwareApe commented 1 year ago

@cpsauer small follow-up. I got far enough to get SonarQube to start its analysis. However, I got an issue with symlink loops due to the external folder.

I noticed this folder is created by the compile commands tooling.

>>> Automatically added //external workspace link:

Looking through the code it would be nice to be able to disable this symlinking.

SoftwareApe commented 1 year ago

follow-up follow-up, without the external folder, the sonar-scanner works, but doesn't find some files 🤕

SoftwareApe commented 1 year ago

Replacing the external/ prefix in the compile commands with the fully resolved path of the symlink's target seems to make SonarQube happy enough.

sed -i "s|\"external/|\"$(readlink -f external)/|g" compile_commands.json
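
The same rewrite can be done on the parsed JSON rather than on the raw text. Below is a minimal Python sketch under that assumption. The function name and parameters are made up for illustration. Like the sed pattern, it only touches strings that start with external/, so a combined flag such as -Iexternal/foo is left alone.

```python
def rewrite_external_prefix(entries, new_prefix, old_prefix="external/"):
    """Return a copy of entries with old_prefix replaced at the start of path strings."""

    def fix(value):
        # Only rewrite strings that begin with the prefix, as the sed pattern does.
        if isinstance(value, str) and value.startswith(old_prefix):
            return new_prefix + value[len(old_prefix):]
        return value

    rewritten = []
    for entry in entries:
        new_entry = {}
        for key, value in entry.items():
            if isinstance(value, list):
                # Covers the "arguments" list.
                new_entry[key] = [fix(item) for item in value]
            else:
                # Covers "file", "directory", "command", and similar string fields.
                new_entry[key] = fix(value)
        rewritten.append(new_entry)
    return rewritten
```

To mirror the sed command, new_prefix would be os.path.realpath("external") + "/", and the result would be written back with json.dump.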

cpsauer commented 1 year ago

Thanks a bunch for getting back to me, Carl!

Replying to each in turn.

Thanks again, Chris

SoftwareApe commented 1 year ago

@cpsauer Thank you! I'm still in the process of getting things to work and not 100% there yet, so any pointers are helpful 👍 .

I've noted a link to https://github.com/bazelbuild/bazel/issues/17660 in my code, in case the file generation breaks things in the future. Especially since cross-compilation is likely going to be integrated in the next few months. Right now it seems to work for us. But if in the future there's an issue here at least we know where to look.

Regarding the external link, I don't like this hack either, and I can see how it could create issues. I will ask SonarSource to clarify why their scanner crashes here. I looked at the error message in more detail, and it seems the actual issue is a permissions issue. SonarQube tries to index everything within the workspace, but some files linked through external don't have read permissions. I can't find a setting to exclude these properly in SonarQube though.

SoftwareApe commented 1 year ago

I had a simpler idea for resolving this issue while waiting for a fix from SonarSource. This also keeps the paths the same, and therefore doesn't thrash the cache.

# Create an external symlink outside the indexed directory
ln -s "$(readlink -f external)" ../external
# Replace all references to it in compile_commands.json
sed -i "s|\"external/|\"../external/|g" compile_commands.json
# Remove the symlinked folder from where the sonar-scanner tries to index it
rm -rf external

cpsauer commented 1 year ago

:) Thanks for being great, Carl. You're very welcome; thanks for your persistence and ingenuity and for communicating with SonarSource to make this better for everyone.

[I'm surprised to hear about the lack of read permissions...is there more I should know there?]

SoftwareApe commented 1 year ago

Hi Chris,

interesting update here. Moving external to ../external works, and SonarQube doesn't seem to care so much about read permissions outside the indexed folder. This generated some more log messages regarding the symlink loops.

I'm getting this log here:

skipping symbolic link <project-dir>/../external/glew/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/glew/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/tinyxml2/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/tinyxml2/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/tinyxml2/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/pcl/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/pcl/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/pcl/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/curl/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/curl/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/curl/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/gl/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/gl/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/gl/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/sdl2/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/sdl2/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/sdl2/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/glvnd/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/glvnd/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/glvnd/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/pngpp/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/pngpp/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/pngpp/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/console_bridge/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/console_bridge/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/console_bridge/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/opencv/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/opencv/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/opencv/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/egl/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/egl/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/egl/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/openssl/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/openssl/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/openssl/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/openexr/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/openexr/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/openexr/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/zlib/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/zlib/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/zlib/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/atomic/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/atomic/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/atomic/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/ncurses/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/ncurses/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/ncurses/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/glut/bin/X11/X11/X11/X11/X11/X11 -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/glut/lib/x86_64-linux-gnu/hdf5/openmpi/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
skipping symbolic link <project-dir>/../external/glut/lib/x86_64-linux-gnu/hdf5/serial/lib/lib/lib/lib/lib/lib -- too many levels of symbolic links.
13:08:02.662 WARN: Property 'sonar.cxx.cobertura.reportPaths': cannot find any files matching the Ant pattern(s) '<project-dir>/../**/*-coverage-report.xml'

Indeed all these folders have symbolic links to . which seems kind of loopy to me.
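
A quick way to list such links is to walk the tree without following symlinks and flag every link that resolves to its own parent directory or one of that directory's ancestors. This is only a rough Python sketch. The function name is made up, and it assumes a POSIX filesystem.

```python
import os


def find_looping_symlinks(root):
    """List symlinks under root that resolve to their own directory or an ancestor of it."""
    loops = []
    # followlinks=False keeps os.walk from descending into the loops itself.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        parent = os.path.realpath(dirpath)
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.path.realpath(path)
            # A link such as lib -> . resolves to the directory that contains it.
            if os.path.commonpath([parent, target]) == target:
                loops.append(path)
    return sorted(loops)
```

Running find_looping_symlinks("external") from the workspace root should then report links like the lib and X11 ones in the log above, assuming they really do point back at their own directory.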

Regarding the file permissions there seem to be some files (I'm not sure what these are) that don't have read permissions, e.g.

external/pcl/NX symlinks to /usr/NX, and /usr/NX/var/db/ports seems to have no read permissions for users.

It's just that SonarQube too eagerly tries to read everything without a fallback in case it's not possible. I'm guessing the average non-Bazel project doesn't trigger such corner cases, because usually you can read everything in your source folder.

cpsauer commented 1 year ago

Huh! Bizarre. Indeed feels very unnecessarily loopy. Surprised they're set up that way; I haven't seen structures like that in my repos. I wonder what's setting up those triple-based paths. Perhaps rules_foreign_cc calling some external build system. If you do discover why, I'd be curious, but a SonarQube ignore function--or a softer failure--does seem like the fix for structures like that.