hedronvision / bazel-compile-commands-extractor

Goal: Enable awesome tooling for Bazel users of the C language family.
Other
689 stars 114 forks source link

Improve support of proprietary compilers #90

Closed Warchant closed 1 year ago

Warchant commented 1 year ago

This tool does not work with proprietary compilers which do not have gcc-compatible CLI interface. For now I made my private fork with "hardcoded" types of fixes. In this issue I will enumerate my changes:

  1. Here, -o or /Fo are expected. But one of compilers I tried uses --output_file=something/file.o. Other -fe something/file.o. Maybe potential solution is to give users ability to define output_flag="-o" value in bazel rule, then use it to search for output flag?
  2. Here because of "else" condition, all proprietary compilers try to run _get_headers_gcc and will obviously fail, because they won't have gcc-compatible flags. I excluded this call during my tests, not sure why it is required.
  3. Here you utilize some bazel internals to get the filename. More portable solution is to search for an argument which arg.endswith(_get_files.source_extensions). If this returns 1 candidate, then we found a file. Else, fallback to current approach, or fail.
  4. Here comment says that you output one entry per header. And I see _get_headers_gcc is necessary just to get a list of headers for this deduplication. In my solution I check entries for uniqueness here by primary key (directory, file). It does not require additional .d file generation and parsing, so it eliminates the need for _get_headers_gcc. It could be that I misunderstood the purpose of _get_headers_gcc...

With these points fixed and with support of #88, the tool works for multiple proprietary compilers.

cpsauer commented 1 year ago

Hey, @Warchant. Thanks for writing in again.

Sorry I'm a little slower than usual in getting back. I took some time off for the Thanksgiving holiday and am scrambling to catch back up. (If new, Thanksgiving is a day of being thankful for things one might otherwise take for granted. We were all humbled reading your reply about life in Kharkiv. Peace, electricity, freedom...all wonderful things, too easy to take for granted. Thank you for taking the time to share; our hearts are with you.)

Anyway, back to trying to help!

I'd like to understand a bit better about what you're aiming to do here, one step up the backtrace of your goals. The reason I'm asking is that I'm not sure that clangd is going to do what you want here, since it only really understands gcc/msvc flags. So I better understand, can I ask what you're using this for, what these compilers are, etc? Basically, I'd like to figure out if and how we can best support your goal here.

In the meantime, I'll try to add some context for each in turn:

  1. I'm assuming it logged a warning asking you to file an issue. Thanks for doing so! [More explanation in 2.]
  2. all this _get_headers stuff is to work around Clangd not determining header flags from source inclusion during indexing. Without augmenting compile_commands.json with headers, clangd has a tendency to get language wrong in headers in various edge cases, especially when there are multiple platforms/languages in the project and with header-only libraries or system libraries. So we enrich the compilation database with header info by default, even though this wasn't the original intent of the compile_commands.json spec. This is what all the references to https://github.com/clangd/clangd/issues/123 in the comments are about. If you don't need the tool to identify headers for your use case, set exclude_headers = "all", as in the README, and it won't run the msvc/clang args-format dependent header extraction--no source changes required. Conversely, linking back to (1), it's not clear to me that it makes sense to support header extraction for proprietary flag schemes that clangd won't support, but if it does, I'd be open to merging in cases for them. One thing we probably shouldn't do, though, is throw an error in the default case, since lots of toolchains have the annoying habit of wrapping clang in scripts that we need to call through.
  3. I wrote it that way originally actually, but there's a trap: Some platforms have include directories with cpp extensions. Sad times, I know. For more context, see https://github.com/hedronvision/bazel-compile-commands-extractor/pull/37. (Note that the commit was actually merged; GitHub just didn't pick it up.) But if you have ideas on how to make it more general, I'm definitely interested in merging them in. Also, there's more context on this issue and some other possible approaches in the comment just above.
  4. I think we'd better discuss this one based on your your responses and after you read the above, since I'm not quite understanding. But to note: We expect the aquery calls to only return compile commands for source files. Sometimes we'll have multiple commands per source file if it's compiled multiple ways, and we're carrying that into compile_commands.json just in case clangd wants to make intelligent use of it someday. Currently it just picks one command per file. (See #4). The suppression of duplicates for headers is to stop the workaround described in (2) from leading to massive compile_commands, since lots of system headers are (transitively) included by almost all source files.

Cheers, Chris

Warchant commented 1 year ago

I'd like to understand a bit better about what you're aiming to do here, one step up the backtrace of your goals.

My goal is to just generate compile_commands.json, without an intention of running clangd on it. I tried:

I would LOVE to run clangd, clang-tidy and related tools... But at this point I need it for TICS.

All of that is proprietary :(

cpsauer commented 1 year ago

Got it! Thanks. Can't say I've used most of those, so I extra appreciate your linking me to them.

Bummer that they aren't (all) flag compatible with MSVC/gcc. If they were close, I'd propose that we add code to massage their flags to be clang compatible so you could run those tools, but it sounds like there are too many and/or the clang tools aren't enough of a goal?

cpsauer commented 1 year ago

I'm guessing you don't need compile commands entries for headers for the static analysis tool?

If that's right, then I'd propose we resolve this as follows, so we can get you everything you want while keeping you on the mainline and continue to collaborate and get you future fixes:

cpsauer commented 1 year ago

@Warchant, please do let me know what you think of that approach when you get a chance! (I know you've got, ahem, a lot going on around you, so no pressure--just thought of you while I was tweaking my fix to your other issue.)

cpsauer commented 1 year ago

Hey @Warchant--haven't heard from you in a while. I'm hoping you're OK!

I was about to implement the workaround for (3) that I'd proposed above, but the directory check isn't actually sufficient, since the arg with a source extension can be generated directory and therefore not yet exist. Indeed that's exactly the example in #37. My bad.

So I've implemented exactly your original suggestion. Hopefully that plus exclude_headers = "all" gets this working for your use case. If you want the clang tools to work, the next step would be to add _platform_patch functions for your compilers, massaging their flags into clang compatible ones that the tooling can understand.

I'm going to close this for now, since I haven't heard from you in a good while, but I can open this right back up if you reply and want it.