bazelbuild / rules_cc

C++ Rules for Bazel
https://bazel.build
Apache License 2.0
189 stars 94 forks source link

Add binary link deps attribute to cc_library #194

Open tpudlik opened 1 year ago

tpudlik commented 1 year ago

Description of the problem / feature request:

I would like Bazel to provide a solution to the "linktime transitive dependency" problem. This is a problem commonly encountered in implementing low-level libraries like logging or assertions as part of larger middleware projects like Pigweed. It's described in detail in the next section.

The proposed solution (described in more detail below, too) is to introduce a label-valued binary_link_deps attribute on cc_library, with the following semantics: Whenever a cc_binary transitively depends on a library with binary_link_deps = [":sometarget"], add ":sometarget" to the link_extra_lib of that cc_binary.

Feature requests: what underlying problem are you trying to solve with this feature?

Problem description

Low-level libraries like Pigweed or Abseil define assertion and logging primitives. This naturally leads to dependency cycles: you would like to use string utilities, IO abstractions, and other libraries in implementing logging and assertions; but the string utilities, IO abstractions, etc, themselves invoke logging and assertions in their implementation. How to have your cake (use assertions/logging throughout your project) and have it, too (use your project's libraries in implementing assertions/logging)?

Specific example of a dependency cycle from Pigweed (the names are target labels and arrows deps relations):

pw_assert -> pw_assert_basic -> pw_assert_basic_impl -> pw_assert_basic_handler -> //pw_string:builder -+
      ^                                                                                                 |
      |                                                                                                 |
      +-------------------------------------------------------------------------------------------------+

One solution to this problem is to split up the assertion API (headers) and assertion implementation (source files) into separate build targets, and delete the dependency edge between them:

pw_assert -> pw_assert_basic    pw_assert_basic_impl -> pw_assert_basic_handler -> //pw_string:builder -+
      ^                                                                                                 |
      |                                                                                                 |
      +-------------------------------------------------------------------------------------------------+

There is now no cycle, but you have a different problem: when building a cc_binary that transitively depends on pw_assert, you will be missing the definitions in the source files unless you add a dep on pw_assert_basic_impl somewhere. So, your binary will compile, but will fail to link.

Workarounds

There are some workarounds for this problem, but they're not satisfactory:

  1. You can add deps on pw_assert_basic_impl manually to any cc_binary or cc_test targets. This is not great: the linker failure is fairly obscure, and this is certain to regularly trip up developers, especially ones new to the project.
  2. You can add pw_assert_basic_impl to link_extra_lib. In practice you'd probably set @bazel_tools//tools/cpp:link_extra_lib in bazelrc. This is better, but you'll end up linking the implementation library even when building targets that don't need it (have no dep on the assertion library at all). This is especially debilitating in a large monorepo.

Impact

Solving this problem would be very useful to Pigweed; https://issues.pigweed.dev/234877642 is the tracking issue on our side.

But my understanding is that the same problem affects Abseil. Abseil has a similar problem to Pigweed: they define a logging library, but that library has dependencies on other parts of Abseil that themselves want to do logging. They use a separate macro, ABSL_INTERNAL_LOG, in other Abseil modules to avoid a cyclic dependency. The function actually called by that macro is registered by setting a pointer; by default it points to the minimal RawLog implementation, but that can be overridden if a richer log implementation is available at link time. But how does one ensure that the richer implementation is, in fact, available? This is left to the user to ensure, by calling RegisterInternalLogFunction in a dependency of their binary and test targets.

More broadly, this could be useful in any collection of libraries that implements low-level utility routines (like logging and assertions) to be used in other parts of the collection.

Proposed solution

Introduce a label-valued binary_link_deps attribute on cc_library, with the following semantics: Whenever a cc_binary transitively depends on a library with binary_link_deps = [":sometarget"], add ":sometarget" to the link_extra_lib of that cc_binary.

This resolves the problem: instead of just removing the dependency edge between pw_assert_basic and pw_assert_basic_impl (the header target and the source target), we replace it with the binary_link_deps relation. This does not make pw_assert_basic_impl a dependency of pw_assert_basic, and so does not produce a cycle. But it does ensure that any binary that depends on pw_assert_basic will be linked with pw_assert_basic_impl.

This proposal is for illustrative purposes only. I've not prototyped it and am certainly open to other solutions!

lberki commented 1 year ago

cc @buildbreaker2021 @comius

lberki commented 1 year ago

FWIW, the proposed solution does not break the dependency cycle because you the proposed binary_link_deps attribute could go two ways:

  1. It could be of the type LABEL. In that case, it would create a dependency edge, and therefore you'd have the very same dependency cycle.
  2. It could be of the type NODEP_LABEL. In that case, there would be no way to create a dependency edge from the eventual cc_binary to the library you want to link: as Bazel is currently, the dependencies of a configured target need to be determined before any of them is analyzed [a], which poses a problem: the cc_binary would need to know all the link_extra_lib labels from its transitive closure in order to be able to depend on them, but it can't know them before analyzing its transitive closure.

Unfortunately, I no solution to this problem comes to mind immediately: ultimately, you want Bazel to allow a dependency cycle between (e.g.) the logging and string libraries and Bazel doesn't admit circular dependencies. Maybe there is a very tricky way to make this happen, but it's not immediately obvious to me.

[a]: this is only mostly true; config_settings are analyzed before every other dependency so that select() statements can be evaluated, but it's a very special case.