bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
23.21k stars 4.06k forks source link

Migrate ObjcProvider linking info to CcLinkingContext #16939

Open googlewalt opened 1 year ago

googlewalt commented 1 year ago

Migrate ObjcProvider linking info to CcLinkingContext

Overview

We are planning to migrate the linking info that is currently in ObjcProvider to the CcLinkingContext in CcInfo. This should reduce Bazel memory consumption for Objective-C builds, reduce technical debt and maintenance cost, and unblock future linking improvements.

Motivation

Objective C/C++ (ObjC) are dialects of C/C++, and their Bazel support should share much of the implementation. Historically, ObjC started as a separate implementation, and led to much bespokeness, missing functionality, duplication of functionality, and maintenance burden. Migrating ObjC to use CcInfo to carry its build information will be a major step toward reducing that bespokeness and maintenance cost.

A previous effort migrated the compilation portion of ObjcProvider to CcInfo. This project completes the migration by migrating the linking portion. There is still a small remaining part of ObjcProvider that does not fit into either category, but migrating the linking info will get us much closer to getting rid of ObjcProvider altogether. See below for what remains in ObjcProvider after this migration.

The migration is expected to provide the following benefits:

Goals

  1. Minimize user disruption. We will provide an incompatible flag for transitioning between old and new behavior, and provide guidance for changes required for migration (though there are no plans for automated tools).

  2. Maintain performance parity. In actuality, we expect memory improvements from more efficient generating of CcLinkingContext in objc_library, and from eventually deleting 2/3 of the fields in ObjcProvider.

  3. Avoid adding bespoke, Objective-C-only functionality to CcInfo.

  4. Avoid behavioral changes. For issues where (3) and (4) conflict, try to resolve in favor of (3) because it gets us closer to the desired end state.

Non-goals

  1. Migrating bespoke ObjC linking functionality. Outside of the information carried by the provider, there is still much bespokeness to how ObjC linking is done. This migration will not attempt to redesign any of that. In particular, we will keep much of CompilationSupport.registerLinkAction and the linking variables in ObjcVariablesExtension.

    Even though this item is out of scope of the current migration, it is in the scope of followup work.

Design

Overview

This section gives a brief overview of how each linking field in ObjcProvider will be migrated. A more detailed discussion on various issues are deferred to the Issues section.

Library fields

ObjcProvider has five fields that store the libraries to be linked. Much of the distinction is not important, and we plan to migrate all of them as LibraryToLinks in the LinkerInputs of the CcLinkingContext.

The difference between LIBRARY, CC_LIBRARY, IMPORTED_LIBRARY is where the library comes from: LIBRARY is from objc_library, CC_LIBRARY is from cc_library, and IMPORTED_LIBRARY is from objc_import. These differences do not affect how they are linked. They are all static libraries, and they can be migrated as LibraryToLinks in a LinkerInput.

STATIC_FRAMEWORK_FILE and DYNAMIC_FRAMEWORK_FILE are user-provided Apple frameworks. They are each converted to a framework path (-F) and framework name (-framework) for linking. We plan to migrate these to be regular LibraryToLinks in LinkerInputs as well. This migration does lead to a couple behavioral changes in regards to link ordering and framework deduplication, which will be discussed below.

SDK-related fields

These are fields used to refer to libraries and frameworks to be linked in from the Apple SDK. They turn into linker flags, and we plan to migrate them as such, to userLinkFlags in LinkerInputs of the CcLinkingContext.

Other linking fields

These linking-related fields all have obvious landing spots in CcLinkingContext:

FORCE_LOAD_LIBRARY corresponds directly to alwaysLink in a LibraryToLink.

LINKOPT corresponds to userLinkFlags of a LinkerInput. This migration would fix a long time bug, where the representation of linker options as strings in ObjcProvider can cause incorrect deduplication. For example, -framework Foo and -framework Bar would get folded into -framework Foo Bar. CCLinkingContext’s LinkerInput does not suffer from this issue.

LINK_INPUTS corresponds to nonCodeInputs of a LinkerInput.

LINKSTAMP was added as a short term hack to propagate linkstamps within an objc_library, and it corresponds to linkstamps of a LinkerInput.

Non-linking fields

The following fields may look like linking fields, but are not related to generic linking and will not be migrated:

State of ObjcProvider after migration

Here are the remaining fields in ObjcProvider once the migration is finished.

Fields that can probably go away:

  • FLAG: This field is used to indicate whether C++ linking is required. We may be able to get this information elsewhere, or implement it some other way.
  • J2OBJC_LIBRARY: See above.
  • MODULE_MAP: This is a legacy field back when we had half-baked support for implicit modules. I think it is no longer useful and can be deleted.
  • UMBRELLA_HEADER: This field is used by (internal-only) j2objc_library, and will either become obsolete or be migrated to a j2objc-specific provider.

Fields that still remain:

  • STRICT_INCLUDE: This field is used by (internal-only) objc_proto_library.
  • SOURCE: This field provides source file information used by IDEs.

Issues

This section describes the major migration issues in detail.

  1. Link ordering

    The migration will not preserve the order in which libraries appear in the link arguments. This is not expected to cause any general problems (but note framework linking below), because Bazel currently does not maintain library link ordering in any meaningful way. In the current ordering, forced linked libraries are separated out while the rest of the libraries are put in a filelist according to some arbitrary iteration order of an ImmutableSet. As a result, link ordering may change from unrelated build graph changes or between different Bazel versions.

    Note that this link ordering works together with how ld64 resolves symbols. Unlike a typical linker on Linux, ld64 always begins searching for a symbol from the first library in the link arguments, rather than starting at the library with the undefined symbol reference. This algorithm makes the ordering of link arguments less important.

    After migration, the link ordering of libraries will still be similarly implemented: forced link libraries are separate from the non-forced link libraries, and they are not guaranteed to be in any order. Eventually, we plan to order the libraries properly, in accordance with the partial order imposed by the library dependencies.

  2. Framework linking

    Bazel supports user-provided frameworks. They come from apple_dynamic_framework_import and apple_static_framework_import rules, are stored as DYNAMIC_FRAMEWORK_FILE and STATIC_FRAMEWORK_FILE in ObjcProvider, and are converted to framework search paths (-F) and framework names (-framework) for linking.

    There is no natural representation for frameworks in CcInfo. Instead, we plan to migrate them to be treated as normal libraries: store them as dynamic or static libraries in LibraryToLink, and link them directly via the full path. This allows the migration to be done without adding any bespoke functionality, but it does introduce a couple changes in behavior that requires build cleanup:

    • The framework method of linking acts as a way of deduplication. If the same framework is in two framework search paths, only the first one found will be used. With the proposed migration, both frameworks would be linked, typically leading to a bunch of duplicate symbol errors.

      In practice, having two frameworks in the same build violates the one-definition rule and is an error that we actually want to detect, so the migration actually gets us the behavior we want. This error should be fixed by removing all but one of the redundant frameworks.

    • The link ordering for user-provided frameworks would change.

      Currently, user-provided frameworks are grouped together and placed near the end of the linker command line, after all the non-framework libraries. So if there are common symbols defined in both a framework and a non-framework library, the non-framework library would be chosen to be linked in.

      After the migration, frameworks are treated like normal libraries and may appear before normal libraries in the link ordering. It thus becomes possible for framework symbols to be linked in, in preference to those from non-framework libraries.

      During internal testing, we have found that some apps would only build correctly if frameworks were specified at the end, due to symbols that appear on both frameworks and non-framework libraries. This issue is pernicious because it manifests as run-time failures or crashes.

      We expect this issue to be worse internally than in Bazel, because Google apps and frameworks live in the same monorepo and may be share sources. Custom frameworks that are built from separate sources and have reasonably named exported symbols should not have this issue.

    One thing the proposed change should not affect is run-time behavior. The -F/-framework flags are strictly for finding libraries during linking. They do not introduce any extra runtime search paths into libraries.

  3. Starlark access to SDK fields

    The following linking related SDK fields would no longer be conveniently accessible via Starlark:

    • SDK_FRAMEWORK
    • WEAK_SDK_FRAMEWORK
    • SDK_DYLIB

    Currently, these fields are directly stored in ObjcProvider as NestedSets, and they have Starlark APIs to access them. In CcInfo, they would be stored as userLinkFlags of individual LinkerInputs. We can write Starlark convenience functions that iterate through the userLinkFlags to compute the values of those fields. Since we expect these fields to be accessed infrequently for tasks related to the final link (as opposed to for every compile or archive action), it should not matter that the convenience functions are less efficient. Reference copies of some of these functions can be found in rules_apple.

    It is possible for the new convenience functions to return more frameworks or libraries than currently. This can happen with explicit -framework/-weak_framework/-l linkopts from cc_library. The current code inspects cc_library dependencies and converts such linkopts to SDK_FRAMEWORKs in ObjcProvider, but it doesn't do that conversion for WEAK_SDK_FRAMEWORKS or SDK_DYLIBS. However, in the typical case where the frameworks and libraries come from the SDKs, the convenience functions after migration should be more complete and correct. Internally, I believe this issue only affects rules that validate the contents of SDKs, where the expected contents need to be updated.

  4. SDK fields deduplication

    In ObjcProvider, each of the SDK fields is stored in its own NestedSet, which has the effect of deduplicating it. Migrating them to linkopts in CcInfo loses that deduplication, and can lead to a single -framework <framework> or -weak_framework <framework> linkopt appearing many times.

    This issue will be addressed as follows:

    • We can implement deduplication in the Starlark convenience function mentioned in (3) easily enough. We can also add deduplication to the native code that does ObjC linking.

    • Internally, we are migrating to a model where SDK frameworks are modeled explicitly as targets in the build graph. A library that depends on an SDK framework needs to put the corresponding SDK framework target in its deps. This replaces the old way of using sdk_framework and weak_sdk_framework attributes to specify those dependencies. In this approach, all the dependencies to a framework would then point back to the one target that represents it, and the -framework linkopt would only be provided once by that target.

    We hope that in the long run, Bazel users can set up something similar by scanning the SDKs and setting up framework targets that propagate the linking info. This solution would not handle weak_sdk_framework deduplication, but they are much less commonly used.

Prerequisites

This section describes some requirements for the migration that need to be addressed prior to the migration. They have all been completed in the recent months.

Allow libraries to have no extension on Apple platforms

User-provided Apple frameworks have paths of the form <path>/<name>.framework/<name>, where has no extension. The C++ APIs did not allow libraries to have no extension.

Proposal: Allow libraries to have no extension. (See commit.)

Excessively long dynamic library symlink paths

Bazel symlinks to all the dynamic libraries in a canonical location to reduce the number of rpaths that are needed on Linux. When migrating some Apple frameworks, this ends up creating file names that are too long.

Proposal: Hash down the excessively long library names. (See commit.)

Augment avoid_deps-related providers with CcInfo

avoid_deps is a Bazel mechanism used by Apple builds to factor out duplicate code between the main app and its dynamic libraries. The dynamic libraries are specified as avoid_deps of the main app, and Bazel "subtracts" their common library dependencies from the main app. In order to do so, several native providers need to contain linking info – so for the migration, they need to be augmented to carry CcInfo. The affected providers are:

We would need to add a CcInfo field and a corresponding accessor function (see commit). To avoid breaking existing Starlark code, we can initially allow CcInfo to be null, then remove that once the Starlark rules are updated.

Migration steps

Native Code Migration

Native code migration. Here are the steps involved in migrating the natve Bazel code.

  1. Implement prerequisite functionality.

  2. Migrate link info definitions. Migrate all rules so that they provide linking info in CcInfo. During this phase, the linking info will be redundantly carried in both CcInfo and ObjcProvider.

    This migration should be relatively straightforward, following the recipe for how individual fields should be migrated. One note: until recently, the objc_library implementation calls the C++ build API that returns a proper CcLinkingContext, but the implementation then throws away that CcLinkingContext and rebuilds one from scratch that (1) is incomplete (2) is inefficiently generated. The migration will properly use the CcLinkingContext from the C++ build API (see commit, which will improve both correctness and performance.

  3. Migrate link info uses. Migrate all rules so that they use the linking info in CcInfo, instead of in ObjcProvider. We will provide an incompatible flag --incompatible_objc_linking_info_migration that controls where builtin rules get their linking info.

  4. After a period of time (1-2 months), delete old old linking info implementation and the --incompatible_objc_linking_info_migration flag.

  5. Delete old linking info. Migrate all rules so that they no longer generate linking info in ObjcProvider.

  6. Delete old linking info APIs. Implement --incompatible_objc_provider_remove_linking_info that disallow usage of old linking info APIs in ObjcProvider, AppleDynamicFrameworkInfo, and AppleExecutableBinaryInfo. After a period of time (1-2 months), delete --incompatible_objc_provider_remove_linking_info and all the APIs it guards.

Bazel Migration

Here is the migration recipe for Bazel users.

Rules Migration

Bazel users need to migrate their custom Starlark rules.

  1. A custom ObjC rule that generates ObjcProvider with linking info needs to generate a CcInfo with the same linking info.

  2. A custom ObjC rule that generates one of the providers used by avoid_deps needs to be modified to propagate an appropriate CcInfo.

    Once (1) and (2) are complete, users can flip --incompatible_objc_linking_info_migration=true.

  3. A custom ObjC rule that uses the linking info in ObjcProvider should migrate to use CcInfo instead. Note some properties of libraries that do not affect how they are linked are no longer preserved in CcInfo (e.g. LIBRARY vs CC_LIBRARY vs IMPORTED_LIBRARY).

    For sdk_frameworks, weak_sdk_frameworks, and sdk_dylibs, they need to be computed from CcInfo in Starlark, as seen in rules_apple.

  4. For cleanup, delete all the linking info in ObjcProvider, as well as the objc field in AppleDynamicFrameworkInfo and AppleExecutableBinaryInfo. Once this step is done, users can flip --incompatible_objc_provider_remove_linking_info=true.

Behavioral changes

Bazel users also need to be aware of behavioral changes that may cause build breakages. The breakages are expected to be from multiply defined symbols that are newly exposed by the migration:

keith commented 1 year ago

User-provided static frameworks will appear alongside regular libraries in the link ordering, instead of always at the end.

note there is one caveat to this which causes issues:

https://github.com/apple/swift/issues/61287

keith commented 1 year ago

We just got hit by the bug I linked above again. I definitely don't like having to depend on linking order, but given that the Apple folks are relying on that for this behavior, I wonder if it makes sense to try to do the same for bazel for precompiled libraries, wdyt @googlewalt

googlewalt commented 1 year ago

Hmm that is quite unfortunate. It seems that this is a bug that Apple needs to address, or at least give clarity to what the issue is. Linking frameworks last does not fully address the issue, because an app could be using some frameworks built with 14.3.1 and some built with 15.0.

(mistakenly replied to the swift thread earlier).

keith commented 1 year ago

Linking frameworks last does not fully address the issue, because an app could be using some frameworks built with 14.3.1 and some built with 15.0.

Yes. but I think this solution is still the best since it's what Xcode is doing, and I think it would be more common to have a vendored framework that was built with an older version of Xcode than one that was built from a newer version than you are currently using

googlewalt commented 1 year ago

I don't love the idea adding code that puts the framework last in link ordering. It makes objc linking more bespoke and gets us further away from the goal of unifying objc support with C++.

One issue that prevents us from even having a workaround for such linking order bugs is that we currently don't make any guarantees about link ordering. What if we fix that to be more like C++, so that a library appears before its dependencies in link ordering. That way, at least we can work around the issue by adding a dependency.

keith commented 1 year ago

What if we fix that to be more like C++, so that a library appears before its dependencies in link ordering. That way, at least we can work around the issue by adding a dependency.

That would definitely fix some cases, but I guess it wouldn't fix all, since your dependency could have a symbol that conflicts with that of another library in an unrelated portion of the tree, and that ordering would still affect things. I'm not sure if it would be an improvement or not since it might be worse if this was more subtle

keith commented 1 year ago

One issue that prevents us from even having a workaround for such linking order bugs is that we currently don't make any guarantees about link ordering.

Yea, I'm surprised that others in normal C++ don't have similar issues because of this? I guess the problem is that when you identify the issue as this you don't have an easy workaround

googlewalt commented 1 year ago

What if we fix that to be more like C++, so that a library appears before its dependencies in link ordering. That way, at least we can work around the issue by adding a dependency.

That would definitely fix some cases, but I guess it wouldn't fix all, since your dependency could have a symbol that conflicts with that of another library in an unrelated portion of the tree, and that ordering would still affect things. I'm not sure if it would be an improvement or not since it might be worse if this was more subtle

I think in general having a symbol defined in multiple places is an ODR violation and an error. It wouldn't be a supported use case regardless, but it would be especially broken w.r.t bazel objc linking because objc linking doesn't guarantee link ordering.

I am not seeing why this suggestion would make the error more subtle. To be clear the workaround might not be pretty -- we'd potentially need to add a depencency from every library to the problematic framework, but maybe that can be done via a common existing starlark macro, and at least it's possible.

One issue that prevents us from even having a workaround for such linking order bugs is that we currently don't make any guarantees about link ordering.

Yea, I'm surprised that others in normal C++ don't have similar issues because of this? I guess the problem is that when you identify the issue as this you don't have an easy workaround

For cc_binary, the link ordering is emitted in dependency order. Historically this was necessary for gnu ld (if the linker encounters an unresolved symbol in an object file, it only searches for it in archives to the right of the object file in the commandline), though new elf linkers (e.g. lld) no longer requires that.

jerrymarino commented 1 year ago

There’s a hard ordering requirement for the Apple platforms and ecosystem deps in it, a number of the static libs will only work with Xcodes order. It’s the linkers, sdks, and static libs combined who require the bespoke ordering.

This said, we’re now popping sdk_dylibs off the ObjcProvider in the ios rules to correctly link on Bazel 6: I wonder if there a way to make a variant of the change and keep ordering working? Some of the proposals/ideas here seem really good from what I imagined. We can potentially hack around it in starlark if not

googlewalt commented 1 year ago

Correction. For the most part, objc linking does preserve dependency ordering when linking. However, out of necessity of response file limitations, it has to split the inputs into those that are -force_load and those that are not, so ordering is only preserved among those sub-groups.

But as mentioned, dependency order as a workaround is a pain because it requires inserting dependency from everything to the problem framework. Instead, how about putting the framework in linkopts. The crosstool puts both filelist and force_load libraries before linkopts, thus this gets the desired ordering.

Here is a patch to rules_apple that would implement this change in apple_static_framework_import (I only did it for static imports; can extend to dynamic imports if needed).

static_import.patch

googlewalt commented 1 year ago

@jerrymarino Yeah it does seem like the crosstool ordering has changed, probably inadvertently. The old ordering was frameworks, weak_frameworks, then sdk_dylibs. You can try submitting a change to rules_apple to restore the old order.

FWIW internally our full ordering was: filelist, force_loads, frameworks, weak_frameworks, sdk_dylibs. This is different from the OSS ordering which puts the sdk dependencies first.

keith commented 1 year ago

FWIW internally our full ordering was: filelist, force_loads, frameworks, weak_frameworks, sdk_dylibs. This is different from the OSS ordering which puts the sdk dependencies first.

This is the ordering we now mirror, but also this no longer matters as part of this migration since those fields are mostly not set by newer versions of the rules