bazelbuild / rules_apple

Bazel rules to build apps for Apple platforms.
Apache License 2.0
504 stars 260 forks source link

Non-deterministic linking when using default (ld64) linker #1906

Open sewerynplazuk opened 1 year ago

sewerynplazuk commented 1 year ago

This is not a Bazel, nor rules apple issue but worth to have it documented (more context here).

With some probability, the ordering of the LC_LOAD_DYLIB commands will vary between the test runs, thus the determinism of test bundles is going to be disrupted.

Repro steps can be found here, although it is possible to reproduce this case without cleaning by relinking by putting something random in the linker arguments.

It should lead to

Load command 27
          cmd LC_LOAD_WEAK_DYLIB
      cmdsize 64
         name /usr/lib/swift/libswiftFoundation.dylib (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 1.0.0
compatibility version 1.0.0
Load command 28
          cmd LC_LOAD_WEAK_DYLIB
      cmdsize 72
         name /usr/lib/swift/libswiftCoreGraphics.dylib (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 120.100.0
compatibility version 1.0.0

these 2 commands switching places between the runs.

It's possible to use custom linker (i.e. lld) as a workaround. Please note that this can be risky for production builds.

I've filed the radar for the issue.

keith commented 1 year ago

I debugged this great repro case some more and reduced it to just this:

ld
-dynamic
-arch arm64
-bundle
-platform_version
ios-simulator
14.2
16.2
-syslibroot
/Applications/Xcode-14.2.0-RC1.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator16.2.sdk
-o
/tmp/abc123
-L/Applications/Xcode-14.2.0-RC1.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/usr/lib
-L/usr/lib/swift
-L/Applications/Xcode-14.2.0-RC1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/swift/iphonesimulator
/tmp/bazel-non-deterministic-test-bundles/FooTests.swift.o
-lSystem
-no_adhoc_codesign
-F/Applications/Xcode-14.2.0-RC1.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/Library/Frameworks

In the latest open source dump from Xcode 13.2.1 the issue appears to be this loop https://github.com/keith/ld64/blob/b0f88eba713daa3ebe75ed7f66f07b4b0cfc3d0e/src/ld/OutputFile.cpp#L4901-L4903 because syntheticDylibs is actually a map https://github.com/keith/ld64/blob/b0f88eba713daa3ebe75ed7f66f07b4b0cfc3d0e/src/ld/OutputFile.cpp#L4857

If you apply this patch (there's probably a more efficient solution out there) to ld:

diff --git a/src/ld/OutputFile.cpp b/src/ld/OutputFile.cpp
index c92313a..9d809b0 100644
--- a/src/ld/OutputFile.cpp
+++ b/src/ld/OutputFile.cpp
@@ -4854,7 +4854,8 @@ uint32_t OutputFile::dylibToOrdinal(const ld::dylib::File* dylib) const

 void OutputFile::buildDylibOrdinalMapping(ld::Internal& state)
 {
-   std::map<const char *,  generic::dylib::File*> syntheticDylibs;
+   std::map<const char *,  generic::dylib::File*> syntheticDylibs2;
+   std::vector<std::pair<const char *,  generic::dylib::File*>> syntheticDylibs;
    std::set<const generic::dylib::File*> potentiallyDeadDylibs;

    // Scan through all import atoms looking for redirects
@@ -4865,12 +4866,13 @@ void OutputFile::buildDylibOrdinalMapping(ld::Internal& state)
            if (!atom || !atom->installname()) { continue; }

            generic::dylib::File* newFile = nullptr;
-           auto i = syntheticDylibs.find(atom->installname());
-           if (i == syntheticDylibs.end()) {
+           auto i = syntheticDylibs2.find(atom->installname());
+           if (i == syntheticDylibs2.end()) {
                auto file = dynamic_cast<const generic::dylib::File*>(atom->file());
                newFile = file->createSyntheticDylib(atom->installname(), atom->compat_version());
-               syntheticDylibs[atom->installname()] = newFile;
+               syntheticDylibs2[atom->installname()] = newFile;
                potentiallyDeadDylibs.insert(file);
+               syntheticDylibs.emplace_back(atom->installname(), newFile);
            } else {
                newFile = i->second;
            }
@@ -4878,6 +4880,10 @@ void OutputFile::buildDylibOrdinalMapping(ld::Internal& state)
        }
    }

+   std::sort(syntheticDylibs.begin(), syntheticDylibs.end(), [](std::pair<const char *, generic::dylib::File*> &left, std::pair<const char *, generic::dylib::File*> &right) {
+       return strcmp(left.first, right.first);
+   });
+
    // CHeck back through the symbol sources to see if anything still points at them, and remove them from potentially dead dylibs
    for (const ld::Internal::FinalSection* sect : state.sections) {
        if ( sect->type() != ld::Section::typeImportProxies) { continue; }

The issue is solved.

Also I can no longer repro this with Xcode 14.3 beta 3, so I think they must have found and fixed this case! 🎉