python / cpython

The Python programming language
https://www.python.org
Other
63.84k stars 30.56k forks source link

macOS linker warnings in macOS ventura #97524

Open pablogsal opened 2 years ago

pablogsal commented 2 years ago

Turns out ld shipped with Xcode 14+ emits a warning when using -undefined dynamic_lookup.`

ld: warning: -undefined dynamic_lookup may not work with chained fixups

Some investigation reveals that in fact -undefined dynamic_lookup doesn't work when:

This is because -undefined dynamic_lookup uses lazy-bindings and they won't be bound while dyld fixes-up by traversing chained-fixup info.

However, we build extension modules with as bundles (-bundle) and they are loaded only through dlopen, so it's safe to use -undefined dynamic_lookup in theory. So the warning produced by ld64 is likely to be a false-positive.

ld64 also provides the -bundle_loader <executable> option, which allows resolving symbols defined in the executable symbol table while linking. It behaves almost the same with -undefined dynamic_lookup, but it makes the following changes:

See "New Features" subsection under "Linking" section for chained fixup developer.apple.com/documentation/xcode-release-notes/xcode-13-release-notes for more information.

Also, check the same problem in ruby where most of this information is taken from.

pablogsal commented 2 years ago

CC @ambv @ned-deily

pablogsal commented 2 years ago

Unfortunately this warning also propagates to all extension modules built with the given Python.

ned-deily commented 2 years ago

@ronaldoussoren

ronaldoussoren commented 2 years ago

-bundle_loader doesn't work with shared library builds (including the framework build), as the option does exactly as described in the documentation: It looks for symbols in the loader executable, not in shared library dependencies of that binary. I found this when looking into -bundle_loader to get link time errors when using non-existing symbols instead of only getting those when using an extension.

Linking with libpython is also problematic because on macOS shared libraries are referenced by absolute path which means linking with libpython ties the extension to a specific interpreter (e.g. makes it impossible to build an extension using the Python.org installer and load it using a homebrew install). For the user this will result in seemingly random crashes (due to libpython being loaded twice while only one is initialised).

I haven't spend time on looking into this particular issue yet, with some luck there's another option beyond disabling chained lookups.

CsBigDataHub commented 2 years ago

FYI.. https://issues.guix.gnu.org/issue/57849

Looks like using -Wl,-w will help.

ronaldoussoren commented 2 years ago

FYI.. https://issues.guix.gnu.org/issue/57849

Looks like using -Wl,-w will help.

That doesn't really help, it just suppresses the warning.

keith commented 2 years ago

You can pass -Wl,-no_fixup_chains to disable them entirely for now, but that flag is undocumented and who knows if it will exist in the future.

pablogsal commented 2 years ago

You can pass -Wl,-no_fixup_chains to disable them entirely for now, but that flag is undocumented and who knows if it will exist in the future.

It will likely won't because chained fixups are the new linker information for the dynamic symbol tables. So we will be just delaying the inevitable

ronaldoussoren commented 2 years ago

Another workaround is to target an older deployment target (before 11.0), but that's also not a long term solution.

ronaldoussoren commented 2 years ago

For static builds using -Wl,-bundle_loader,$(BUILDPYTHON) -Wl,-undefined,error appears to work. I've manually patched the Makefile for some extensions and that does result in a build without warnings and I can load those extensions. What I've not checked yet is if those extensions work with a python installed into a different prefix.

This doesn't work with dynamic builds though, even when using BLDLIBRARY= -L. -Wl,-reexport-lpython$(LDVERSION) when linking $(BUILDPYTHON).

Note that just linking extensions with libpython isn't a realistic option because that results in C extensions that target a specific python prefix and hence would loose interoperability for wheels build with our installer and homebrew.

ronaldoussoren commented 2 years ago

What does work for a dynamic build (again with a tweaked $(BLDLBIRARY): -Wl,-flat_namespace -Wl,-bundle_loader,./python.exe -Wl,-undefined,error, that is add -Wl,-flat_namespace. That's not ideal though because the default two-level namespaces are more efficient (and faster).

Something similar should also work for framework builds (but using a different reexport flag), but I haven't tested this yet. Mostly because I'm not happy with the link flags I've used for a shared build.

All of this was tested by manual patching of the Makefile, I haven't looked into tweaking the configure script and makefile template yet.

wjakob commented 2 years ago

Hi — this issue has also been raised in the pybind11 and CMake projects, including a fix similar to what was suggested above by @ronaldoussoren.

Links:

All of this raises a few questions (https://github.com/pybind/pybind11/pull/4301#issuecomment-1298210494) that would be good to figure out together with somebody who is familiar with the intricacies of Darwin's dynamic linker.

pablogsal commented 2 years ago

Hi @wjakob,

We are currently waiting for some clarifications from Apple. Will follow up with anything we find.

mathstuf commented 2 years ago

Anyone with Apple clout could push this PR into their attention sphere: https://github.com/apple-opensource/ld64/pull/1

It basically says "here's a library; make any symbol that is resolved here dynamic_lookup and in the flat namespace". Which is…exactly what is wanted here.

wjakob commented 2 years ago

It basically says "here's a library; make any symbol that is resolved here dynamic_lookup and in the flat namespace". Which is…exactly what is wanted here.

Correct me if I'm wrong, but it seems to me like this PR targets a different use case: it enables dynamic_lookup selectively for chosen symbols to avoid the "sledgehammer" solution of changing the defaults for all undefined symbols.

While the issue discussed here is that the very mechanism that drives dynamic_lookup no longer works in binaries using the new "chained fixups" linker feature (which is automatically enabled for newer macOS deployment targets).

wjakob commented 2 years ago

One more data point in case it is useful: explicitly enumerating all relevant Python symbols using the -U command line arguments (https://github.com/wjakob/nanobind/commit/bb211d1889bf18c762510b22df706aecb3d51f8a) also works and the warnings are gone. But it leaves me wondering:

carlocab commented 2 years ago

I could not find a -U_from_file <filename> argument to batch-inform ld about the symbols to accept as being undefined.

ld response files might be useful here (though it is documented under Rarely used Options):

     @response_file_path
             Inserts contents of file at response_file_path into arguments. This allows for linker command line args to be store in a
             file.  Note: ld is normally invoked through clang, and clang also interprets @file on the command line.  To have clang
             ignore the @file and pass it through to ld, use -Wl,@file.
mathstuf commented 2 years ago

While the issue discussed here is that the very mechanism that drives dynamic_lookup no longer works in binaries using the new "chained fixups" linker feature (which is automatically enabled for newer macOS deployment targets).

Maybe? But if -U works, that's what this should do (IIRC it is, but it has been a long time since I poked it). Of course, the code is also unbuilt because…I have no idea what the instructions are in the first place.

pablogsal commented 2 years ago

Anyone with Apple clout could push this PR into their attention sphere: apple-opensource/ld64#1

It basically says "here's a library; make any symbol that is resolved here dynamic_lookup and in the flat namespace". Which is…exactly what is wanted here.

I am not sure that anyone at apple monitors that repo. If you check it has no pull requests and nobody ever answered an issue

mathstuf commented 2 years ago

Yeah, I'm aware. I had contacts at Apple, but they've since moved elsewhere…

wjakob commented 2 years ago

I did a bit more investigation with the -U linker argument that enables specifying individual symbols as undefined. The gist is that despite not showing the ominous warning about "chained fixups", it seems to produce shared libraries with the exact same binding table entries as the previous standard solution (-undefined dynamic_lookup).

Links:

The third option being discussed (-undefined suppress -flat_namespace) apparently has severe drawbacks that were pointed out by others in the discussion thread.

Somebody from Apple needs to come here and confirm that this warning is a fluke. Otherwise, we have a problem 🙈

pablogsal commented 2 years ago

Somebody from Apple needs to come here and confirm that this warning is a fluke. Otherwise, we have a problem 🙈

Yeah, we are unlikely to proceed with anything in CPython unless we have direct authoritative confirmation from Apple that everything is working as we expect and we are not going to have problems down the line.

ronaldoussoren commented 2 years ago

Hi @wjakob,

We are currently waiting for some clarifications from Apple. Will follow up with anything we find.

My first attempt at contacting Apple failed. Asking developer support resulted in getting forwarded to the forumes to ask about system extensions...

I'll try again after November 14th, before that my agenda is too full to quickly respond.

ronaldoussoren commented 2 years ago

Somebody from Apple needs to come here and confirm that this warning is a fluke. Otherwise, we have a problem 🙈

My current plan is to try to contact someone from Apple (if he still works there), and if that fails use one of the code-level support tickets included in a paid developer subscription.

wjakob commented 2 years ago

Update: I tweeted about this issue to draw some attention to it and was in return asked to file an official bug issue (mirror: https://openradar.appspot.com/radar?id=5536824084660224)

ronaldoussoren commented 2 years ago

One thing that can work for shared library builds (including frameworks) is to start linking extensions to libpython, but ensure that @rpath/libpythonX.Y.dylib is used in the link command (while adding an LC_RPATH to the interpreter binary). That won't work for static builds though for obvious reasons.

A bit hand wavy because I haven't actually tested this or thought trough its implications.

isuruf commented 2 years ago

That should work with LC_RPATH with an @executable_path/../lib or similar if the python binary is not linked statically.

That won't work for static builds though for obvious reasons.

This is an important use-case though. conda python binaries link statically and would segfault if libpythonX.Y.dylib is loaded.

mathstuf commented 2 years ago

Rpath entries are transitive, so you could add an LC_LOAD_DYLIB @rpath/libpythonX.Y.dylib command and just expect the executable to provide it (conda could provide an empty dylib for these purposes) without any LC_RPATH command at all.

Note that I believe that this approach still doesn't fix the "please don't complain about missing symbols" problem that this all started out with.

ronaldoussoren commented 2 years ago

That should work with LC_RPATH with an @executable_path/../lib or similar if the python binary is not linked statically.

That won't work for static builds though for obvious reasons.

This is an important use-case though. conda python binaries link statically and would segfault if libpythonX.Y.dylib is loaded.

I know. So far I've only been able to find partial fixes that work with some of the use cases but not all:

Ideally we'd end up with a solution that works with -undefined error because that gives a better development experience when working on C extensions.

wjakob commented 2 years ago

None of the workarounds mentioned in the last few posts are truly general.

For example, in Blender, everything including Python is part of a single binary Blender.app/Contents/MacOS/Blender. It does not include a separate libpython.dylib.

That's why I think the ball is now in @apple's court. They either need to stabilize the existing mechanism or propose a new one.

ronaldoussoren commented 2 years ago

https://developer.apple.com/forums/thread/702028 has an option to disabling fixup chains in one of the replies: -Xlinker -no_fixup_chains. That's an undocumented option though, which is worrisome as such an option may well be removed in a future version.

I've posted a message about this issue on the Apple Developer Forums, this will be at https://developer.apple.com/forums/thread/719961 when the message passes moderation.

Contacting someone at apple failed outside regular channels failed so far.

carlocab commented 2 years ago

I've had an update from a contact at Apple, and the advice I've been given is to look out for an update in the release notes of an upcoming SDK.

I'd like to assume that that's the next release, but it may not be. I guess the point is (interpretation mine) that the issue is on Apple's radar and they plan to ship a fix eventually.

wjakob commented 2 years ago

Quinn responded to the developer forums post by @ronaldoussoren (https://developer.apple.com/forums/thread/719961) and also mentioned that I'll receive an email notification if/when an updated XCode build addressing FB11767124 becomes available. I will post any further feedback from Apple here when it becomes available.

ronaldoussoren commented 1 year ago

FWIW: The release notes for Xcode 14.2 are very short and don't mention anything related to this issue.

ned-deily commented 1 year ago

FWIW: The release notes for Xcode 14.2 are very short and don't mention anything related to this issue.

Yes, there doesn't seem to be any change when building with 14.2.

wjakob commented 1 year ago

I have also not heard back via issue FB11767124.

wjakob commented 1 year ago

Dear all,

I heard back from feedback report FB11767124 (open radar). The situation can be summarized as follows:

  1. Combining -undefined dynamic_lookup flag with a macOS deployment target of => 12.0 is indeed problematic. The warning is not a fluke.

  2. A new Xcode beta version (14.3, build 14E5197f) was just released, which addresses the problem. The release notes unfortunately don't mention what changed, but you will notice that the warning

    ld: warning: -undefined dynamic_lookup may not work with chained fixups

    disappears when using this XCode build.

    This is because a updated dynamic linker in this build disables fixup chains when the -undefined dynamic_lookup flag is specified. (i.e. it isn't merely silencing the warning.)

  3. On older versions of XCode, -Wl,-no_fixup_chains must be specified to enforce this behavior.

Please let me know if you have any questions, and I can follow up on the feedback report.

pablogsal commented 1 year ago
  • This is because a updated dynamic linker in this build disables fixup chains when the -undefined dynamic_lookup flag is specified. (i.e. it isn't merely silencing the warning.)

What are the consequences of this? Is this something that we can rely to keep working in the future?

pablogsal commented 1 year ago
  • Combining -undefined dynamic_lookup flag with a macOS deployment target of => 12.0 is indeed problematic. The warning is not a fluke.

Also, do we know what exactly are the problems?

wjakob commented 1 year ago

What are the consequences of this? Is this something that we can rely to keep working in the future?

"Chained fixups" encode relocation metadata for the dynamic linker more compactly. Disabling it causes the Python extension module to become larger. People report massive impact in Swift projects (see, e.g., here). For more information on the optimization, take a look at this WWDC video around 20mins into the presentation. In my own extensions, I found the effect to be surprisingly small (~1%). My hypothesis is that Python extensions just don't import/export that many symbols compared to a typical Swift program.

Also, do we know what exactly are the problems?

Agreed, it's surprising that somehow things still seem to work okay. If there is some undefined behavior/corner case, it's definitely not something that happens very often. I could also imagine that this is currently just supported by fallback code subject to removal in a future macOS version. I will bring it up in the feedback issue.

pablogsal commented 1 year ago

I could also imagine that this is currently just supported by fallback subject to removal in a future macOS version. I will bring it up in the feedback issue.

Yeah, this is exactly what worries me about this solution: the fact that the old behaviour we are going to rely on is going to be deprecated or removed. And given the transparency (or lack thereof) of this process I am a bit afraid that this happens without enough time for us to raise it again.

mathstuf commented 1 year ago

My hypothesis is that Python extensions just don't import/export that many symbols compared to a typical Swift program.

My understanding is that Python extensions only need to export a single symbol: the initmodule that Python looks up. Everything else can be hidden.

ronaldoussoren commented 1 year ago

Also, do we know what exactly are the problems?

Agreed, it's surprising that somehow things still seem to work okay. If there is some undefined behavior/corner case, it's definitely not something that happens very often. I could also imagine that this is currently just supported by fallback code subject to removal in a future macOS version. I will bring it up in the feedback issue.

Could you also explicitly ask for advise on how to accomplish our goals in a future proof way?

ronaldoussoren commented 1 year ago

My hypothesis is that Python extensions just don't import/export that many symbols compared to a typical Swift program.

My understanding is that Python extensions only need to export a single symbol: the initmodule that Python looks up. Everything else can be hidden.

It's the imports that are a problem, C extensions need to import the symbols in libpython without explicity linking to it (see previous discussion on why we do this).

wjakob commented 1 year ago

Dear all,

after requests for more information regarding chained fixups, -undefined dynamic_lookup and the precise failure mode of the combination, I received a detailed technical response from Apple. I think that it not only clarifies the issue but also seems like good news overall.

Let me first post the message I received in verbatim, with my interpretation below:

In Xcode 14.3 -undefined dynamic_lookup automatically disables chained fixups, thus the warning is gone. The -U options remains unchanged; it can be combined with chained fixups. The behavior of chained fixups and dynamic lookup is deterministic, but more restrictive than previous bind opcodes and so is risky to use with dynamic lookup for any symbol. We do want to allow developers to leverage chained fixups while still using dynamic lookup for selected symbols though.

If you prefer to use -U, but can’t use chained fixups then you can use the -no_fixup_chains option, although it is a subject to change in future releases. Alternatively, you could use -undefined dynamic_lookup and instead verify that only allowed symbols use dynamic lookup, e.g., using the dyld_info tool:

$ dyld_info -fixups libfoo_user.dylib
libfoo_user.dylib [arm64]:
    -fixups:
        segment      section          address                 type   target
        __DATA_CONST __got            0x00004000              bind  flat-namespace/_foo
_foo symbol here uses dynamic lookup in the flat namespace.

A practical difference between chained fixups and bind opcodes is that symbols can’t be bound lazily, so you need to make sure all symbols a library with dynamic lookup depends on are available at the time the library is loaded. Here’s an example:

$ cat Makefile
all:
  cat foo.c
  cat foo_user.c
  cat main.c
  clang foo.c -dynamiclib -o libfoo.dylib ${LDFLAGS}
  clang foo_user.c -dynamiclib -o libfoo_user.dylib -Wl,-undefined,dynamic_lookup ${LDFLAGS}
  clang main.c ${LDFLAGS}
$ make
cat foo.c
int foo() { return 0; }
cat foo_user.c
int foo();

int foo_use()
{
  return foo();
}
cat main.c
#include <dlfcn.h>
#include <string.h>
#include <stdio.h>

typedef int(*FooUseFn)();

int main(int argc, char** argv)
{
  if ( argc < 2 ) {
    printf("use: %s [before|after]\n", argv[0]);
    return 1;
  }
  bool before = strcmp(argv[1], "before") == 0;
  if ( before ) {
    dlopen("libfoo.dylib", RTLD_NOW);
  }
  void* handle = dlopen("libfoo_user.dylib", RTLD_LAZY);
  if ( !handle ) {
    printf("couldn't open libfoo_user.dylib: %s\n", dlerror());
    return 1;
  }
  if ( !before ) {
    dlopen("libfoo.dylib", RTLD_NOW);
  }
  FooUseFn fooUse = (FooUseFn)dlsym(handle, "foo_use");
  printf("foo_use(): %d\n", fooUse()) ;
}
clang foo.c -dynamiclib -o libfoo.dylib
clang foo_user.c -dynamiclib -o libfoo_user.dylib -Wl,-undefined,dynamic_lookup
clang main.c
$ ./a.out before
foo_use(): 0
$ ./a.out after
couldn't open libfoo_user.dylib: dlopen(libfoo_user.dylib, 0x0001): symbol not found in flat namespace '_foo'

But, if you’d disable chained fixups then lazy symbols can be bound later when used, so loading libfoo.dylib after libfoo_user.dylib will work:

$ make LDFLAGS=-Wl,-no_fixup_chains
$ ./a.out after
foo_use(): 0

My interpretation of this message with some extra context is as follows. If you are wondering what chained fixups are, I recommend watching the second half of this video first:

  1. The chained fixups optimization is active by default when targeting macOS 12.0+. While intended as a more efficient way of encoding+applying relocations for imported symbols, there is also a secondary effect: shared libraries use an eager symbol resolution policy. This means that when Python executes dlopen() to import an extension module that happens to use chained fixups, it will immediately fail when any of the unresolved symbols in that extension remain unresolved when cross-referenced against the symbol table of the process, showing the message dlopen(..): symbol not found in flat namespace [..].

  2. Before the chained fixup optimization, this was all handled more lazily. A missing symbol was essentially ignored, and things were fine unless you actually tried calling it (causing the process to abort() with the message dyld[$PID]: missing symbol called).

  3. Apple realized that this behavioral change might be problematic for some users and updated XCode. The new version disables chained fixups and reverts to the previous relocation encoding when the -undefined dynamic_resolve parameter is detected.

  4. Based on the provided information, I conclude that the eager symbol resolution policy is actually harmless for Python. Any normal extension will reference a number of undefined CPython API symbols, but these are all provided by the host process. The problem about missing symbols does not arise.

  5. Previous messages in this ticket mention express concern about potential undefined behavior or expected breakage in a future macOS versions when older extensions compiled with XCode < 14.3 are used. That doesn't seem to be the case based on the provided information. Also, we can compile extensions with or without chained fixups, it's all fine.

  6. The warning on pre-14.3 XCode versions is annoying, and people have experimented with an extra -Wl,-no_fixup_chains parameter to silence it. However, Apple doesn't make gaurantees about the long-term stability of the this parameter (and it is in any case redundant in XCode >= 14.3).

  7. It may actually be desirable to retain chained fixups when building an extension module (small size reduction of the output). To achieve this, list the individual symbols (via -U) in a more targeted way (potentially using a linker response file to avoid having to pass hundreds of command line parameters). This way of linking doesn't trigger warnings and behaves the consistently across XCode versions.

pablogsal commented 1 year ago

Thanks a lot @wjakob for this excellent summary and for helping us gathering the required information. You rock 🤘

wjakob commented 1 year ago

One addendum for posterity: if you want to disable the linker warnings while retaining chained fixups optimization in a future-proof way, try -Wl,@darwin-ld-cpython.sym where darwin-ld-cpython.sym can be downloaded here (automatically assembled from all exports specified in python3dll.c for Python 3.8-3.12 [git]). And here is the analogous list for PyPy.

(This is probably overkill for most projects, just mentioning it in case somebody wants to study the impact compared to -Wl,-no_fixup_chains)

ambv commented 1 year ago

Thanks @wjakob, this is a clear actionable item for us. I think I prefer having an explicit list like this and retaining chained fixups.

However, I imagine we will likely want to create this symbol list ourselves "just-in-time" in our build tooling to future-proof it without creating an external dependency. 3.12 is still in flux so the list might change, and 3.13 is around the corner.

mathstuf commented 1 year ago

A more future-proof way of handling missing symbols would be poking the right people at Apple to support the functionality in this PR which allows one to say "here is a library for which any symbol found should be treated as if given via -U and do not add the library to the load commands".

pablogsal commented 1 year ago

However, I imagine we will likely want to create this symbol list ourselves "just-in-time" in our build tooling to future-proof it without creating an external dependency. 3.12 is still in flux so the list might change, and 3.13 is around the corner.

We can create it and package it (and use when building the interpreter) but extensions need to pick it up using whatever build system they are using, no?