JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

`Undefined symbols: ___truncsfbf2` when building on macOS #52067

Open eschnett opened 1 year ago

eschnett commented 1 year ago

I am building the current master version of Julia from scratch on macOS (Darwin redshift.pi.local 23.1.0 Darwin Kernel Version 23.1.0: Mon Oct 9 21:27:27 PDT 2023; root:xnu-10002.41.9~6/RELEASE_X86_64 x86_64 i386 Darwin). I see this error:

$ make
    LINK usr/lib/libjulia-internal.1.11.0.dylib
ld: Undefined symbols:
  ___truncsfbf2, referenced from:
      _julia__truncsfbf2 in runtime_intrinsics.o
      _julia__truncdfbf2 in runtime_intrinsics.o
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [Makefile:388: /Users/eschnett/src/julia-master/usr/lib/libjulia-internal.1.11.0.dylib] Error 1
make: *** [Makefile:97: julia-src-release] Error 2

I have both GCC and Clang installed via MacPorts.

It seems that Julia uses this Clang (which clang++: /opt/local/bin/clang++). I checked, and it seems that only GCC provides this function (/opt/local/lib/gcc13/gcc/x86_64-apple-darwin23/13.2.0/libgcc.a:truncsfbf2.o: 0000000000000000 T ___truncsfbf2), but Clang does not provide it.

$ clang++ --version
clang version 17.0.4
Target: x86_64-apple-darwin23.1.0
Thread model: posix
InstalledDir: /opt/local/libexec/llvm-17/bin

gbaraldi commented 1 year ago

They are building their clang wrong :|

oscardssmith commented 1 year ago

By "they", do you mean Apple?

gbaraldi commented 1 year ago

No, I mean MacPorts. I was able to reproduce this locally, so homebrew is wrong here as well.

gbaraldi commented 1 year ago

I see https://github.com/llvm/llvm-project/commit/489bda6a9c0ec9d2644b7bb0c230294d38f7296e, but I'm not sure. @fxcoudert, any idea?

maleadt commented 1 year ago

I'm confused that this generates a libcall to __truncsfbf2, as in runtime_intrinsics we do all the conversion ourselves.

fxcoudert commented 1 year ago

homebrew is wrong here as well

I'm puzzled by this statement. I don't see that we do any special treatment for this symbol, or anything related, in our LLVM build. But we do ship LLVM 17, and I seem to find only LLVM 16 on Yggdrasil, so maybe that's a difference in behaviour between those two versions?

gbaraldi commented 1 year ago

I think this is some weirdness happening with compiler-rt on Apple platforms, where they aren't building the bf16 builtins even though it seems they should be built.

fxcoudert commented 1 year ago

I'm not an expert on the bf16 type, but the behaviour of the compilers seems consistent:

meau /tmp $ cat a.c
__bf16 intend(float x) {
    return (__bf16) x;
}
meau /tmp $ gcc-13 -c a.c -W && nm a.o                         
0000000000000020 s EH_frame1
                 U ___truncsfbf2
0000000000000000 T _intend
0000000000000000 t ltmp0
0000000000000020 s ltmp1
meau /tmp $ clang -c a.c -W && nm a.o                           
a.c:2:21: error: cannot type-cast to __bf16
    return (__bf16) x;
                    ^
1 error generated.
meau /tmp $ /opt/homebrew/opt/llvm/bin/clang -c a.c -W && nm a.o                         
0000000000000000 T _intend
0000000000000000 t ltmp0
0000000000000018 s ltmp1

So if things worked before, either:

maleadt commented 1 year ago

__bf16 intend(float x) { return (__bf16) x; }

That's not how our intrinsics convert to __bf16, though. Instead, we do the conversion as a raw uint16 and cast the pointer: https://github.com/JuliaLang/julia/blob/137783f1663ae0f7c1129c7d8031c874083b49fe/src/runtime_intrinsics.c#L355-L357

Or at least that's the intention.
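
The idea of converting on the raw 16-bit pattern instead of casting to __bf16 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in runtime_intrinsics.c: the names float_to_bf16_bits and bf16_bits_to_float are made up for this example, and round-to-nearest-even with NaN quieting is an assumed rounding policy. Since only integer operations and memcpy are involved, nothing here should require a call to __truncsfbf2.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not Julia's actual function): truncate a float to the
 * bfloat16 bit pattern using integer arithmetic only. */
static uint16_t float_to_bf16_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);            /* reinterpret without aliasing UB */

    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        /* NaN: keep sign and top payload bits, force the quiet bit so the
         * result cannot collapse to infinity. */
        return (uint16_t)((bits >> 16) | 0x0040u);
    }

    /* Round to nearest, ties to even: add 0x7fff plus the lowest kept bit. */
    bits += 0x7fffu + ((bits >> 16) & 1u);
    return (uint16_t)(bits >> 16);
}

/* Hypothetical helper: widen a bfloat16 bit pattern back to float.
 * This direction is exact, since bfloat16 is the upper half of a float. */
static float bf16_bits_to_float(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For instance, float_to_bf16_bits(1.0f) yields 0x3f80, and passing that value back through bf16_bits_to_float recovers 1.0f exactly.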

KristofferC commented 8 months ago

FWIW, I also encountered this after a homebrew upgrade:

❯ clang --version                                             
Homebrew clang version 17.0.6
Target: x86_64-apple-darwin23.3.0
Thread model: posix
InstalledDir: /usr/local/opt/llvm/bin

DilumAluthge commented 8 months ago

So the clang shipped by Homebrew (on macOS) and MacPorts is broken? But what about the clang shipped by Xcode?

Should our build system try to detect if the user has Xcode (or at least the Xcode Command Line Tools) installed, and if so use that clang instead?

KristofferC commented 8 months ago

Changing to the Xcode clang made it work. I don't think the MacPorts one is broken, but AFAIU it doesn't have support for bfloat16 and we fail to detect that.

gbaraldi commented 8 months ago

The issue is more subtle than that. The MacPorts/Homebrew clang supports __bf16 and emits code for it correctly, but the system/compiler-rt that it uses doesn't have the libcalls that it expects to be there. I'll write a C reproducer.

gbaraldi commented 8 months ago

https://github.com/llvm/llvm-project/pull/84192 should fix it, but it will need to trickle down to distributions. So we probably need to bump the guard for Darwin.

weedge commented 8 months ago

Use the clang++ compiler from the Command Line Tools for Xcode:

-DCMAKE_CXX_COMPILER=/Library/Developer/CommandLineTools/usr/bin/clang++

Download it from https://developer.apple.com/download/all/?q=Command%20Line%20Tools%20for%20Xcode

fingolfin commented 2 months ago

This issue is currently blocking an update for libjulia_jll on Yggdrasil. I wonder if anyone has a suggestion for a workaround? E.g. perhaps we can #if 0 some section of code there (the generated libraries are not actually used for runtime, just for linking, so the content of functions is generally irrelevant)

gbaraldi commented 2 months ago

This code needs an #ifdef _OS_DARWIN_ until https://github.com/llvm/llvm-project/pull/84192 is part of some release. Then it needs to check that version.
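
A guard along those lines might look like the sketch below. It is only an illustration under loud assumptions: it uses the standard predefined __APPLE__ macro rather than Julia's _OS_DARWIN_, the names BF16_LIBCALL_FIXED_CLANG_MAJOR and AVOID_NATIVE_BF16 are hypothetical, and the version threshold is a deliberately unrealistic placeholder because the thread does not say which release will contain the fix. Note also that Apple's own clang uses a different version numbering from upstream LLVM, so a real check would need more care than this.

```c
#include <stdbool.h>

/* Hypothetical placeholder: the first clang major version assumed to ship a
 * compiler-rt with the bf16 libcalls on Darwin. The real value is unknown
 * here, so default to a number no current compiler reaches. */
#ifndef BF16_LIBCALL_FIXED_CLANG_MAJOR
#define BF16_LIBCALL_FIXED_CLANG_MAJOR 9999
#endif

/* On Darwin with a clang older than the (assumed) fixed release, avoid code
 * paths that may lower to __truncsfbf2; elsewhere leave them enabled. */
#if defined(__APPLE__) && defined(__clang__) && \
    (__clang_major__ < BF16_LIBCALL_FIXED_CLANG_MAJOR)
#define AVOID_NATIVE_BF16 1
#else
#define AVOID_NATIVE_BF16 0
#endif

/* Runtime-visible view of the compile-time decision, handy for logging. */
static bool avoid_native_bf16(void)
{
    return AVOID_NATIVE_BF16 != 0;
}
```

Code that would otherwise cast to __bf16 could then be wrapped in #if !AVOID_NATIVE_BF16, with an integer-only fallback in the other branch.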

fingolfin commented 2 months ago

@gbaraldi could you provide a hint as to what "this code" is, i.e. which code needs that #ifdef?

gbaraldi commented 2 months ago

https://github.com/JuliaLang/julia/blob/2e3628d3c4993220a8ab28926970030bb7950f7c/src/runtime_intrinsics.c#L355-L376. Though I'm not sure exactly which ABI we need to use here.

I know compilers make a complete mess of this

vessokolev commented 2 weeks ago

I don't think this problem is specific to macOS. The same issue exists on Linux. Try compiling the latest Julia code using LLVM 19. You will get similar error messages:

/opt/software/binutils/2/2.41-gold/bin/ld: ./gc-heap-snapshot.o: in function `_gc_heap_snapshot_record_hidden_edge':
/project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/usr/include/llvm/ADT/StringMap.h:330:(.text+0x2432): undefined reference to `llvm::StringMapImpl::LookupBucketFor(llvm::StringRef)'
/opt/software/binutils/2/2.41-gold/bin/ld: ./gc-heap-snapshot.o: in function `std::pair<llvm::StringMapIterator<unsigned long>, bool> llvm::StringMap<unsigned long, llvm::MallocAllocator>::try_emplace<unsigned long>(llvm::StringRef, unsigned long&&)':
/project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/usr/include/llvm/ADT/StringMap.h:330:(.text._ZN4llvm9StringMapImNS_15MallocAllocatorEE11try_emplaceIJmEEESt4pairINS_17StringMapIteratorImEEbENS_9StringRefEDpOT_[_ZN4llvm9StringMapImNS_15MallocAllocatorEE11try_emplaceIJmEEESt4pairINS_17StringMapIteratorImEEbENS_9StringRefEDpOT_]+0x1f): undefined reference to `llvm::StringMapImpl::LookupBucketFor(llvm::StringRef)'
/opt/software/binutils/2/2.41-gold/bin/ld: ./processor.o: in function `ijl_get_cpu_features':
/project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/src/processor.cpp:974:(.text+0x2dc6): undefined reference to `llvm::sys::getHostCPUFeatures(llvm::StringMap<bool, llvm::MallocAllocator>&)'
/opt/software/binutils/2/2.41-gold/bin/ld: ./coverage.o: in function `std::pair<llvm::StringMapIterator<llvm::SmallVector<unsigned long (*) [32], 0u> >, bool> llvm::StringMap<llvm::SmallVector<unsigned long (*) [32], 0u>, llvm::MallocAllocator>::try_emplace<>(llvm::StringRef)':
/project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/usr/include/llvm/ADT/StringMap.h:330:(.text._ZN4llvm9StringMapINS_11SmallVectorIPA32_mLj0EEENS_15MallocAllocatorEE11try_emplaceIJEEESt4pairINS_17StringMapIteratorIS4_EEbENS_9StringRefEDpOT_[_ZN4llvm9StringMapINS_11SmallVectorIPA32_mLj0EEENS_15MallocAllocatorEE11try_emplaceIJEEESt4pairINS_17StringMapIteratorIS4_EEbENS_9StringRefEDpOT_]+0x1b): undefined reference to `llvm::StringMapImpl::LookupBucketFor(llvm::StringRef)'
/opt/software/binutils/2/2.41-gold/bin/ld: ./runtime_intrinsics.o: in function `julia__truncsfbf2':
/project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/src/runtime_intrinsics.c:379:(.text+0x571): undefined reference to `__truncsfbf2'
/opt/software/binutils/2/2.41-gold/bin/ld: ./runtime_intrinsics.o: in function `julia__truncdfbf2':
/project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/src/runtime_intrinsics.c:385:(.text+0x5fc): undefined reference to `__truncsfbf2'
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [Makefile:391: /project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/usr/lib/libjulia-internal.so.1.11.1] Error 1
make: *** [Makefile:101: julia-src-release] Error 2

TidbitSoftware commented 1 week ago

For what it's worth, I'm getting the same issue compiling HDF5 on an Intel-based Mac with the latest versions of Xcode and CLT installed (with either selected). I did update Homebrew today, so I'm not sure whether a dependency in there could be causing this.

Edit: On another Intel-based machine, I was able to verify that the issue was not with CLT or Homebrew, as neither has been updated for about 6 months. Rolling back to a previous version of the HDF5 source worked, so it's an issue with the implementation (not properly conditioning for Intel-based machines, I believe).