Closed: simonjwright closed this issue 4 months ago
hmm..
I had a successful bootstrap using GCC-11.4 (including Ada, D, m2 and rust); using XC CLT-14.3 on aarch64-darwin21. I see you have a whole bunch of configure options, some of which are unnecessary and some of which I do not use/test.
non-bootstrap builds from a different compiler version are not really supported - does a bootstrap work correctly?
I have XC CLT 15.1b for which the assembler does not run on darwin21, will have to update if it now works.
- I had a successful bootstrap using GCC-11.4 (including Ada, D, m2 and rust); using XC CLT-14.3 on aarch64-darwin21. I see you have a whole bunch of configure options, some of which are unnecessary and some of which I do not use/test.
Now that I have a working gcc-13.1.0-aarch64 I've stopped building the cross-compiler first; is this wrong?
I guess the --with-as etc. configure options aren't needed, would be interesting to have a recommended set! There's some cargo-culting going on here, admittedly.
- non-bootstrap builds from a different compiler version are not really supported - does a bootstrap work correctly?
No, fails exactly the same in stage 1.
- I have XC CLT 15.1b for which the assembler does not run on darwin21, will have to update if it now works.
I've probably given the wrong impression here; both my aarch64 machines are running Sonoma (14.1.1), I set MACOSX_DEPLOYMENT_TARGET=12 to support users who haven't yet upgraded, for whatever reason.
I think we have an issue with XC CLT 15.1b3 (and possibly earlier), because the libgomp.dylib issue goes away if I build with CLT 14.2.
This is with commit 31499d1 of 2023-11-22.
Build compiler: GCC 13.1.0, aarch64-apple-darwin21
ld: address=0x0 points to section(3) with no content in '/Volumes/Miscellaneous3/aarch64/14.0.0/gcc/aarch64-apple-darwin21/libgomp/.libs/target-indirect.o'
It turns out that this is yet another ld-classic problem. Successfully built using a shim:
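#!/bin/sh
# Prefer ld-classic when xcrun can find it; otherwise fall back to plain ld.
classic=$(xcrun --find ld-classic 2>/dev/null) || true
if [ -n "$classic" ]; then
  # Quote the path so that spaces in the install location do not split it.
  exec "$classic" "$@"
else
  exec ld "$@"
fi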
More to come on this over at https://github.com/iains/gcc-13-branch/issues/10
Have you reported the issue to Apple?
FB13416813
Mind, after having submitted the report I dug a bit further. Turns out that target-indirect.c is
void *
GOMP_target_map_indirect_ptr (void *ptr)
{
  /* Calls to this function should not be generated for host code. */
  __builtin_unreachable ();
}
Compiling this with gcc-13.1.0-x86_64 gives a sensible-looking EH_frame, but compiling with gcc-13.1.0-aarch64 gives this, which looks garbled to me:
$ objdump -h target-indirect.o
target-indirect.o: file format mach-o arm64
Sections:
Idx Name Size VMA Type
0 __text 00000000 0000000000000000 TEXT
1 __text_cold 00000000 0000000000000000 TEXT
2 __eh_frame 00000038 0000000000000000 DATA
$ objdump -D target-indirect.o
target-indirect.o: file format mach-o arm64
Disassembly of section __TEXT,__eh_frame:
0000000000000000 <ltmp2>:
0: 00000014 udf #20
4: 00000000 udf #0
8: 00527a01 <unknown>
c: 011e7801 <unknown>
10: 001f0c10 <unknown>
14: 00000000 udf #0
18: 0000001c udf #28
1c: 0000001c udf #28
20: ffffffe0 <unknown>
24: ffffffff <unknown>
...
target-indirect.c appears to have been added in a49c7d3; it’s in config/accel/ and config/linux/; we’ve picked up the linux version, the accel version is much more substantial.
The feedback (FB13416813) has been updated:
The error is related to the _GOMP_target_map_indirect_ptr symbol, located in the __TEXT,__text_cold section. This entire section is empty, so the symbol has no content, but there’s still a dwarf unwind entry referencing it. You might be able to workaround this error by either removing this symbol, or making sure it has some content.
So that looks like an error on our [GCC's] part (or possibly an assumption that something that works with BFD-linkers is OK everywhere), that has not been detected by earlier linkers.
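A quick way to spot the condition Apple describes (a section with no content) is to scan the section table for zero-size entries. The following is only a sketch: the helper name empty_sections is made up for illustration, and it assumes the "Idx Name Size VMA Type" column layout of the objdump -h output shown earlier in this thread.

```sh
#!/bin/sh
# Sketch: list sections whose size is zero in `objdump -h`-style output on stdin.
# Assumption: columns are "Idx Name Size VMA Type", as in the listing above.
# empty_sections is a hypothetical helper name, not part of any toolchain.
empty_sections() {
  # Keep only rows whose first field is a numeric index and whose
  # size field consists solely of zeros; print the section name.
  awk '$1 ~ /^[0-9]+$/ && $3 ~ /^0+$/ { print $2 }'
}

# Hypothetical usage on a real object file:
#   objdump -h target-indirect.o | empty_sections
```

Fed the section table quoted above, this should name __text and __text_cold but not __eh_frame, which matches the linker's complaint about an unwind entry pointing into an empty section.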
Since this is x86_64, does the problem happen with unpatched 13.2? If so, then we should have an upstream (GCC bugzilla) report for it;
I'm somewhat tied up with other stuff right now, so not really able to suggest a short-term hack.
Using ld-classic is probably a good idea anyway for now. I'll try to debug the issue over the week-end and reduce it to a simple case.
I wonder if we have a case where there's an empty TU for some targets (but then I don't see why we'd end up with a symbol there). [I've not tried to debug, and will most likely not have a chance this week]
ld -ld_classic accepts the same object file without even a warning, so clearly it is a regression from Apple.
Reduced testcase, with ld being the Xcode 15.1 Release Candidate linker:
$ cat a.c
void * GOMP_target_map_indirect_ptr (void *ptr) {
__builtin_unreachable ();
}
$ gcc-13 -c a.c -g -O2
$ ld -dynamic -o libtest.dylib a.o -dylib
ld: address=0x0 points to section(3) with no content in '/private/tmp/a.o'
clang output makes ld happy:
$ clang -c a.c -g -O2
$ ld -dynamic -o libtest.dylib a.o -dylib
[no error]
Trying to narrow the difference in what is output:
$ clang -c a.c -g -O2
$ nm a.o
0000000000000000 T _GOMP_target_map_indirect_ptr
0000000000000000 t ltmp0
0000000000000220 s ltmp1
$ gcc-13 -c a.c -g -O2
$ nm a.o
0000000000000028 s EH_frame1
0000000000000000 S _GOMP_target_map_indirect_ptr
0000000000000000 t ltmp0
0000000000000000 s ltmp1
0000000000000000 s ltmp2
0000000000000028 s ltmp3
0000000000000164 s ltmp4
0000000000000197 s ltmp5
00000000000001a9 s ltmp6
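The difference that matters in the two nm listings above is the type letter of _GOMP_target_map_indirect_ptr: T from clang, S from gcc-13. A small filter makes that comparison easy to repeat on other objects. This is a sketch only; symbol_type is a hypothetical helper name, and it assumes the "address type name" layout shown above.

```sh
#!/bin/sh
# Sketch: print the nm type letter for one symbol, reading nm output on stdin.
# Assumption: each line ends with "<type> <name>", as in the listings above.
# symbol_type is a hypothetical helper name.
symbol_type() {
  # The name is the last field and the type letter is the field before it.
  awk -v sym="$1" '$NF == sym { print $(NF-1) }'
}

# Hypothetical usage:
#   nm a.o | symbol_type _GOMP_target_map_indirect_ptr
```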
- I confirm the bug
- ld -ld_classic accepts the same object file without even a warning, so clearly it is a regression from Apple
Trying to narrow the difference in what is output:
$ gcc-13 -c a.c -g -O2
$ nm a.o
0000000000000028 s EH_frame1
0000000000000000 S _GOMP_target_map_indirect_ptr
0000000000000000 t ltmp0
0000000000000000 s ltmp1
0000000000000000 s ltmp2
0000000000000028 s ltmp3
0000000000000164 s ltmp4
0000000000000197 s ltmp5
00000000000001a9 s ltmp6
please could you show the output of objdump -d -r a.o
On my very quick test, I do not see the section being empty (but, instead, containing a single trap instruction)
edit: FAOD, this is with x86_64, right?
specifically, if I compile with -save-temps and look at the assembler:
.file 1 "f.c"
.section __TEXT,__text_cold,regular,pure_instructions
.globl _GOMP_target_map_indirect_ptr
_GOMP_target_map_indirect_ptr:
LFB0:
.loc 1 1 49
.loc 1 2 3
ud2
LFE0:
If, hypothetically, that is also what you see - but the output of objdump -d -r does not look the same, then the issue is with the assembler (i.e. clang -cc1as) rather than ld.
Since I am reporting here, I am testing the aarch64-darwin, with the current branch. I don't have a system with Xcode 15.1 on Intel :(
meau /tmp $ clang -O2 -g a.c -c
meau /tmp $ objdump -d -r a.o
a.o: file format mach-o arm64
Disassembly of section __TEXT,__text:
0000000000000000 <ltmp0>:
0: d4200020 brk #0x1
meau /tmp $ gcc-13 -O2 -g a.c -c
meau /tmp $ objdump -d -r a.o
a.o: file format mach-o arm64
meau /tmp $
The assembly generated by GCC is:
.arch armv8-a
.text
Ltext0:
.file 1 "a.c"
.section __TEXT,__text_cold,regular,pure_instructions
.align 2
.globl _GOMP_target_map_indirect_ptr
_GOMP_target_map_indirect_ptr:
LFB0:
.loc 1 1 49
.loc 1 2 3
LFE0:
and by clang:
.section __TEXT,__text,regular,pure_instructions
.build_version macos, 14, 0 sdk_version 14, 2
.globl _GOMP_target_map_indirect_ptr ; -- Begin function GOMP_target_map_indirect_ptr
.p2align 2
_GOMP_target_map_indirect_ptr: ; @GOMP_target_map_indirect_ptr
Lfunc_begin0:
.file 1 "/tmp" "a.c"
.loc 1 1 0 ; a.c:1:0
.cfi_startproc
; %bb.0:
.loc 1 2 3 prologue_end ; a.c:2:3
brk #0x1
Ltmp0:
Lfunc_end0:
.cfi_endproc
Since I am reporting here, I am testing the aarch64-darwin, with the current branch. I don't have a system with Xcode 15.1 on Intel :(
The assembly generated by GCC is:
.arch armv8-a
.text
Ltext0:
.file 1 "a.c"
.section __TEXT,__text_cold,regular,pure_instructions
.align 2
.globl _GOMP_target_map_indirect_ptr
_GOMP_target_map_indirect_ptr:
LFB0:
.loc 1 1 49
.loc 1 2 3
LFE0:
That is different from the x86_64 case (there is indeed no content here) ... whereas....
and by clang:
.section __TEXT,__text,regular,pure_instructions
.build_version macos, 14, 0 sdk_version 14, 2
.globl _GOMP_target_map_indirect_ptr ; -- Begin function GOMP_target_map_indirect_ptr
.p2align 2
_GOMP_target_map_indirect_ptr: ; @GOMP_target_map_indirect_ptr
Lfunc_begin0:
.file 1 "/tmp" "a.c"
.loc 1 1 0 ; a.c:1:0
.cfi_startproc
; %bb.0:
.loc 1 2 3 prologue_end ; a.c:2:3
brk #0x1
Ltmp0:
Lfunc_end0:
.cfi_endproc
.... clang is putting a trap instruction in.
So, on aarch64, we do have a discrepancy - I need to figure out where the aarch64 port decides to/not to insert the trap.
see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109267 and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 (I will have to see if a similar fix can apply to aarch64).
note that the workaround from PR57438 does also appear to work for aarch64.
-D__builtin_unreachable=__builtin_trap
(maybe add that to the recipe for the affected files, rather than globally - although TBH a trap is more user-friendly than UB .. )
unfortunately, this remedy does not seem to work for modula-2 (which also has instances of this issue) - so a proper solution is called for - and it seems to be a bit more tricky for aarch64 than the solutions I did for x86 and powerpc.
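For the C sources where the PR57438 workaround does help, the suggestion above to apply it per file rather than globally could be sketched roughly as follows. This is not how the GCC build system is actually wired: extra_cflags_for and AFFECTED_FILES are hypothetical names, and target-indirect.c is the only affected file named in this thread.

```sh
#!/bin/sh
# Sketch: emit the workaround define only for files known to be affected.
# AFFECTED_FILES is a hypothetical list; only target-indirect.c is named here.
AFFECTED_FILES="target-indirect.c"

# extra_cflags_for is a hypothetical helper: prints extra flags for one source file.
extra_cflags_for() {
  base=$(basename "$1")
  for f in $AFFECTED_FILES; do
    if [ "$f" = "$base" ]; then
      # Turn the unreachable marker into a trap so the function has content.
      printf '%s\n' '-D__builtin_unreachable=__builtin_trap'
      return 0
    fi
  done
  # Unaffected files get no extra flags.
  return 0
}

# Hypothetical usage in a build recipe:
#   $CC $CFLAGS $(extra_cflags_for "$src") -c "$src"
```

Keeping the define scoped to a short list limits the behaviour change to the functions that would otherwise be emitted empty.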
My testing with a cross to aarch64-darwin from x86_64-darwin on macOS 14 with Xcode 15.3b2, suggests that this is working - actually, there is some fallout from the change (but right now, I think that the change actually has identified a second problem - which is not limited to the arm64 port).
Please test the latest master-wip-apple-si branch; if all goes OK then I'll backport for 13.3, 12.4 and 11.5.
The feedback (FB13416813) has been updated:
The error is related to the _GOMP_target_map_indirect_ptr symbol, located in the __TEXT,__text_cold section. This entire section is empty, so the symbol has no content, but there’s still a dwarf unwind entry referencing it. You might be able to workaround this error by either removing this symbol, or making sure it has some content.
What is the situation with the FB now?
We have just had a long discussion on IRC and on the gcc-patches list about solutions to the underlying problem (empty functions because of any reason - e.g. macro-conditional content, optimised away etc).
The assertion of the global maintainers is that, generally, empty content is to be expected in real-life code.
e.g. asm (""); __builtin_unreachable (); will result in that too (or asm which actually has some large template, but either expands just into a different section, or has macros that yield nothing)
but generally, DW_CFA_advance_loc* can always skip over something that is empty and not known at compile time (like inline asms that don't contribute anything to the current section), so generally having something to apply for an empty range is well defined DWARF construct
So, I can fix the current case (i.e. a function optimised to __builtin_unreachable) to produce a trap there - but it seems that there's potentially a wider issue.
Since FBs are not public - please could you update ?
@fxcoudert is this the only FB for the topic? (given that we read it as a regression from ld64)? I wonder if there's some way to either expedite - or if it won't be fixed then to find out soon so that we can try to react in the compiler.
There’s been no update to the FB since my last report.
My report was
The file target-indirect.o is generated as part of libgomp.dylib during GCC 14.0.0 build for arm64 (sources at https://github.com/iains/gcc-darwin-arm64).
While doing the link to produce the dylib, ld reports ld: address=0x0 points to section(3) with no content in '/Users/simon/Developer/bugs/gcc/ld_classic/.libs/target-indirect.o'
I've created an attachment (target-indirect.zip) containing the object file concerned.
With ld:
$ ld -dylib -o libgomp.dylib objs/*.o -no_compact_unwind -syslibroot $(xcrun --show-sdk-path) -lSystem
ld: address=0x0 points to section(3) with no content in '/Users/simon/Developer/bugs/gcc/ld_classic/objs/target-indirect.o'
Using ld-classic,
$ $(xcrun --find ld-classic) -dylib -o libgomp.dylib objs/*.o -no_compact_unwind -syslibroot $(xcrun --show-sdk-path) -lSystem
runs successfully.
The response was
The error is related to the _GOMP_target_map_indirect_ptr symbol, located in the __TEXT,__text_cold section. This entire section is empty, so the symbol has no content, but there’s still a dwarf unwind entry referencing it. You might be able to workaround this error by either removing this symbol, or making sure it has some content.
and the summary at the top is
Recent Similar Reports: None
Resolution: Investigation complete - Works as currently designed
Which I translate as "we are not going to fix this, it's your problem to generate code that does not cause this".
@fxcoudert do you know of any other FBs in the system?
Not aware of any, will ping that one.
Please test the latest master-wip-apple-si branch; if all goes OK then I'll backport for 13.3, 12.4 and 11.5.
I just did a bootstrap (C, C++, Ada) on M1 macOS 14.4.1, CLT 15.3, base compiler 13.2.0-aarch64; built without issues.
I believe that this is fixed on the development branch, and will be backported to 13.3, 12.4 and 11.5.
This is with commit 31499d1 of 2023-11-22.
Build compiler: GCC 13.1.0, aarch64-apple-darwin21
Configure script, with
$BUILD=aarch64-apple-darwin21
$SDKROOT=/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
where CommandLineTools is a symlink to CommandLineTools-15.1b3
$BOOTSTRAP=disable