Closed vinriviere closed 10 months ago
Official GCC documentation for ASM_OUTPUT_CASE_LABEL
:
https://gcc.gnu.org/onlinedocs/gccint/Dispatch-Tables.html#index-ASM_005fOUTPUT_005fCASE_005fLABEL
First, a thing that I didn't notice initially: https://github.com/freemint/m68k-atari-mint-gcc/blob/cb42ec3d119f6b2f02f5abb282397977dbae07f0/gcc/config/m68k/m68k.md#L5809-L5856
Here is where ASM_OUTPUT_CASE_LABEL
is used:
https://github.com/freemint/m68k-atari-mint-gcc/blob/cb42ec3d119f6b2f02f5abb282397977dbae07f0/gcc/final.cc#L2444-L2450
So if ASM_OUTPUT_CASE_LABEL
is used, then that macro is used to output the special case label. Otherwise, it is output as normal label.
There is no default in m68k.h. So by default, a normal label is used.
But there are overrides. elfos.h introduces ASM_OUTPUT_BEFORE_CASE_LABEL
.
Important: ASM_OUTPUT_ALIGN
parameter is a power of 2! So 2 means 4-byte.
https://github.com/freemint/m68k-atari-mint-gcc/blob/cb42ec3d119f6b2f02f5abb282397977dbae07f0/gcc/config/elfos.h#L133-L156
So it seems that m68k-linux, m68k-elf and m68k-netbsd use the same settings. No one uses the SVR4 ABI anymore?
Finally, the testcase from the initial post with various compilers:
$ m68k-linux-gcc -Os -S swi.c -o -
#NO_APP
.file "swi.c"
.text
.align 2
.globl f
.type f, @function
f:
move.l 4(%sp),%d0
moveq #7,%d1
cmp.l %d0,%d1
jcs .L1
add.l %d0,%d0
move.w .L4(%pc,%d0.l),%d0
jmp %pc@(2,%d0:w)
.balignw 2,0x284c
.L4:
.word .L11-.L4
.word .L10-.L4
.word .L9-.L4
.word .L8-.L4
.word .L7-.L4
.word .L6-.L4
.word .L5-.L4
.word .L3-.L4
.L11:
move.l #6082,g
.L1:
rts
.L10:
move.l #9332,g
jra .L1
.L9:
move.l #5642,g
jra .L1
.L8:
move.l #1423,g
jra .L1
.L7:
move.l #2152,g
jra .L1
.L6:
move.l #6779,g
jra .L1
.L5:
move.l #7074,g
jra .L1
.L3:
move.l #8280,g
jra .L1
.size f, .-f
.globl g
.section .bss
.align 2
.type g, @object
.size g, 4
g:
.zero 4
.ident "GCC: (GCC MiNT ELF 20230819) 13.2.0"
.section .note.GNU-stack,"",@progbits
The jump table is in the .text
section, just after the jump instruction. There is no .swbeg
instruction. The jump table is 2-byte aligned with a move.l a4,a4
filler, which is never inserted as the opcodes are always aligned. So that's optimal (except filling the instruction cache).
m68k-elf is almost identical, but does use the .swbeg
directive, for the sake of the SVR4 ABI. So that's a waste of 4 bytes.
$ m68k-elf-gcc -Os -S swi.c -o -
#NO_APP
.file "swi.c"
.text
.align 2
.globl f
.type f, @function
f:
move.l 4(%sp),%d0
moveq #7,%d1
cmp.l %d0,%d1
jcs .L1
add.l %d0,%d0
move.w .L4(%pc,%d0.l),%d0
jmp %pc@(2,%d0:w)
.balignw 2,0x284c
.swbeg &8
.L4:
.word .L11-.L4
.word .L10-.L4
.word .L9-.L4
.word .L8-.L4
.word .L7-.L4
.word .L6-.L4
.word .L5-.L4
.word .L3-.L4
.L11:
move.l #6082,g
.L1:
rts
.L10:
move.l #9332,g
jra .L1
.L9:
move.l #5642,g
jra .L1
.L8:
move.l #1423,g
jra .L1
.L7:
move.l #2152,g
jra .L1
.L6:
move.l #6779,g
jra .L1
.L5:
move.l #7074,g
jra .L1
.L3:
move.l #8280,g
jra .L1
.size f, .-f
.globl g
.section .bss
.align 2
.type g, @object
.size g, 4
g:
.zero 4
.ident "GCC: (GCC MiNT ELF 20230819) 13.2.0"
Current m68k-atari-mintelf fixed by @th-otto. This does the job. No .swbeg
, just like m68k-linux. As we agreed that .swbeg
is completely useless for us.
$ m68k-atari-mintelf-gcc -Os -S swi.c -o -
#NO_APP
.file "swi.c"
.text
.align 2
.globl f
.type f, @function
f:
move.l 4(%sp),%d0
moveq #7,%d1
cmp.l %d0,%d1
jcs .L1
add.l %d0,%d0
move.w .L4(%pc,%d0.l),%d0
jmp %pc@(2,%d0:w)
.balignw 2,0x284c
.L4:
.word .L11-.L4
.word .L10-.L4
.word .L9-.L4
.word .L8-.L4
.word .L7-.L4
.word .L6-.L4
.word .L5-.L4
.word .L3-.L4
.L11:
move.l #6082,g
.L1:
rts
.L10:
move.l #9332,g
jra .L1
.L9:
move.l #5642,g
jra .L1
.L8:
move.l #1423,g
jra .L1
.L7:
move.l #2152,g
jra .L1
.L6:
move.l #6779,g
jra .L1
.L5:
move.l #7074,g
jra .L1
.L3:
move.l #8280,g
jra .L1
.size f, .-f
.globl g
.section .bss
.align 2
.type g, @object
.size g, 4
g:
.zero 4
.ident "GCC: (GCC MiNT ELF 20231027) 13.2.0"
@th-otto: I have rewoked the ASM_RETURN_CASE_JUMP
patch to make mint.h look more like linux.h: https://github.com/freemint/m68k-atari-mint-gcc/commit/f6f7f11429b09de9510fc73fbc8159530f2ac6fc. This doesn't change the ABI (still no .swbeg) and should not cause trouble to your targets. It can easily be verified using the provided testcase.
By making everything explicit, this should help to ultimately get rid of m68kelf.h.
Why did you remove #undef ASM_OUTPUT_BEFORE_CASE_LABEL
? It is not the swbeg directive that causes trouble, but if this macro is not defined, it will be defined in elfos.h to align to 4 bytes. That might emit a zero value before the table, making it misaligned with the jump in https://github.com/freemint/m68k-atari-mint-gcc/blob/cb42ec3d119f6b2f02f5abb282397977dbae07f0/gcc/config/m68k/m68kelf.h#L69.
Where to put the jump table. As it is read-only data, it should go into the .rodata section.
No definitely not. It must immediately follow the jmp statement above.
Where to put the jump table. As it is read-only data, it should go into the .rodata section.
No definitely not. It must immediately follow the jmp statement above.
.rodata
section is for read-only data. Any offset table is read-only data. So jump tables should theoretically go to .rodata
, like any data. But that's a non-topic, as we all know that it would produce less efficient code, table being potentially far from the jump. And as you notice, this wouldn't be compatible with actual jump code. Be sure we completely agree.
Why did you remove
undef ASM_OUTPUT_BEFORE_CASE_LABEL
?
I did it to have the same configuration as linux.h, as we have the same goal: word jump tables in text section, without swbeg, just after the jump. Similar configurations will ease upgrading to new gcc versions. Of course I carefully checked that the result was correct (not changed).
[still about ASM_OUTPUT_BEFORE_CASE_LABEL] if this macro is not defined, it will be defined in elfos.h to align to 4 bytes.
Yes, but it isn't used.
ASM_OUTPUT_BEFORE_CASE_LABEL
is used only inside ASM_OUTPUT_CASE_LABEL
definition inside elfos.h:
https://github.com/freemint/m68k-atari-mint-gcc/blob/cb42ec3d119f6b2f02f5abb282397977dbae07f0/gcc/config/elfos.h#L149-L156
And now mint.h, just like linux.h, uses #undef ASM_OUTPUT_CASE_LABEL. So ASM_OUTPUT_BEFORE_CASE_LABEL isn't used anymore.
That might emit a zero value before the table, making it misaligned with the jump
I remember that you had trouble with that, and that I also had similar trouble with old gcc 4.
With the testcase from the top of this thread:
$ m68k-atari-mintelf-gcc -Os -c swi.c
$ m68k-atari-mintelf-objdump -d swi.o
swi.o: file format elf32-m68k
Disassembly of section .text:
00000000 <f>:
0: 202f 0004 movel %sp@(4),%d0
4: 7207 moveq #7,%d1
6: b280 cmpl %d0,%d1
8: 6524 bcss 2e <f+0x2e>
a: d080 addl %d0,%d0
c: 303b 0806 movew %pc@(14 <f+0x14>,%d0:l),%d0
10: 4efb 0002 jmp %pc@(14 <f+0x14>,%d0:w)
14: 0010 001c orib #28,%a0@
18: 0028 0034 0040 orib #52,%a0@(64)
1e: 004c .short 0x004c
20: 0058 0064 oriw #100,%a0@+
24: 23fc 0000 17c2 movel #6082,0 <f>
2a: 0000 0000
2e: 4e75 rts
30: 23fc 0000 2474 movel #9332,0 <f>
36: 0000 0000
3a: 60f2 bras 2e <f+0x2e>
...
Sorry for the ugly disassembly. We can see the jump at offset 0x10. It is immediately followed by 8 short offsets, as expected. Then by a move.l, as first case. Note that the table is at offset $14, not a multiple of 4. This code works well, I traced it with MonST2.
Maybe there was another special case in the code that caused you trouble with wrong table alignment? Any idea for another testcase?
Maybe there was another special case in the code that caused you trouble with wrong table alignment?
Yes, because in the example above, the first word of the jumptable happens to be at a 4byte aligned address (0x14). You would need to produce an example where that is not the case.
Any idea for another testcase?
When i encountered that issue, it was some table in the unwinding code. But i'm quite sure there are easier cases to reproduce it. But i'm quite sure that that it is still needed, because i had to backport it also for older gcc versions.
Note that the table is at offset $14, not a multiple of 4.
D'oh! My bad. I've been fooled by decimal. $14 is of course a multiple of 4.
Yes, because in the example above, the first word of the jumptable happens to be at a 4byte aligned address (0x14). You would need to produce an example where that is not the case.
I will do.
This was easy. I just had to add a nop to change the alignment.
swi.c
int g;
void f(int i)
{
asm("nop");
switch (i)
{
case 0: g = 6082; break;
case 1: g = 9332; break;
case 2: g = 5642; break;
case 3: g = 1423; break;
case 4: g = 2152; break;
case 5: g = 6779; break;
case 6: g = 7074; break;
case 7: g = 8280; break;
}
}
$ m68k-atari-mintelf-gcc -Os -S swi.c -o -
#NO_APP
.file "swi.c"
.text
.align 2
.globl f
.type f, @function
f:
move.l 4(%sp),%d0
#APP
| 5 "swi.c" 1
nop
| 0 "" 2
#NO_APP
moveq #7,%d1
cmp.l %d0,%d1
jcs .L1
add.l %d0,%d0
move.w .L4(%pc,%d0.l),%d0
jmp %pc@(2,%d0:w)
.balignw 2,0x284c
.L4:
.word .L11-.L4
.word .L10-.L4
...
$ m68k-atari-mintelf-gcc -Os -c swi.c
$ m68k-atari-mintelf-objdump -d swi.o
swi.o: file format elf32-m68k
Disassembly of section .text:
00000000 <f>:
0: 202f 0004 movel %sp@(4),%d0
4: 4e71 nop
6: 7207 moveq #7,%d1
8: b280 cmpl %d0,%d1
a: 6524 bcss 30 <f+0x30>
c: d080 addl %d0,%d0
e: 303b 0806 movew %pc@(16 <f+0x16>,%d0:l),%d0
12: 4efb 0002 jmp %pc@(16 <f+0x16>,%d0:w)
16: 0010 001c orib #28,%a0@
1a: 0028 0034 0040 orib #52,%a0@(64)
20: 004c .short 0x004c
22: 0058 0064 oriw #100,%a0@+
26: 23fc 0000 17c2 movel #6082,0 <f>
2c: 0000 0000
30: 4e75 rts
32: 23fc 0000 2474 movel #9332,0 <f>
38: 0000 0000
3c: 60f2 bras 30 <f+0x30>
...
This time the jump table is really at an address non-multiple of 4: 0x16. But it's still correct. No additional filler in the table.
Phew! I finally managed to reproduce @th-otto's issue. But I needed to add both -mlong-jump-table-offsets
and -malign-int
.
Key point is the code in the ASM_RETURN_CASE_JUMP
macro. It uses instructions like jmp %%pc@(2,%0:w)
. Note that the offset "2" is hardcoded. That's a nonsense, because it should normally be the label of the first jump table entry. I guess this has been done that way because that macro has no way to know the name of the jump table label. As a consequence, the jump table must immediately follow that jmp, otherwise the offset will be wrong.
By default, the generated code is like this:
move.w .L4(%pc,%d0.l),%d0
jmp %pc@(2,%d0:w)
.balignw 2,0x284c
.L4:
.word .L11-.L4
.word .L10-.L4
That's good. Even if the .balignw 2
is useless, as 68k instructions are always aligned on 2.
But:
then a filler would be inserted, causing the jmp offset being wrong.
I managed to produce such situation that way. Due to alignment randomness, I needed to use the non-nop testcase.
$ m68k-atari-mintelf-gcc -Os -S -o - swi.c -mlong-jump-table-offsets -malign-int
...
move.l .L4(%pc,%d0.l),%d0
jmp %pc@(2,%d0:l)
.balignw 4,0x284c
.L4:
.long .L11-.L4
.long .L10-.L4
...
When misaligned, it produces the following disassembly (wrong, as expected):
16: 203b 0808 movel %pc@(20 <f+0x20>,%d0:l),%d0
1a: 4efb 0802 jmp %pc@(1e <f+0x1e>,%d0:l)
1e: 284c moveal %a4,%a4
20: 0000 0020 orib #32,%d0
24: 0000 002c orib #44,%d0
Of course the movea.l a4,a4
is a nonsense: never run, but moving the jump table too far.
This only happens when 2 conditions are met:
-mlong-jump-table-offsets
)-malign-int
)This looks a bit far-fetched. I guess that @th-otto's real-world testcase used different options, but with the same result.
Then question is where that .balignw
comes from. Long story short: it's about the ADDR_VEC_ALIGN
macro. When not defined, it defaults to BIGGEST_ALIGNMENT
And that BIGGEST_ALIGNMENT depends if -malign-int
is used or not.
Solution: Force ADDR_VEC_ALIGN
to 0:
https://github.com/freemint/m68k-atari-mint-gcc/commit/67b1e5ec6dc4eaf70cf95807f03aa252ad499262
I pushed that: no more .balignw
, no more potential bug.
move.l .L4(%pc,%d0.l),%d0
jmp %pc@(2,%d0:l)
.L4:
.long .L11-.L4
.long .L10-.L4
Crazy thing is that m68k-elf and m68k-linux suffers of the same bug! So this is now fixed for the mintelf target, but remains a problem for other m68k targets. I will open an issue in GCC Bugzilla for that.
EDIT: Done. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112413
Summary:
ASM_RETURN_CASE_JUMP
is caused by a general bug inside that macro (hardcoded offset). Among others, it affects m68k-elf and m68k-linux targets.ADDR_VEC_ALIGN
. I did that on the mintelf branch, there is no more trouble. Generated code is always safe and optimal.Maybe the GCC team will officially fix that bug, or keep it opened for decades. Anyway, the ADDR_VEC_ALIGN
trick works well and safely, and keeps us out of trouble. So I propose to close this issue.
Nice job, one more issue down. Btw thanks to your work on the mintelf target I have found a bug in gcc, too: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112712. :)
The GCC team has fixed the bug on the master
branch for the upcoming GCC 14.
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=eea25179d8d1406685b8b0995ba841605f895283;hp=f5fc001a84a7dbb942a6252b3162dd38b4aae311
"No plans to backport." ... pff. So what, do we do that by ourselves or just upgrade to 14.0 when released?
We can also backport it ourselves. But i would still stick to the settings that make sure that no nop is inserted there, since that is a total waste.
What would be more interesting, is to have a function attribute for long-jump-table-offsets instead of using a global switch, since that is really rarely needed (only when the function is >32K, i've only encountered that so far in the z88dk parser)
Question is how to configure
ASM_OUTPUT_BEFORE_CASE_LABEL
in GCC. We already discussed that, I open this issue for reference and documentation.A jump table is a data table used for gcc to efficiently compile big
switch
statements. This particularly occurs when there are many contiguouscase
statements. Offsets of different case labels are stored inside an array indexed by the switch value.Testcase: swi.c
There are 2 issues:
1) Where to put the jump table. As it is read-only data, it should go into the
.rodata
section. On the other hand, if the jump table is located in the.text
section just after the index computation instructions, it can benefit of compact pc-relative addressing. Drawback of.text
location is that it wastes some space in the instruction cache (only available on advanced processors).2) The SVR4 ABI requires a special directive
.swbeg
before the actual jump table, when put inside the.text
section. It serves 2 purposes:.text
section..swbeg
insert 2 words:So when
.swbeg
is used, the jump instruction just before it must be adjusted to take in account that extra offset.Reference SVR4 documentation: http://www.bitsavers.org/pdf/motorola/sysv/System_V_68_Assembler_Users_Guide_Nov1984.pdf
gas/gdb instruction definition for
.swbeg
: https://github.com/freemint/m68k-atari-mint-binutils-gdb/blob/90d470c102e6f79a8625df90706109849404d86c/opcodes/m68k-opc.c#L2179-L2186Next I'm going to check what gcc does in various configurations.