zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.97k stars 6.68k forks source link

Local ISR declaration parser can break when .intList input sections are padded to an alignment >4 #81254

Open jonathonpenix opened 2 weeks ago

jonathonpenix commented 2 weeks ago

Describe the bug

The local ISR declaration parser (CONFIG_ISR_TABLES_LOCAL_DECLARATION=y) issues incorrect errors about bad IRQ values if any extra padding is inserted by the linker between the .irq_info and .intList input sections in the .intList output section definition (in intlist.ld). Currently, this can happen "naturally" when any opt level besides CONFIG_SIZE_OPTIMIZATIONS is specified and we're building with a GCC-based toolchain for AArch64.

See "Additional context" for additional info/discussion on what I think is happening.

To Reproduce

This shouldn't be limited to just the one test, but this can be reproduced as below using the Zephyr SDK/Zephyr's tests:

west twister -c -W -p qemu_cortex_a53 -s tests/kernel/interrupt/arch.shared_interrupt.lto -x=CONFIG_SPEED_OPTIMIZATIONS=y

The build should fail (see "Logs and console output" for error).

Expected behavior

We shouldn't produce incorrect/unexpected errors about invalid IRQ numbers that the user did not specify.

Impact

Generally, I think any build will currently fail when the following are true:

  1. A GCC-based toolchain is used to target AArch64
  2. CONFIG_ISR_TABLES_LOCAL_DECLARATION=y is selected
  3. An optimization level other than -Os/CONFIG_SIZE_OPTIMIZATIONS is used
  4. .intList isn't coincidentally aligned on an 8-byte boundary (so the linker will add padding)

I think this should also impact some non-AArch64 targets (ex: RISC-V 64) if/when ISR_TABLES_LOCAL_DECLARATION is ported, but currently AArch64 is the only impacted target I think. 32bit Arm seems ok experimentally.

Logs and console output

west twister -c -W -p qemu_cortex_a53 -s tests/kernel/interrupt/arch.shared_interrupt.lto -x=CONFIG_SPEED_OPTIMIZATIONS=y produces the following error:

[119/122] Generating isr_tables.c, isr_tables_vt.ld, isr_tables_swi.ld
FAILED: zephyr/isr_tables.c zephyr/isr_tables_vt.ld zephyr/isr_tables_swi.ld /workdir/zephyr/twister-out/qemu_cortex_a53_qemu_cortex_a53/tests/kernel/interrupt/arch.shared_interrupt.lto/zephyr/isr_tables.c /workdir/zephyr/twister-out/qemu_cortex_a53_qemu_cortex_a53/tests/kernel/interrupt/arch.shared_interrupt.lto/zephyr/isr_tables_vt.ld /workdir/zephyr/twister-out/qemu_cortex_a53_qemu_cortex_a53/tests/kernel/interrupt/arch.shared_interrupt.lto/zephyr/isr_tables_swi.ld
cd /workdir/zephyr/twister-out/qemu_cortex_a53_qemu_cortex_a53/tests/kernel/interrupt/arch.shared_interrupt.lto/zephyr && /usr/bin/python3 /workdir/zephyr/scripts/build/gen_isr_tables.py --output-source isr_tables.c --linker-output-files isr_tables_vt.ld isr_tables_swi.ld --kernel /workdir/zephyr/twister-out/qemu_cortex_a53_qemu_cortex_a53/tests/kernel/interrupt/arch.shared_interrupt.lto/zephyr/zephyr_pre0.elf --intlist-section .intList --intlist-section intList --sw-isr-table
gen_isr_tables.py: error: IRQ 1903323438 (offset=0) exceeds the maximum of 219                                                                             

Environment (please complete the following information):

Additional context

Using west twister -c -W -p qemu_cortex_a53 -s tests/kernel/interrupt/arch.shared_interrupt.lto -x=CONFIG_SPEED_OPTIMIZATIONS=y as an example, I think the problem is that GCC/ld currently add additional padding that the parser doesn't expect/know how to deal with. Relevant snippet from zephyr_pre0.map (note the *fill*):

.intList        0x00000000ffff8000      0x1a1
 *(SORT_BY_ALIGNMENT(.irq_info*))
 .irq_info      0x00000000ffff8000       0x14 /tmp/ccJNSHOl.ltrans1.ltrans.o
                0x00000000ffff8000                _iheader
 *(SORT_BY_ALIGNMENT(.intList*))
 *fill*         0x00000000ffff8014        0x4
 .intList       0x00000000ffff8018      0x189 /tmp/ccJNSHOl.ltrans1.ltrans.o

The IDT_LIST region starts at 0xffff8000 (from linker.ld), irq_info is (reasonably) 20 bytes, and .intList is padded to an 8-byte boundary (so, a 4-byte fill is added). The parser seems to assume that the section will only be 4-byte aligned (or, that there won't be any padding more generally), so ends parsing the input incorrectly and produces unexpected values for IRQ numbers, etc. which leads to the failure.

IIUC, the fill is inserted by ld as GCC seems to increase the alignment of the .intList input sections here (/the structs within) to 8-bytes at all opt levels besides -Os for AArch64. My understanding is that this is done intentionally by GCC, see here for AArch64.

Note that this isn't limited to AArch64--see here for RISC-V, for example. Experimentally, GCC seems willing to 8-byte align these sections for AArch64 and RISC-V 64, but 32bit Arm/RISC-V seem ok (4-byte aligned).

FWIW, it doesn't seem like clang/LLVM does this currently (though I suppose it could in the future).

jonathonpenix commented 2 weeks ago

One thought on possible fixes for this: I think GCC does still respect explicit alignment requests here.

So, I think it might be possible to prevent the increased alignment in the first place (and thus the extra padding) by doing something like __attribute__((aligned(__alignof__(struct mystruct)))) where this matters? Ex: https://godbolt.org/z/1jozK4vzx