Open mat-c opened 4 days ago
FWIW the options do not work for ARM: https://inbox.sourceware.org/gcc/m3fxdfudnc.fsf@google.com/T/#t They do work for M68k: https://gcc.godbolt.org/z/Y3cnE3v4n
It would be amazing if the options were supported by clang. In my baremetal development I constantly overwrite code / heap with stack. Needless to say it is hard to debug such issues.
@s-barannikov yes, the gcc one is buggy. It is only half working: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117795
I can do a POC by hacking the ARM split-stack code in llvm. But for a clean version, help will be needed from the llvm devs.
PS: example with
clang -target thumb-v6a-linux-eabi toto_s.c -fsplit-stack -Os -S -mcpu=cortex-m0 -o totos_m0.s
clang -target thumb-v7a-linux-eabi toto_s.c -fsplit-stack -Os -S -mcpu=cortex-m4 -o totos_m4.s
You can see the prologue inserted by llvm. For a stack-limit-only check, it could be optimized and made to work for thumb-vxm.
Hi,
Today llvm (and so clang, rustc, ...) does not offer an option to check the stack limit on hardware where a stack probe is not possible, for example an MCU without an MMU or a hardware stack limit. An MPU allows some checks, but has some limitations.
gcc has the -fstack-limit-register and -fstack-limit-symbol options. These allow the stack limit to be taken from a register or from a global symbol.
The Arm documentation gives some examples of how to implement a software stack limit:
Arm's proposal is to place the stack limit value at least 256 bytes above the real end of the stack. This allows small stack allocations (less than 256 bytes) to use a quick check: a direct comparison against the limit, with no arithmetic instruction, which can also use a high register. The 256 (STACK_GUARD) could be a configuration option, to limit how much stack is reserved.
For small stack allocations (less than STACK_GUARD)
For bigger frames (STACK_GUARD or more) or VLAs, we need to compute the new stack position
See [1] for some arm assembly.
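To make the two cases above concrete, here is a rough C model of the checks. It is only a sketch under my own assumptions, not what gcc or llvm emit: the names `stack_limit_for`, `small_frame_ok` and `large_frame_ok` are hypothetical, the check is assumed to run before the frame is allocated, and STACK_GUARD is fixed at 256 for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed guard size: the 256 bytes discussed above, meant to be configurable. */
#define STACK_GUARD 256u

/* Hypothetical helper: the limit sits STACK_GUARD bytes above the real
   end (lowest usable address) of the stack. */
static inline uintptr_t stack_limit_for(uintptr_t stack_bottom)
{
    assert(stack_bottom <= UINTPTR_MAX - STACK_GUARD);
    return stack_bottom + STACK_GUARD;
}

/* Frame smaller than STACK_GUARD: compare SP with the limit directly.
   No arithmetic is needed because the guard area absorbs the frame. */
static inline bool small_frame_ok(uintptr_t sp, uintptr_t limit)
{
    return sp >= limit;
}

/* Frame of STACK_GUARD bytes or more, or a VLA: compute the new stack
   position first, then compare it with the limit. */
static inline bool large_frame_ok(uintptr_t sp, uintptr_t limit,
                                  uintptr_t frame_size)
{
    if (frame_size > sp)
        return false; /* the subtraction would wrap around */
    return sp - frame_size >= limit;
}
```

In a real prologue a failed check would branch to some overflow handler; which handler, and how it is called, is left open here.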
llvm already has some support for emitting a prologue that depends on the stack allocation (split stack, stack probe, ...).
Would it be possible to implement the stack limit check the same way?
[1]
register version in thumb1: 64 bits of instructions (2 * 16 + 32)
symbol version in thumb1: 128 bits of instructions (4 * 16 + 32 + 32)
In thumbv2, with the sub.w instruction, we could have a small STACK_GUARD: we can do the arithmetic operation and use a high register.
If the stack allocation fits in 12 bits, register version
If the stack allocation fits in 12 bits, symbol version
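As a rough illustration of how a prologue emitter might pick between these variants from the frame size, here is a small C sketch. Everything in it is hypothetical: the enum, the `pick_check` name and the thresholds are my assumptions, and the fallback for frames that do not fit in 12 bits is not covered above at all.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_GUARD 256u  /* assumed guard size, meant to be configurable */
#define IMM12_MAX   4095u /* largest value that fits in 12 bits */

/* Hypothetical classification of the check to emit for a given frame. */
enum check_kind {
    CHECK_DIRECT_COMPARE,  /* frame < STACK_GUARD: compare SP with the limit */
    CHECK_SUB_IMM12,       /* frame fits in 12 bits: one subtract, then compare */
    CHECK_SUB_MATERIALIZED /* assumed fallback: build the size in a register first */
};

static inline enum check_kind pick_check(uint32_t frame_size)
{
    if (frame_size < STACK_GUARD)
        return CHECK_DIRECT_COMPARE;
    if (frame_size <= IMM12_MAX)
        return CHECK_SUB_IMM12;
    return CHECK_SUB_MATERIALIZED;
}
```

With a smaller STACK_GUARD, more frames would fall into the subtract-then-compare case, which is the trade-off mentioned for thumbv2.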