littlekernel / lk

LK embedded kernel
MIT License

Cortex A8 target generating SIMD code in timer_tick interrupt handler #406

samcday closed this issue 1 month ago

samcday commented 1 month ago

I'm running LK on an msm8916 device in arm32 mode, targeting the cortex-a8 CPU.

When testing a simple sleepm 1, I get the following fault:

entering main console loop
] sleepm 1
panic (caller 0x80001505): floating point code in irq context. pc 0x800080aa
HALT: spinning forever... (reason = 9)

Dumping that PC reveals:

$ arm-none-eabi-addr2line -p -e build-msm8916/lk.elf 0x800080aa 
/var/home/sam/src/lk/top/include/lk/list.h:63

$ arm-none-eabi-objdump -S -d build-msm8916/lk.elf | grep -C4 800080aa
800080a2:   f040 8083   bne.w   800081ac <timer_tick+0x19c>
    item->next->prev = item->prev;
800080a6:   e9d8 2001   ldrd    r2, r0, [r8, #4]
    item->prev = item->next = 0;
800080aa:   efc0 0010   vmov.i32    d16, #0 @ 0x00000000
    item->next->prev = item->prev;
800080ae:   6002        str r2, [r0, #0]
    item->prev->next = item->next;
800080b0:   6050        str r0, [r2, #4]

If I change arm/toolchain.mk to -mfpu=vfpv3 for cortex-a8 the issue goes away.
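To look at the codegen outside the LK tree, here is a minimal standalone sketch. The three statements in `list_delete` are the source lines interleaved in the objdump output above. The struct layout, the other helpers, and `list_delete_demo` are assumptions added for illustration, not copied from LK.

```c
#include <stddef.h>

/* Assumed layout: prev and next are adjacent pointers, so on arm32
 * they occupy 8 contiguous bytes that a single 64-bit store can clear. */
struct list_node {
    struct list_node *prev;
    struct list_node *next;
};

/* Make an empty circular list: the head points at itself. */
static void list_initialize(struct list_node *list) {
    list->prev = list->next = list;
}

/* Link item in just before the head, i.e. at the tail of the list. */
static void list_add_tail(struct list_node *list, struct list_node *item) {
    item->prev = list->prev;
    item->next = list;
    list->prev->next = item;
    list->prev = item;
}

/* Same three statements as the source lines shown in the disassembly.
 * noinline keeps this as a separate symbol so it is easy to find. */
__attribute__((noinline)) void list_delete(struct list_node *item) {
    item->next->prev = item->prev;
    item->prev->next = item->next;
    item->prev = item->next = 0; /* the pair of stores that became vmov.i32 */
}

/* Hypothetical helper: returns 1 if deleting the only node leaves an
 * empty list and a node with both pointers cleared. */
int list_delete_demo(void) {
    struct list_node head, node;
    list_initialize(&head);
    list_add_tail(&head, &node);
    list_delete(&node);
    return head.next == &head && head.prev == &head &&
           node.prev == NULL && node.next == NULL;
}
```

Compiling this with something like `arm-none-eabi-gcc -O2 -mcpu=cortex-a8 -mfpu=neon -mfloat-abi=softfp -S` and again with `-mfpu=vfpv3` should make the difference visible in the generated assembly, though whether the NEON store appears will depend on the compiler version and optimization level.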

My GCC version:

$ arm-none-eabi-gcc --version
arm-none-eabi-gcc (Fedora 13.2.0-5.fc40) 13.2.0

travisg commented 1 month ago

Interesting! Mixing float and non-float code in the kernel has always been an issue one way or another, but it seems the compiler is getting more aggressive about using vector instructions to speed up non-float code. As far as I know, there's no switch to turn that off aside from telling the compiler there's no vector unit.

Will have to think about what to do there, but I think partially the answer is probably to start compiling more parts of the kernel, like whatever module this particular code is in, without floating point in the MODULE_OPTIONS options. It's also possible the older ARM32 stuff doesn't have the float/nofloat stuff wired up properly. It's been a solid 15 years since I've seen a cortex-a8 in the wild.
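One way to check that a given module really is built without the vector unit is to test the compiler's feature macros. The sketch below assumes the toolchain follows the Arm C Language Extensions, where `__ARM_NEON` is defined when Advanced SIMD is enabled and `__ARM_FP` when hardware floating point is available. `KERNEL_NOFLOAT_MODULE` is a hypothetical macro invented for this example, not something LK defines.

```c
#include <stdio.h>

/* Hypothetical guard: if a build system defined KERNEL_NOFLOAT_MODULE for
 * modules that must not touch FP/SIMD state, this would fail the build
 * whenever the compiler is still allowed to emit NEON instructions. */
#if defined(KERNEL_NOFLOAT_MODULE) && defined(__ARM_NEON)
#error "no-float module compiled with NEON enabled"
#endif

/* Returns 1 if this translation unit was compiled with NEON enabled. */
static int built_with_neon(void) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if this translation unit was compiled with hardware FP. */
static int built_with_hw_fp(void) {
#if defined(__ARM_FP)
    return 1;
#else
    return 0;
#endif
}

/* Print what the compiler was told about the FP/SIMD units. */
static void report_fp_config(void) {
    printf("NEON enabled: %d, hardware FP: %d\n",
           built_with_neon(), built_with_hw_fp());
}
```

The macros only reflect the flags used for that one file, so a check like this would need to live in, or be included by, each module that is expected to stay free of vector code.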

samcday commented 1 month ago

I found it quite surprising that such a simple piece of code (a_ptr = b_ptr = 0) was lowered to a SIMD instruction.

> but I think partially the answer is probably to start compiling more parts of the kernel, like whatever module this particular code is in, without floating point in the MODULE_OPTIONS options

Seems reasonable. What emboldened me to open this issue was seeing your comment elsewhere noting that FPU/NEON isn't expected to work in the bowels of kernel mode.

> It's been a solid 15 years since I've seen a cortex-a8 in the wild.

Heh, indeed. Actually the msm8916 SoCs are cortex-a53, but I haven't managed to get LK arm64 to boot properly yet, so I'm still using arm32 and targeting cortex-a8 the same way lk2nd is.

travisg commented 1 month ago

I think I have it fixed with this CL above. Basically, the arm32 arch needed to support the float/nofloat ARCH_COMPILEFLAGS. I had done it for the 64-bit arches, but hadn't gone back and added it to arm32.

travisg commented 1 month ago

Okay, needed a second commit to fix it a bit better for older compilers. Give it a whirl!

samcday commented 1 month ago

Confirmed, works a treat. Thanks a bunch!