littlekernel / lk

LK embedded kernel
MIT License

Exception occurs when SVE instructions used on aarch64 #373

Open schultetwin1 opened 1 year ago


SVE (Scalable Vector Extension) is an extension to the aarch64 architecture. It was introduced as an optional extension in ARMv8.2-A and became built in for ARMv9-A. LK currently does not disable the traps for SVE use, meaning you will get an exception if you use SVE instructions.

This is particularly a problem if an SVE instruction executes before VBAR_EL1 has been configured, as happens in LK's initialization code. The stack trace is:

0: memset
1: init_thread_struct
2: thread_init_early
3: lk_main

Clang/LLVM will insert an SVE instruction into LK's implementation of memset when optimizing code.

To work around this, we have added the following two compilation flags to clang.

This workaround works great for us because we don't need to use any SIMD instructions (SVE is a type of SIMD). I'm opening this issue as an FYI in case others run into it.

Below is a list of what I believe needs to be done to get SVE instructions working in LK, in case anyone needs to do this. This has not been tested and there may be more steps.

1) Disable the EL2 coprocessor traps for SVE in arm64_el3_to_el1 https://github.com/littlekernel/lk/blob/ca633e2cb2e8029cc0312829a39ef6b1c31b65f1/arch/arm64/asm.S#L56-L58 That code should (I believe) read:

    mov x0, #0x333ff
    msr cptr_el2, x0

This sets the ZEN bits in CPTR_EL2. The same change needs to be made in arm64_elX_to_el1.

2) Disable the SVE traps in EL1 https://github.com/littlekernel/lk/blob/ca633e2cb2e8029cc0312829a39ef6b1c31b65f1/arch/arm64/asm.S#L60-L62

    mov x0, #((0b11<<20) | (0b11 << 16))
    msr cpacr_el1, x0

This sets the FPEN and ZEN bits in the CPACR_EL1 register. It also needs to be done in arm_reset.

3) Properly configure the ZCR_ELx control registers

4) Update arm64_fpu_exception to handle SVE exceptions as well for the lazy loading of SVE registers (I'm not sure if they overlap with the FPU registers or not).

5) Update arm64_fpu_pre_context_switch to properly lazy save SVE registers
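
For anyone attempting the steps above, here is a minimal C sketch of the register values and exception classes involved. It is untested on hardware and is not LK code: every macro and function name in it is hypothetical, and the bit positions are my reading of the Arm Architecture Reference Manual. One thing to double-check in step 1: bits [17:16] of CPTR_EL2 are the ZEN field only when HCR_EL2.E2H is 1. When E2H is 0, the SVE trap appears to be the TZ bit (bit 8), which is set in the ARMv8.0 RES1 pattern 0x33ff, so the sketch clears that bit instead.

```c
#include <stdint.h>

/* All names below are hypothetical helpers for illustration, not LK APIs. */

/* CPACR_EL1: FPEN is bits [21:20], ZEN is bits [17:16]; 0b11 means "no trap". */
#define CPACR_EL1_FPEN_NO_TRAP (UINT64_C(3) << 20)
#define CPACR_EL1_ZEN_NO_TRAP  (UINT64_C(3) << 16)

/* CPTR_EL2 with HCR_EL2.E2H == 0: TZ (bit 8) traps SVE when set.
 * 0x33ff is the ARMv8.0 RES1 pattern, which includes bit 8. */
#define CPTR_EL2_RES1_V8_0 UINT64_C(0x33ff)
#define CPTR_EL2_TZ        (UINT64_C(1) << 8)

/* ESR_ELx exception classes: 0x07 is a trapped FP/SIMD access,
 * 0x19 is an SVE access trapped by ZEN/TZ. */
#define ESR_EC_FP_SIMD 0x07u
#define ESR_EC_SVE     0x19u

/* Value for CPACR_EL1 that leaves both FP/SIMD and SVE untrapped at EL1/EL0. */
static inline uint64_t cpacr_el1_no_fp_sve_traps(void) {
    return CPACR_EL1_FPEN_NO_TRAP | CPACR_EL1_ZEN_NO_TRAP;
}

/* Value for CPTR_EL2 (E2H == 0 layout) with the SVE trap bit cleared. */
static inline uint64_t cptr_el2_no_sve_trap_e2h0(void) {
    return CPTR_EL2_RES1_V8_0 & ~CPTR_EL2_TZ;
}

/* ZCR_ELx.LEN is bits [3:0]; the requested vector length is (LEN + 1) * 128 bits.
 * Returns the LEN encoding, or -1 if vl_bits is not a multiple of 128
 * in the range 128..2048. */
static inline int zcr_len_for_vector_bits(unsigned vl_bits) {
    if (vl_bits < 128 || vl_bits > 2048 || (vl_bits % 128) != 0)
        return -1;
    return (int)(vl_bits / 128) - 1;
}

/* Extract the exception class (bits [31:26]) from an ESR_ELx value. */
static inline unsigned esr_exception_class(uint64_t esr) {
    return (unsigned)((esr >> 26) & 0x3f);
}
```

The CPACR_EL1 helper yields the same constant as the assembly in step 2. For step 4, a handler could compare esr_exception_class() against both ESR_EC_FP_SIMD and ESR_EC_SVE, since SVE traps report a different exception class than plain FP/SIMD traps.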