Sys_ContextSave1
MRS R0,PSP ; Get PSP
SUBS R0,R0,#32 ; Adjust address // NOTE: SP is lowered by 32 bytes to make room for R4..R11. These registers should all be 0 at this point, because they are cleared on the switch from secure to nonsecure
STR R0,[R1,#TCB_SP_OFS] ; Store SP
STMIA R0!,{R4-R7} ; Save R4..R7
MOV R4,R8
MOV R5,R9
MOV R6,R10
MOV R7,R11
STMIA R0!,{R4-R7} ; Save R8..R11
How does the issue happen? Thread A has 'tz' set to 0 while it calls a secure API and is running in the secure world. A nonsecure system tick interrupt occurs, Thread A is switched to the ready state, and its context is saved: https://github.com/ARM-software/CMSIS_5/blob/5.5.0/CMSIS/RTOS2/RTX/Source/ARM/irq_armv8mbl.s#L260
Then Thread B starts to run. It runs only in the nonsecure world, and it enters the waiting state while acquiring a mutex. Thread A is then restored in the SVC handler. But when restoring the context of Thread A, the handler does not pop the R4~R11 registers from its SP: https://github.com/ARM-software/CMSIS_5/blob/5.5.0/CMSIS/RTOS2/RTX/Source/ARM/irq_armv8mbl.s#L150
As a result, the SP of Thread A still points at the stacked R4~R11. When Thread A returns from the secure world to the nonsecure world, a wrong LR value is popped from the SP. The popped value is always 0, because the word actually popped is the one stored through R0 at https://github.com/ARM-software/CMSIS_5/blob/5.5.0/CMSIS/RTOS2/RTX/Source/ARM/irq_armv8mbl.s#L260. So a secure fault happens when executing bx lr.
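To make the stack bookkeeping concrete, here is a small host-side C model of the sequence described above. It is only a sketch under stated assumptions, not RTX code: the names `thread_model_t`, `save_nonsecure_path`, `restore_secure_path`, `restore_matching_path`, and `run_scenario` are made up for illustration, the stack is modelled as an array indexed in words (8 words = 32 bytes), and `RETURN_ADDR` is an arbitrary value standing in for the LR that Thread A pushed before calling the secure API. `restore_matching_path` exists only to show the symmetric case for contrast; it is not a proposed patch.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS  64u
#define CTX_WORDS    8u            /* R4..R11 = 8 words = 32 bytes */
#define RETURN_ADDR  0x00201235u   /* arbitrary illustrative LR value */

/* Hypothetical model of one thread's nonsecure stack and TCB. */
typedef struct {
    uint32_t stack[STACK_WORDS];   /* full-descending stack, word indexed */
    uint32_t psp;                  /* models PSP */
    uint32_t tcb_sp;               /* models the word at TCB_SP_OFS */
} thread_model_t;

/* Models the save listing: lower SP by 32 bytes, record it, store R4..R11. */
static void save_nonsecure_path(thread_model_t *t,
                                const uint32_t r4_r11[CTX_WORDS])
{
    assert(t->psp >= CTX_WORDS);
    t->tcb_sp = t->psp - CTX_WORDS;
    for (uint32_t i = 0u; i < CTX_WORDS; i++) {
        t->stack[t->tcb_sp + i] = r4_r11[i];
    }
}

/* Models the restore as described in the post: PSP is reloaded from
 * the TCB, but the 8 saved words are not popped. */
static void restore_secure_path(thread_model_t *t)
{
    t->psp = t->tcb_sp;
}

/* Hypothetical symmetric restore: step back over the 8 saved words. */
static void restore_matching_path(thread_model_t *t)
{
    t->psp = t->tcb_sp + CTX_WORDS;
}

/* Models a single-word pop (for example, popping the saved LR). */
static uint32_t pop_word(thread_model_t *t)
{
    return t->stack[t->psp++];
}

/* Runs the Thread A sequence and returns the value popped as LR. */
static uint32_t run_scenario(int use_matching_restore)
{
    thread_model_t a = {0};
    const uint32_t zeroed[CTX_WORDS] = {0};  /* R4..R11 cleared on entry */

    a.psp = 40u;
    a.stack[--a.psp] = RETURN_ADDR;     /* Thread A pushes LR, then calls the secure API */
    save_nonsecure_path(&a, zeroed);    /* nonsecure tick: context saved */
    if (use_matching_restore) {
        restore_matching_path(&a);
    } else {
        restore_secure_path(&a);        /* Thread A restored without the pop */
    }
    return pop_word(&a);                /* return to nonsecure: pop LR */
}
```

In this model, `run_scenario(0)` yields 0 (the first zeroed word of the saved context), which matches the bad LR described above, while `run_scenario(1)` yields `RETURN_ADDR`.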