d-ronin / dRonin

The dRonin flight controller software.
http://dronin.org
Other
289 stars 167 forks source link

ChibiOS: Invalid MSP from bootloader causes main stack to overlap process stack #1875

Open tracernz opened 7 years ago

tracernz commented 7 years ago

Follow-up from https://github.com/d-ronin/dRonin/pull/1874

Some bootloaders, due to unintended code generation, end up incrementing (positive) the main stack pointer by a number of bytes (8 for LibrePilot v6, 16 for dRonin recent releases). This causes the first few entries in the main stack to step on whatever happened to follow it in memory. In the bootloader updater this happened to be the flash partition table which caused the updater to fail. In firmware, this is the process stack (thread stacks): https://github.com/d-ronin/dRonin/blob/next/flight/PiOS/STM32F4xx/sections_chibios.ld#L127. Fortunately this is at the end of the stack of whatever thread happens to be first, so unless it completely fills it's stack it's probably only a theoretical issue (and filling stacks with the pattern used to detect stack usage happens in ChibiOS's ResetHandler, before any stack is ever used).

Suggested fix is either:

@mlyle, thoughts?

mlyle commented 7 years ago

Does that initial stack survive to later run, or is it just used to kickstart chibios and get the init task running?

(Does chibios just blindly reuse the initial stack as an irq stack?)

tracernz commented 7 years ago

That is the IRQ stack: https://github.com/d-ronin/dRonin/blob/next/flight/PiOS/Common/Libraries/ChibiOS/os/ports/GCC/ARMCMx/crt0.c#L280-L359

tracernz commented 7 years ago

Did a little digging. The main thread (i.e. main), uses the "process stack" (switches to it at https://github.com/d-ronin/dRonin/blob/next/flight/PiOS/Common/Libraries/ChibiOS/os/ports/GCC/ARMCMx/crt0.c#L308), while interrupts use the "main stack". So the OS is running in the process stack... I guess it never gets too close to full.

Incidentally, this doesn't look correct (and should be a function of pios_sys IMO: https://github.com/d-ronin/dRonin/blob/next/flight/Modules/System/systemmod.c#L734). We can use ChibiOS pattern (0x55555555) and symbols (___main_stack_end etc.). And we don't check the process/main thread stack at all. Do we actually even check IRQ stack? :/ https://github.com/d-ronin/dRonin/search?utf8=%E2%9C%93&q=CHECK_IRQ_STACK&type=

tracernz commented 7 years ago

Deferring because there is plenty of unused area at the bottom of the process stack so nothing bad happens.

tracernz commented 6 years ago

This needs to either be fixed in #2025, or after that merges.

glowtape commented 6 years ago

Hmmm ChibiOS 17.x crt0 can specifically set the MSP. Would that do it?

https://github.com/glowtape/dRonin/blob/glowtape-chibios17/flight/PiOS/Common/Libraries/ChibiOS/os/common/startup/ARMCMx/compilers/GCC/crt0_v7m.S#L183

tracernz commented 6 years ago

Yes, that's perfect. :+1: