Closed Urhengulas closed 1 year ago
an unmangled symbol named Reset
is even less common than main
(which at least is common in C programs). neither symbol is specified in the ABI of the cortex-m-rt crate so relying on either could lead to breakage in the future ("future" being any new patch version, i.e. 0.7.x).
on the other hand, _stack_start
is part of the ABI of the cortex-m-rt crate so this idea may work at least for cortex-m-rt v0.7.x but the ABI could change in 0.8.0.
another way to figure out the address of the reset handler is to read the second entry in the vector table. this document states:
On system reset, the vector table is fixed at address
0x00000000
.
thought I'm not sure that is specified in the Cortex-M ISA. I know STM32 devices use 0x2000_0000
as the start address of Flash memory so firmware put the vector table at that address, instead of at 0x0000_0000
. It could be that the Flash memory at 0x2000_0000
is aliased at address 0x0000_0000
but I have not checked.
yet another way to get the address of the reset handler is to issue a reset-halt and read out the program counter (PC) register. I don't have a link to normative documentation but the Cortex-M ISA states that the program counter is set to the second entry of the vector table (the address of the reset handler) after a system reset. I think this runtime approach would be the sure way to get the address. after a reset-halt you could read the SP register and that would give you _stack_start
as well.
Thank you for the answer @japaric. A few clarification questions:
_stack_start
idea James and checking if we end up in the reset handler? Or are you suggesting that we do one or the other?🥂
I've switched to James proposal of ending the unwinding depending on the _stack_start
and that seems to work as well (see https://github.com/knurling-rs/probe-run/pull/383/commits/3e7fe2d7d693baff4f6597b4a245a164c6566289).
Atm I extract the stack start by reading the SP after a reset (as you've proposed). It seems to me that _stack_start
is the same as the start of the active_ram_region
we extract from the ELF. Is this just a coincidence, or always the case? If they are same we can just do one and not both.
Are you suggesting doing both the _stack_start idea James and checking if we end up in the reset handler? Or are you suggesting that we do one or the other?
I think it'd be good to check the stack pointer (SP) value against _stack_start
during each step of the unwinding process. if the SP value ever goes higher than _stack_start
then either the stack is corrupted or there's a bug in the unwinding logic; in either case we should stop the unwinding process.
Thinking more about it, I think it may be better not to use _stack_start == SP
as the stop condition. reason being, the reset handler could adjust the stack pointer (e.g. to make it multiple of 4 or 8 bytes so that the next stack frame adheres to e.g. the AAPCS ABI) before jumping into the next function. the reset handler itself usually does not follow any ABI (as it's written in assembly) and could lack both a function prologue and enough CFI (call frame information) to let the unwinder compute the value of SP at the start of the reset handler so it may not be possible (for the unwinder) to compute that SP was equal to _stack_start
at some point in time.
But as far as I understand it only gives me the address of the reset handler, not the size. In order to check if we end up inside of it I need both (as far as I understand)
correct.
. How can I get the size?
you should be able to iterate the symbol table to find the size (the API may be called symbols
or symbol_table
depending on whether you are using the xmas_elf
or object
library). each symbol entry has an associated start address and size (+) so you can compute the address range each symbol spans (start .. (start+size)
). you can then use reset_handler_address_span.contains(current_pc)
as the stop condition.
(+) should look bit like nm -CSn
if you print the data
$ # first column is start address; second is size; numbers are in hexadecimal format
$ arm-none-eabi-nm -CSn some-elf
(..)
00000100 00000028 T Reset
(..)
0000013c 0000000a T main
00000148 00000028 t hello::__cortex_m_rt_main
00000170 0000000a T app::exit
you asked earlier:
Does any thumb bit need to be cleared?
you can pretty much always clear the thumb bit (bit 0) when comparing addresses. the start address of a function (as seen in the objdump
output) must always be an even number; the value of PC will always be an even number.
values that eventually end up in an instructions like bl r0
("branch into function whose address is stored in register R0") have to have the thumb bit set. as those values (effectively function pointer values) are stored in static variables, they are usually stored with the bit already set to avoid setting it at runtime. when you read the second entry of the vector table (either at runtime or from the rodata stored in the ELF) you'll get an odd number.
@japaric said:
Thinking more about it, I think it may be better not to use
_stack_start == SP
as the stop condition.
What then? That unwinding ends in the reset handler? This is what I implemented in 9f40539e2d537c49ea22eb1ef7cae5f6d50b4fd6.
If this is what you mean we would have:
!cfa_changed && !program_counter_changed
) in the reset handlerSP > _stack_start
Correct?
@japaric said:
@Urhengulas said:
How can I get the size?
you should be able to iterate the symbol table to find the size
Didn't you say earlier that we cannot rely on the symbol table? I talking about following quote.
@japaric said:
an unmangled symbol named
Reset
is even less common thanmain
(which at least is common in C programs). neither symbol is specified in the ABI of the cortex-m-rt crate
Or am I misunderstanding something? Can we figure out the mangled name of the symbol?
@japaric The PR is ready for review again. I think it makes the code oven messier than it already is, and I am working on a follow-up PR to clean things up. But I wouldn't do this in here.
bors r=japaric
Build succeeded:
This PR "fixes" unwinding, by not relying on the sentinel stack frame anymore, but instead checking if unwinding ends in the
Reset
function. If unwinding ends in theReset
function, it is considered NOT corrupted; if it ends somewhere else, it is considered corrupted.Fixes #382
Todo:
Is the-> No, "an unmangled symbol namedReset
symbol always present?Reset
is even less common than main"Does any thumb bit need to be cleared?-> noTest it with an corrupted stack-> not trivially possible from Rust