Make stack painting fast again! 🇪🇺

Urhengulas commented 2 years ago

This PR implements the first of the improvements outlined in #258.

Fixes #258.

But what is "stack painting" anyways?

The idea is to write a specific byte pattern to (part of) the stack before the program is executed. After the program has finished, either because it is done with its task or because there was an error, we read out the previously painted area and check how much of it is still intact. Where the pattern is unchanged, we can be rather certain that the program didn't write to that part of the stack. This information tells us whether there was a stack overflow, or simply how much of the stack was used.
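To make the idea concrete, here is a minimal host-side sketch that simulates painting and measuring on a plain byte buffer. It is not the code from this PR: the `PAINT` value, the function names `paint` and `used_bytes`, and the assumption that the stack grows downwards from the end of the buffer are all chosen for illustration.

```rust
/// Paint pattern. Hypothetical value picked for this sketch.
const PAINT: u8 = 0xAA;

/// Fill the whole region with the paint pattern before the program runs.
fn paint(region: &mut [u8]) {
    region.fill(PAINT);
}

/// Estimate how many bytes of the region were used.
///
/// Assumes the stack grows downwards, i.e. from the end of `region`
/// towards index 0. The lowest byte that no longer holds the pattern
/// marks the deepest point the program wrote to.
fn used_bytes(region: &[u8]) -> usize {
    match region.iter().position(|&b| b != PAINT) {
        Some(lowest_touched) => region.len() - lowest_touched,
        None => 0, // pattern fully intact: nothing was written
    }
}

fn main() {
    let mut stack = [0u8; 1024];
    paint(&mut stack);

    // Simulate a program that wrote to the top 100 bytes of the stack.
    for byte in &mut stack[924..] {
        *byte = 0x01;
    }

    println!("used {} of {} bytes", used_bytes(&stack), stack.len());
}
```

Note that a program which happens to write the paint value itself cannot be told apart from untouched memory, which is why the result is only "rather certain".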

So far, both reading and writing the memory were done via the probe. While this works, it is also rather slow, because the host and probe communicate via USB, which takes time.

The new approach is to write a subroutine to the MCU, which paints the memory from within.
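As a rough illustration of what such a subroutine does, the loop below paints a memory range word by word. This is only a sketch of the logic, written in Rust and exercised on a host-side buffer. The subroutine in this PR runs on the target and may look quite different. The name `paint_words` and the pattern value are assumptions.

```rust
use core::ptr;

/// Write `pattern` to every 32-bit word in `start..end`.
///
/// # Safety
/// `start..end` must be a valid, writable, word-aligned memory range.
unsafe fn paint_words(mut start: *mut u32, end: *mut u32, pattern: u32) {
    while start < end {
        unsafe {
            // Volatile write so the store is not optimised away.
            ptr::write_volatile(start, pattern);
            start = start.add(1);
        }
    }
}

fn main() {
    // Stand-in for the stack region. On the MCU this would be a RAM range.
    let mut buf = vec![0u32; 256];
    let range = buf.as_mut_ptr_range();

    unsafe { paint_words(range.start, range.end, 0xAAAA_AAAA) };

    assert!(buf.iter().all(|&word| word == 0xAAAA_AAAA));
}
```

Because the loop runs on the core itself, the cost per word is a store instruction rather than a round trip between host and probe.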

Measurements

The following table shows how much time the old and the new approach take for memory sizes from 8 to 256 KiB.

data

The results are pretty impressive. The new approach is about 170 times faster!

Further work

A similar approach can also be applied to reading out the stack after the program has finished.
Additionally, the stack canary can be simplified quite a lot. So far we have not painted the whole stack unless the user asked for it, because this was slow. Since it is fast now, we can always paint all of it, which simplifies the code and removes the need for the --measure-stack flag.

Urhengulas commented 2 years ago

The error messages can probably also be improved, but I'd like to do this in a follow-up PR when reworking the canary (see "Additionally the stack canary can be simplified quite a lot. [...]").

Urhengulas commented 2 years ago

bors r+

Urhengulas commented 2 years ago

bors cancel

bors[bot] commented 2 years ago

Canceled.

Urhengulas commented 2 years ago

bors r=jonathanpallant

bors[bot] commented 2 years ago

Build succeeded:

ci

knurling-rs / probe-run