This PR speeds up stack usage measurement: instead of reading the memory through the probe, it runs a subroutine on the target that searches through the memory.
Fixes #258.
Measurements
The speedup is comparable to the one in #302:
| canary size | main (bcaf997cedc672ab98dc8feede19a1ca4326be05) | this PR (3ad3f86b37b56192fa6b45613ee9b9ec7b837082) |
| --- | --- | --- |
| 1024 B | 0.007s | 0.014s |
| 261060 B | 1.912s | 0.028s |
It makes sense that the time gets worse (0.007s to 0.014s) for a small canary of 1024 bytes, since the subroutine approach has a fixed setup cost (flash the subroutine, set registers, set and reset the program counter, etc.). That cost pays off for large canaries: for a canary of 261060 B (about 255 KiB), this PR is almost 70 times faster (1.912s vs. 0.028s).
Further work
This PR enables us to drastically simplify the canary logic: since both painting and measuring are now fast, we can always paint the full stack.
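
To illustrate the simplified logic, here is a minimal sketch of "paint the full stack, then search for the first touched byte". It is modeled on a host-side byte buffer, not on target memory, and it is not the subroutine from this PR. The names `CANARY`, `paint` and `measure`, the `0xAA` pattern, and the downward-growing stack layout are all assumptions made for the example.

```rust
/// Hypothetical canary byte; the pattern actually used may differ.
const CANARY: u8 = 0xAA;

/// Fill the whole (simulated) stack region with the canary pattern.
fn paint(stack: &mut [u8]) {
    stack.fill(CANARY);
}

/// Return the number of stack bytes that were used.
///
/// Assumes the stack grows downward, from the end of the slice toward
/// index 0, so the first non-canary byte marks the deepest write.
fn measure(stack: &[u8]) -> usize {
    match stack.iter().position(|&b| b != CANARY) {
        Some(first_touched) => stack.len() - first_touched,
        None => 0, // canary fully intact: nothing was used
    }
}

fn main() {
    let mut stack = vec![0u8; 1024];
    paint(&mut stack);

    // Simulate a program that wrote to the top 100 bytes of the stack.
    for b in &mut stack[924..] {
        *b = 0x00;
    }

    println!("stack usage: {} B", measure(&stack));
}
```

One known limitation of any scheme like this: a program that happens to write bytes equal to the canary value at its deepest point is under-counted.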