rustbox / esp32c3-vgaterm

VGA Driver and Serial Terminal crates, with embedded applications in mind
MIT License

perf: mem{cpy,set,move}, relink _start_trap, LTO #33

sethp closed 1 year ago

sethp commented 1 year ago

memcpy & friends previously lived in the flash range, so we were getting really inconsistent performance out of them. That was especially impactful because they're called everywhere and were very large, so they would thrash the heck out of the cache.

These straightforward, unoptimized implementations are now memory-resident (and, bonus, small enough that LLVM/LTO can eliminate many of the calls entirely), which has been a huge boon to the consistency of performance.
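For illustration, a minimal sketch of what "straightforward, unoptimized" byte-at-a-time routines could look like. This is not the code from this PR: the `simple_*` names are hypothetical, and the `.rwtext` section name is an assumption about how the linker script labels RAM-resident code. The attribute is gated on `target_arch = "riscv32"` so the sketch also builds on a host for testing.

```rust
// Sketch only. `.rwtext` is ASSUMED to be the RAM code section in the
// linker script; the `simple_*` names are hypothetical, chosen so they
// do not clash with the real `memcpy`/`memset`/`memmove` symbols.

/// Copy `n` bytes from `src` to `dest`. The regions must not overlap.
#[cfg_attr(target_arch = "riscv32", link_section = ".rwtext")]
pub unsafe fn simple_memcpy(dest: *mut u8, src: *const u8, n: usize) -> *mut u8 {
    let mut i = 0;
    while i < n {
        unsafe { *dest.add(i) = *src.add(i) };
        i += 1;
    }
    dest
}

/// Fill `n` bytes starting at `dest` with `value`.
#[cfg_attr(target_arch = "riscv32", link_section = ".rwtext")]
pub unsafe fn simple_memset(dest: *mut u8, value: u8, n: usize) -> *mut u8 {
    let mut i = 0;
    while i < n {
        unsafe { *dest.add(i) = value };
        i += 1;
    }
    dest
}

/// Copy `n` bytes from `src` to `dest`. The regions may overlap.
#[cfg_attr(target_arch = "riscv32", link_section = ".rwtext")]
pub unsafe fn simple_memmove(dest: *mut u8, src: *const u8, n: usize) -> *mut u8 {
    if (dest as usize) <= (src as usize) {
        // Destination starts at or before the source: copy front to back.
        let mut i = 0;
        while i < n {
            unsafe { *dest.add(i) = *src.add(i) };
            i += 1;
        }
    } else {
        // Destination starts after the source: copy back to front so that
        // source bytes are read before they are overwritten.
        let mut i = n;
        while i > 0 {
            i -= 1;
            unsafe { *dest.add(i) = *src.add(i) };
        }
    }
    dest
}
```

One caveat if routines like these are exported under the real symbol names: LLVM can recognize a copy loop and lower it back into a call to `memcpy`, which would make the function call itself, so the real thing needs more care than this sketch shows.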

On the minus side, I seem to have blown the carefully tuned timing for frame chunks. Much more worrying, adding "too much" (or the wrong amount of?) code causes us to crash hard on startup. Just, don't change things?