Klipper3d / klipper

Klipper is a 3d-printer firmware
GNU General Public License v3.0

Load all code into ram on the rp2040 #6464

KevinOConnor closed 5 months ago

KevinOConnor commented 5 months ago

There have been a few reports of instability on the rp2040 in code with strict timing requirements (e.g., neopixels and tmcuart). It is possible that rp2040 flash accesses are causing some of this timing instability. Typically, all the active parts of the code would be fully loaded into the cache, and thus timing should not vary. However, it is possible that the initial cache load or sporadic cache misses could lead to an issue.

This PR loads all of the code (along with the static data and irq vector table) into ram on the rp2040. With this PR, there should be few (if any) flash accesses after the initial code initialization.

Loading the rp2040 code into ram was previously proposed in PR #5889.

-Kevin

KevinOConnor commented 5 months ago

FYI, I've done a little more investigation into this.

With the rp2040 running at 125MHz, a 32-bit flash read takes a minimum of 320ns (20 flash cycles, 2 cpu cycles per flash cycle). If Klipper is compiled to use "GENERIC_03H with CLKDIV 4" flash chips, then each 32-bit flash read takes a little over 2us (64 flash cycles, 4 cpu cycles per flash cycle).
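
For anyone wanting to check the arithmetic, here is a minimal sketch in Python. It only multiplies out the cycle counts quoted above; the function name `flash_read_ns` is made up for illustration and is not part of Klipper.

```python
CPU_HZ = 125_000_000  # rp2040 system clock quoted above


def flash_read_ns(flash_cycles, cpu_cycles_per_flash_cycle, cpu_hz=CPU_HZ):
    """Time in nanoseconds for one 32-bit flash read.

    Hypothetical helper: converts flash cycles to cpu cycles, then to time.
    """
    cpu_cycles = flash_cycles * cpu_cycles_per_flash_cycle
    return cpu_cycles * 1e9 / cpu_hz


# Default case: 20 flash cycles at 2 cpu cycles each (40 cpu cycles)
fast_ns = flash_read_ns(20, 2)
# "GENERIC_03H with CLKDIV 4": 64 flash cycles at 4 cpu cycles each (256 cpu cycles)
slow_ns = flash_read_ns(64, 4)

print(f"fast: {fast_ns:.0f}ns, slow: {slow_ns:.0f}ns")
```

At 8ns per cpu cycle, this comes to 40 cpu cycles for the default case and 256 cpu cycles for the CLKDIV 4 case, which is where the 320ns and "a little over 2us" figures come from.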

Although the active parts of Klipper are likely smaller than the 16KiB cache on the rp2040, that cache is only 2-way set associative. Given that Klipper now compiles to around 32KiB, every cache line could alias to 4 different parts of the Klipper code/rodata. If the compiler happens to lay out the code/rodata in such a way that any 3 of the active parts of Klipper map to the same cache line, then it would likely result in cache evictions and subsequent flash accesses.
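
The aliasing argument can be illustrated with a rough model in Python. This is a sketch under assumptions, not a description of the actual hardware indexing: it assumes the set index is a simple modulo of the flash offset, and the 8-byte line size is an assumption (the count of 4 aliases does not depend on it). All names here are hypothetical.

```python
from collections import Counter

CACHE_SIZE = 16 * 1024   # 16KiB cache, as stated above
WAYS = 2                 # 2-way set associative, as stated above
LINE_SIZE = 8            # ASSUMPTION: cache line size in bytes
IMAGE_SIZE = 32 * 1024   # approximate Klipper code/rodata size, as stated above

WAY_SIZE = CACHE_SIZE // WAYS      # bytes covered by one way (8KiB)
NUM_SETS = WAY_SIZE // LINE_SIZE   # number of cache sets


def cache_set(offset):
    """Set index for a flash offset (ASSUMPTION: plain modulo indexing)."""
    return (offset // LINE_SIZE) % NUM_SETS


def aliases_per_set(image_size=IMAGE_SIZE):
    """How many distinct parts of the image compete for each set."""
    return image_size // WAY_SIZE


def has_conflict(offsets):
    """True if more distinct lines map to one set than there are ways."""
    lines = {offset // LINE_SIZE for offset in offsets}
    per_set = Counter(line % NUM_SETS for line in lines)
    return any(count > WAYS for count in per_set.values())


print(aliases_per_set())                        # image regions per set
print(has_conflict([0x0100, 0x2100]))           # two hot spots, 8KiB apart
print(has_conflict([0x0100, 0x2100, 0x4100]))   # three hot spots, 8KiB apart
```

In this model, two hot code locations spaced 8KiB apart fit in the two ways of a set, while a third location at the same spacing forces an eviction, which matches the "any 3 of the active parts" condition described above.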

The flash delays are significant enough to notably alter Klipper's timing. Thus this change does seem necessary.

-Kevin