adafruit / circuitpython

CircuitPython - a Python implementation for teaching coding with microcontrollers
https://circuitpython.org

M4 Express can deadlock on certain complex import chains #1283

Closed klardotsh closed 5 years ago

klardotsh commented 5 years ago

I don't have my board handy to provide a proper repro case right now, so I'll do what I can to describe the scenario until I can provide said repro case (and/or crack out my JLINK and just dive in):

Some context:

Also interestingly, updating the max stack size in boot.py does not fix this. Setting the value below 650 results in the RuntimeError. Setting it above 700 leaves the modules without enough heap space to actually compile (I assume), so they fail to import. Anywhere in between (if I recall correctly; this was a few days ago), I'd deadlock.

The project branch that triggered this is available here: https://github.com/KMKfw/kmk_firmware/tree/topic-planck-klaranck. In kmk/firmware.py I hack around this issue and things work. I believe removing the giant block at the top of the file (everything before the line "Thanks for sticking around. Now let's do real work, starting below") may repro one or both of the symptoms described above when trying to use user_keymaps/klardotsh/klarank_featherm4.py as main.py.

tannewt commented 5 years ago

I've seen a similar failure when the internal C code creates a stack that's bigger than the allocated stack space. It then writes into the heap, and the moment an overwritten object is referenced, it can cause a hard fault. This case may be different though.

dhalbert commented 5 years ago

We could add MP_STACK_CHECK() before the import code to see if we could catch this. This "manually" checks for the stack overflowing into the guard region, I believe.
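To make the idea of a manual stack check concrete, here is a minimal Python sketch of the general technique: compare current stack usage against a limit before descending another level, and raise instead of overflowing. It is a toy model, not the interpreter's C code; `StackModel`, `nested_import`, and the byte sizes are all hypothetical names and values chosen for illustration.

```python
class StackModel:
    """Toy model of a fixed-size stack (hypothetical, for illustration only)."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def check(self):
        # Fail loudly once usage has reached the limit, instead of
        # silently running past the end of the stack.
        if self.used_bytes >= self.limit_bytes:
            raise RuntimeError("stack limit reached")

    def push_frame(self, frame_bytes):
        self.used_bytes += frame_bytes

    def pop_frame(self, frame_bytes):
        self.used_bytes -= frame_bytes


def nested_import(stack, depth, frame_bytes=64):
    """Simulate an import chain `depth` modules deep; return the depth reached."""
    stack.check()                 # the manual check, done before going deeper
    stack.push_frame(frame_bytes)
    try:
        if depth > 1:
            return 1 + nested_import(stack, depth - 1, frame_bytes)
        return 1
    finally:
        stack.pop_frame(frame_bytes)


stack = StackModel(limit_bytes=256)
print(nested_import(stack, 4))    # fits: four 64-byte frames
try:
    nested_import(stack, 5)       # the fifth level trips the check
except RuntimeError as exc:
    print("caught:", exc)
```

The point of placing the check before the frame is pushed is that the failure surfaces as a catchable exception at a known location, rather than as heap corruption that only shows up later.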

dhalbert commented 5 years ago

@klardotsh Do you have a commit in your repo that provokes the problem? I'd like to test against it.

klardotsh commented 5 years ago

There's no commit for it explicitly (it got rebased away when I was squashing down my then-WIP branch). However, removing everything above line 37 in https://github.com/KMKfw/kmk_firmware/blob/master/kmk/firmware.py (master branch of KMKfw/kmk_firmware) should trigger at least the RuntimeError. I'm not sure whether it repros the deadlock (it may), and sadly I didn't end up with time this weekend to construct an independent repro example.

Flashing instructions for KMK are available at https://github.com/KMKfw/kmk_firmware/blob/master/docs/flashing.md. The process rsyncs over the kmk folder, a main.py, and one dependency from micropython-lib (the string standard library polyfill). Using USER_KEYMAP=user_keymaps/klardotsh/klarank_featherm4.py, you'll end up in the same state as what I use at home.

If that doesn't repro, I'll try to assemble a specific and shrunken-down repro example this week.

tannewt commented 5 years ago

I don't think this is an issue anymore because 1) we now check that the stack hasn't overwritten the heap and go into safe mode if it has, and 2) we can enter safe mode manually by clicking reset when the status NeoPixel is yellow.
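As a rough illustration of how a "has the stack overwritten the heap" check can work in general, the sketch below models one common approach: a known sentinel value placed at the lowest address the stack is allowed to use. This is an assumption about the general technique, not a description of the actual firmware code; `Memory`, `stack_ok`, `run_mode`, and the sentinel bytes are all hypothetical.

```python
SENTINEL = b"\xde\xad\xbe\xef"  # arbitrary illustrative value


class Memory:
    """Toy RAM: heap at low addresses, stack growing down from the top."""

    def __init__(self, size, stack_size):
        self.ram = bytearray(size)
        self.stack_limit = size - stack_size  # lowest address the stack may use
        self.sp = size                        # stack pointer starts at the top
        # Place the sentinel at the very bottom of the stack region.
        self.ram[self.stack_limit:self.stack_limit + 4] = SENTINEL

    def push(self, data):
        # Deliberately no bounds check against stack_limit, mimicking C code
        # that simply keeps writing downward.
        if len(data) > self.sp:
            raise MemoryError("ran off the bottom of RAM")
        self.sp -= len(data)
        self.ram[self.sp:self.sp + len(data)] = data

    def stack_ok(self):
        # The sentinel survives only if nothing wrote down to the stack limit.
        return bytes(self.ram[self.stack_limit:self.stack_limit + 4]) == SENTINEL


def run_mode(mem):
    """Pick the mode a supervisor might choose after checking the sentinel."""
    return "normal" if mem.stack_ok() else "safe mode"


mem = Memory(size=64, stack_size=16)
mem.push(b"\x01" * 8)    # stays inside the 16-byte stack region
print(run_mode(mem))
mem.push(b"\x01" * 12)   # runs past the limit, into the heap
print(run_mode(mem))
```

A sentinel check like this only detects the overflow after the fact, which is why the response is to drop into a restricted mode rather than to try to keep running on a possibly corrupted heap.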