adafruit / circuitpython

CircuitPython - a Python implementation for teaching coding with microcontrollers
https://circuitpython.org

watchdog stalling code execution on samd51 #5757

Open bludin opened 2 years ago

bludin commented 2 years ago

CircuitPython version

Adafruit CircuitPython 7.1.0-beta.1-27-g013e688c9-dirty on 2021-12-01; lis_43271a_onyx with samd51j19

Code/REPL

WATCHDOG_ENABLED = True
...
from watchdog import WatchDogMode
from microcontroller import watchdog
...
if WATCHDOG_ENABLED:
    watchdog.timeout = 16
    watchdog.mode = WatchDogMode.RESET
    ...
while True:
    if WATCHDOG_ENABLED:
        watchdog.feed()
    ...

Behavior

This is a preliminary report only, in case somebody else comes across something similar. I have a rather complex program that runs flawlessly. However, when I activate the hardware watchdog, it sporadically stalls (no more console output, no reaction to buttons, no display refresh, etc.), anywhere between seconds and days after starting up. It does react to a keyboard interrupt (ctrl-c) in the console, though. Surprisingly, the watchdog doesn't reset the CPU while the program is stalled. It does reset it, however, within the set timeout period after a keyboard interrupt. The keyboard interrupt during a stall stops the program at various positions but, as far as I can see, always at a line that calls ticks_ms().

Has anybody observed something along these lines, too?

Description

No response

Additional information

No response

tannewt commented 2 years ago

I haven't seen this before. ticks_ms does its locking by disabling interrupts, which may have something to do with this. It shouldn't block a watchdog timeout though.

bludin commented 2 years ago

If ticks_ms were to stall with interrupts disabled, would ctrl-c still work? Anyway, I have replaced ticks_ms with a monotonic_ns-based equivalent. No stalls so far, but it's still too early to say...

jepler commented 2 years ago

that's odd, ticks_ms and monotonic_ns both use port_get_raw_ticks, though ticks_ms passes NULL for subticks. That doesn't change anything about locking on atmel-samd (the subticks-is-NULL case just skips over a little bit of arithmetic).

tannewt commented 2 years ago

If ticks_ms were to stall with interrupts disabled, would ctrl-c still work?

No, it wouldn't because the incoming serial is checked for ctrl-c in an interrupt.

If you can catch it stalled, then you can use GDB to see where it is (and hopefully figure out why.)

bludin commented 2 years ago

that's odd, ticks_ms and monotonic_ns both use ...

The link to ticks_ms() isn't corroborated. I only had 5 or 6 stalls in total and for the first two, I didn't even try a keyboard interrupt. I know for the last two that the line contained ticks_ms and I seem to remember this was true for the ones before, too, but I'm not sure. No stalls since I replaced ticks_ms with monotonic_ns, but that might well be luck. And ticks_ms is used in waiting loops, so these lines get called fairly often.
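
To illustrate the kind of waiting loop meant here, a rough sketch follows. It is not the code from the affected program: `wait_until` and its parameters are hypothetical, and `feed` stands for something like `watchdog.feed` from the snippet in the report.

```python
import time


def wait_until(condition, timeout_ms, feed=None, poll_s=0.01):
    """Poll condition() until it returns True or timeout_ms has elapsed.

    feed is an optional callable (for example watchdog.feed) that is
    invoked on every pass so the watchdog does not expire while waiting.
    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic_ns() + timeout_ms * 1000000
    while True:
        if feed is not None:
            feed()
        if condition():
            return True
        # The time source is read on every pass, which is why such
        # lines execute very frequently in a program full of waits.
        if time.monotonic_ns() >= deadline:
            return False
        time.sleep(poll_s)
```

On the board this would be called as, say, `wait_until(lambda: not button.value, 5000, feed=watchdog.feed)`, where `button` is again only a placeholder.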

As I said, I placed a bug report mainly to see if somebody else had made similar observations. The problem is so sporadic that it's hard to get a grip on it...