adafruit / circuitpython

CircuitPython - a Python implementation for teaching coding with microcontrollers
https://circuitpython.org
Other
4.11k stars 1.22k forks source link

Problems with _pew on SAMD21 #3504

Closed deshipu closed 4 years ago

deshipu commented 4 years ago

I have been receiving some reports that the gamepad library and the _pew library don't work reliable on SAMD21 on 6.x. I suspect it's related to changes around the system tick. I didn't have time to investigate this properly yet, but I'm creating this issue to let everyone know that there is a potential problem.

deshipu commented 4 years ago

I'm looking into it now, and it seems like the timer is getting triggered in a very uneven way, which results in flickering.

https://youtu.be/bLqYwwU1rS0

deshipu commented 4 years ago

I connected one of the row pins to a logic analyzer. It should go high for one cycle, and they stay low for seven cycles, at regular intervals. Instead I got this: pin

deshipu commented 4 years ago

I'm stupid, this was a wrong column to check, I will check again with the proper measurement. Still, it looks to be more irregular than it should.

deshipu commented 4 years ago

I did this properly now. I made two analog captures from the R5 pin, which is a row pin. The first one is running 5.3: a

The second one is with 6.0-beta-1: b

You can tell the frequency is much lower and much more uneven. I can send the captures on request, but they are about 100MB, a bit too big to attach here.

deshipu commented 4 years ago

Could this be caused by changes in interrupt priorities?

tannewt commented 4 years ago

What code is setting the pin? Is the pulse width how long the interrupt is occurring? We have to fake the once a ms timing so our math could be bad.

https://github.com/adafruit/circuitpython/blob/main/ports/atmel-samd/supervisor/port.c#L493

deshipu commented 4 years ago

It's an interrupt driven by the GCLK0 timer, the timer setup is here: https://github.com/adafruit/circuitpython/blob/main/ports/atmel-samd/common-hal/_pew/PewPew.c#L97-L115 and the interrupt code is here: https://github.com/adafruit/circuitpython/blob/main/ports/atmel-samd/common-hal/_pew/__init__.c#L36

The interrupt should be quite fast, all it does is changing values of 10 outputs GPIOs and checking the value of one GPIO.

deshipu commented 4 years ago

Note that the frequency being lower is not a problem, but the fact that it is inconsistent causes flickering.

tannewt commented 4 years ago

@deshipu Do you have anything else running concurrently (maybe gamepad). I think the RTC has a slow re-sync process on the SAMD21 which could be causing trouble.

deshipu commented 4 years ago

No, _pew has its own key scanning built in.

tannewt commented 4 years ago

Hrm, I'm not sure what the issue is then. It'll take more digging. Maybe @DavePutz will have a guess.

cwalther commented 4 years ago

I noticed something: When USB is not connected, the flickering almost disappears while othello is thinking and using 100% CPU without ever calling pew.tick(). It also gets less pronounced in maze3d when, judging from the frame rate, it maxes out the CPU and pew.tick(), while regularly called, never needs to time.sleep(). That feels suspiciously like interference from some power-saving feature inside time.sleep(). I haven’t checked the code to see if such a thing exists.

I don’t know why the USB connection makes a difference – the flickering doesn’t start or stop exactly when USB is connected or disconnected, and neither does it seem to depend only on whether USB was connected at boot. It’s not the power supply – battery power, feeding 3.3V into the battery compartment from a lab power supply, or a USB charger without data connection all behaves the same.

deshipu commented 4 years ago

I still didn't do the experiments I mentioned on the meeting (checking PWM and checking just toggling a single pin), but I stumbled upon this code, and I'm tempted to see if commenting it out will change anything: https://github.com/adafruit/circuitpython/blob/main/ports/atmel-samd/supervisor/port.c#L185

Update: it doesn't.

tannewt commented 4 years ago

@cwalther On the SAMD21 you'll see the main clock change when USB is connected because it performs clock recovery from USB. Off USB I've seen the 48mhz be something like 46mhz. The USB interrupt could also be taking time.

tannewt commented 4 years ago

Could you provide a minimal example that only uses the native classes? I'm not sure how to reproduce this.

deshipu commented 4 years ago

This should display three pixels in different colors:


import board
import digitalio
import _pew

rows = (
        digitalio.DigitalInOut(board._R1),
        digitalio.DigitalInOut(board._R2),
        digitalio.DigitalInOut(board._R3),
        digitalio.DigitalInOut(board._R4),
        digitalio.DigitalInOut(board._R5),
        digitalio.DigitalInOut(board._R6),
        digitalio.DigitalInOut(board._R7),
        digitalio.DigitalInOut(board._R8),
 )
cols = (
        digitalio.DigitalInOut(board._C1),
        digitalio.DigitalInOut(board._C2),
        digitalio.DigitalInOut(board._C3),
        digitalio.DigitalInOut(board._C4),
        digitalio.DigitalInOut(board._C5),
        digitalio.DigitalInOut(board._C6),
        digitalio.DigitalInOut(board._C7),
        digitalio.DigitalInOut(board._C8),
 )
buttons = digitalio.DigitalInOut(board._BUTTONS)
buffer = bytearray(8 * 8)
_pew.PewPew(buffer, rows, cols, buttons)

buffer[0] = 3
buffer[1] = 2
buffer[2] = 1

while True:
    pass
deshipu commented 4 years ago

I'm going to try and figure it out over this weekend.

deshipu commented 4 years ago

I have some progress. The above example works perfectly fine, and there is no visible flickering or anything. A version with more pixels lit:

for i in range(16):
    buffer[i] = 3
    buffer[16+i] = 2
    buffer[32+i] = 1

also works perfectly fine. However, if you replace the pass in the while loop with time.sleep(1), you get the flickering. So far so good.

The interesting thing is, you also get the flickering if you use time.monotonic()! That is, something like this:

while True:
    time.monotonic()

Same with time.monotonic_ns(), I'm afraid. (edit: actually that didn't run, because no longint)

So now I'm wondering, I could add independent time-keeping to the _pew module, by having it count the ticks of the timer it uses anyways, and then use that instead of time.monotonic() in the pew.tick(). That would make it work reliably independently of any sleep-related shenanigans. Any other ideas?

cwalther commented 4 years ago

Until we get to the bottom of the issue, that seems more like a workaround than a fix. But I guess that’s better than nothing. Also, programs that use time.sleep() themselves would still get the flickering, such as networksetup.py (which, granted, doesn’t run on CircuitPython).

I’m wondering whether a git bisect would bring some insight, but haven’t spent the effort myself so far.

deshipu commented 4 years ago

We do have a reasonable suspicion that it is related to how the RTC timer is used for time keeping on the SAMD21 with the new sleep code. It is a hack that was necessary due to missing features of RTC (and that is why it's fine on SAMD51, say, not that there are any PewPew devices with a LED matrix with SAMD51). I think that this workaround may be a good idea, because it makes PewPew independent of any future changes around timers and sleep modes, and thus one less thing to worry about.

tannewt commented 4 years ago

I suspect this is the issue: https://github.com/adafruit/circuitpython/blob/main/supervisor/shared/tick.c#L128-L130 It turns off interrupts at a fairly high level. You can comment these interrupt things out temporarily to test. We'll want to push them lower into the critical section of the RTC read instead of here.

This code was there before but I suspect reading SysTick is much faster than reading the RTC.