pimoroni / blinkt

Python Library for Blinkt; 8 APA102 LEDs for your Raspberry Pi
https://shop.pimoroni.com/products/blinkt
MIT License

Extremely erroneous/unreliable display under load #62

Closed by Moonbase59 6 years ago

Moonbase59 commented 6 years ago

I wanted to use Blinkt! on a Raspberry Pi 3B as a status indicator and have already spent a few days debugging. Now, by pure chance, I re-ran the examples, since I was at my wit’s end and there seems to be no error in my code.

I found that NONE of the examples run correctly! All of them randomly display wrong colors, light the wrong LEDs, or flash at full brightness IF the system is under even a little load (e.g., from running a Chromium instance that displays a local webpage).

I already eliminated all "the usual suspects", i.e.

Unfortunately, my smartphone isn’t great at taking videos, but I thought you should see it:

Output of 'examples/graph.py': http://kaufen-ist-toll.de/download/radio/20180521_011933_blinkt_graph.mp4
Output of 'examples/rainbow.py': http://kaufen-ist-toll.de/download/radio/20180520_232920_blinkt_rainbow.mp4
Output of 'examples/morse_code.py': http://kaufen-ist-toll.de/download/radio/20180520_233013_blinkt_morse_code.mp4

I really need some help here, please!

I suspect that either the Blinkt! library or the GPIO library is somehow time-critical or not thread-safe, because it works for minutes without a glitch when the system has nothing else to do. But when other processes generate a little more load (like the X server and the Chromium browser doing background animation for my "status buttons"), everything goes berserk. (The load average varies between about 0.4 and 1.5, depending on how many CSS-animated buttons Chromium has to display.)

If you are interested, here’s another video that shows a) the display (with only the "blue" button/LED blinking), b) a Zero W sitting next to it (seemingly ok), and c) down below the Pi 3B that drives both the display and its internal blinkt! (with a few glitches, like LEDs suddenly flashing 100% bright or in the wrong color): http://kaufen-ist-toll.de/download/radio/20180518_114005_blinkt_irregular.mp4

I can run the exact same (Python) software on a freshly installed Pi Zero W (but without X and without Chromium, because the Zero isn’t able to handle that load) on a Raspbian Lite image, seemingly without a glitch. When simply swapping the Zero’s SD card into the Pi 3B, it also seems to work (I suspect because there is almost zero load).


To help with diagnosis, here is a little more info:

The Blinkt! library is at version 0.1.1 and was installed using the bash script. RPi.GPIO is version 0.6.3.

Python shows:

Python 2.7.13 (default, Nov 24 2017, 17:33:09) 
[GCC 6.3.0 20170516] on linux2
$ lsb_release -a
No LSB modules are available.
Distributor ID: Raspbian
Description:    Raspbian GNU/Linux 9.4 (stretch)
Release:    9.4
Codename:   stretch
$ uname -a
Linux studiodisplay1 4.14.41-v7+ #1113 SMP Thu May 17 16:29:48 BST 2018 armv7l GNU/Linux
$ cat /boot/cmdline.txt 
dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=PARTUUID=932d315f-02 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait
$ cat /boot/config.txt 
# Enable audio (loads snd_bcm2835)
dtparam=audio=on
# enable raspicam
start_x=1
#gpu_mem=128
dtoverlay=vc4-kms-v3d

(only the entries not commented out)

Gadgetoid commented 6 years ago

I can't get anywhere near the level of distortion you're seeing, but I can cause the Blinkt! to glitch with high CPU load.

This is odd, because the APA102 protocol is basically just a shift register: it's a data/clock signal that's not at all timing-critical. It's almost as if the system is simply ignoring GPIO commands occasionally.

Even at 95+% CPU on a Pi 3B+ I can't get close to the distortion you're seeing. Still, a high CPU load shouldn't affect what's being output at all. Hmm.
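
To make the shift-register point concrete, here is a minimal sketch of how one APA102 update can be laid out as a byte stream. The function name `apa102_frame` is hypothetical and this is not the Blinkt! library's API. The layout follows the APA102 datasheet: a start frame of 32 zero bits, then for each LED a byte of `0b111` plus 5 bits of brightness followed by blue, green and red, then an end frame of 32 one bits.

```python
def apa102_frame(pixels, brightness=7):
    """Build the byte stream for a chain of APA102 LEDs.

    pixels: iterable of (r, g, b) tuples, each channel 0-255.
    brightness: global brightness, 0-31 (5 bits).
    Hypothetical helper for illustration only.
    """
    if not 0 <= brightness <= 31:
        raise ValueError("brightness must be in the range 0-31")

    frame = bytearray(4)  # start frame: 32 zero bits
    for r, g, b in pixels:
        frame.append(0b11100000 | brightness)         # marker + brightness
        frame.extend((b & 0xFF, g & 0xFF, r & 0xFF))  # note the B, G, R order
    frame.extend(b"\xff" * 4)  # end frame as given in the datasheet
    return bytes(frame)
```

Nothing in this stream depends on how fast it is sent. Each bit only has to be stable on the data line when the clock line rises, which is why a CPU-load dependency is surprising.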

Gadgetoid commented 6 years ago

Okay, it was an obvious problem.

GPIO.output(pin, state) does not guarantee that pin has been asserted to state when it completes. It's an atomic action that simply switches a bit in a register on the Pi and has nothing to do with the physical hardware.

Under heavy enough load it's apparently possible to set and clear the pin before the change has even propagated to hardware. This effect was compounded by a total lack of sleep. I'm not sure how system threading and Python applications play together, but the problem can be fixed with just one time.sleep(0.000001). I'll add a patch for this that you can try out.
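
The idea can be sketched roughly as follows. This is not the actual patch: the function name `write_byte`, the injected `set_pin` callable, the pin numbers and the exact placement of the pauses are all assumptions for illustration. On a Pi, `set_pin` could be RPi.GPIO's `GPIO.output` once both pins are configured as outputs.

```python
import time


def write_byte(byte, set_pin, dat=23, clk=24, settle=0.000001):
    """Shift one byte out MSB-first, pausing briefly after each clock edge.

    set_pin(pin, state) is a hypothetical callable that drives a GPIO pin.
    dat/clk default to the BCM pins Blinkt! is assumed to use.
    """
    for _ in range(8):
        set_pin(dat, 1 if byte & 0x80 else 0)  # present the next data bit
        set_pin(clk, 1)                        # rising edge shifts it in
        time.sleep(settle)                     # give the edge time to land
        set_pin(clk, 0)
        time.sleep(settle)
        byte = (byte << 1) & 0xFF              # move the next bit into place
```

Passing the pin writer in as a parameter keeps the sketch runnable off the Pi, for example with `lambda pin, state: print(pin, state)` in place of the real GPIO call.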

Gadgetoid commented 6 years ago

You should be able to try the latest version by:

git clone http://github.com/pimoroni/blinkt
cd blinkt/library
sudo python setup.py install

As a reminder to myself, I should apply this fix to Rainbow HAT too: https://github.com/pimoroni/rainbow-hat/blob/master/library/rainbowhat/apa102.py

Moonbase59 commented 6 years ago

Gee, thanks for the reply, I’ll try that tomorrow …

Moonbase59 commented 6 years ago

Reminder to @Gadgetoid: And maybe the Unicorn HAT & pHAT? :-) (Because the Unicorn pHAT is the next one I have lying around here …)

Moonbase59 commented 6 years ago

Feedback: First try — it seems to work. Unbelievable. Wonder if I can now refactor out all my furious threading Locks again … ;-)

I’m astonished but very grateful. Cheers a lot for helping, I was almost despairing here … :-)

Maybe you’re right. I seem to remember that a Python thread gives up control on a) sleep, b) I/O wait, or c) every so many bytecodes in Python 2, or every 5 ms in Python 3 …
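
For anyone who wants to check the thread-switching behaviour on their own system, here is a minimal Python 3 sketch. It prints the interpreter's switch interval (the CPython documentation gives 5 ms as the default) and shows that a sleeping main thread lets a background thread run. The names `counter` and `ticks` are made up for this example.

```python
import sys
import threading
import time

# Python 3.2+: how long a thread may keep running before it is asked to yield.
print("switch interval: %.3f s" % sys.getswitchinterval())

ticks = 0
stop = threading.Event()


def counter():
    """Background thread that only makes progress when it holds the GIL."""
    global ticks
    while not stop.is_set():
        ticks += 1


worker = threading.Thread(target=counter)
worker.start()
time.sleep(0.05)  # sleeping releases the GIL, so counter() runs meanwhile
stop.set()
worker.join()
print("counter ran %d times while the main thread slept" % ticks)
```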

Moonbase59 commented 6 years ago

Here’s proof: Blinkt working as intended: http://kaufen-ist-toll.de/download/radio/20180522_054355_blinkt_patched.mp4 Great job!

The Python stuff now eats 1.7 + 0.3% CPU ("top" values, including the extra thread). I’m still disappointed that Chromium uses up 47+13=60%, plus Xorg happily chugging away at another 91%, just for 2 small CSS3 background animations w/ 5 keyframes each … But that, of course, is an entirely different problem …

Many thanks again!

Moonbase59 commented 6 years ago

I think we can close this. Apart from running all the examples again, I wrote some testing routines today, and everything works great on both an RPi 3B and a Zero W under heavy load.

On the 3B, I have a Chromium running a (JS) MQTT client, a mosquitto MQTT server, 4 Python scripts (a phone call monitor, a radio stream reader, a weather interface and the signal handler for the blinkt!), plus my testing code that (besides other tests) fires MQTT messages at the signal handler every 100ms. (Same setup on the Zero, except for X server and Chromium.)

All works as expected now.

dglaude commented 6 years ago

I found this (one of my repositories): https://github.com/dglaude/blinkt-pigpio It is a pigpio version of the Blinkt! library, and all your code should continue to work with your Blinkt! hardware and this library. I would be interested to know how it behaves under heavy load, as pigpio might handle that differently. It also permits remote Blinkt! use, if you want to run your code on the Pi 3B and display on the Pi Zero.

Please let me know if it is useful and works better than the unpatched Blinkt! ;-)