Slow bitbanging with python 3.5+

ali1234 commented 5 years ago

In the Blinkt core library time.sleep(0.0000005) is used to time GPIO changes.

The minimum amount of time that time.sleep(n) can actually sleep under Linux, for n > 0.000001, is about 60 microseconds on a fast desktop machine, or about 100 microseconds on a Pi Zero. If you ask for time.sleep(0) then the function essentially becomes a no-op, which takes about 0.5 usec on desktop and about 14 usec on Pi Zero.
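A rough way to reproduce figures like these is to time a large number of calls and take the average. This is only a sketch of one possible methodology, not the script that produced the numbers above; `measure_sleep` is a hypothetical helper, and `time.perf_counter` needs Python 3.3+ (under Python 2 you would have to substitute `time.time`).

```python
import time

def measure_sleep(duration, iterations=1000):
    """Return the average wall-clock cost of time.sleep(duration), in microseconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        time.sleep(duration)
    elapsed = time.perf_counter() - start
    return elapsed / iterations * 1e6

if __name__ == "__main__":
    # 0 is the no-op case, 0.0000005 is the value used by the library,
    # 0.000001 is a request of exactly 1 usec.
    for duration in (0, 0.0000005, 0.000001):
        cost = measure_sleep(duration)
        print("time.sleep(%r): %.2f usec per call" % (duration, cost))
```

The results depend heavily on the machine, kernel and Python version, so treat any single run as indicative only.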

However, if you ask for time.sleep(n) with 0 < n < 0.000001 (i.e. less than 1 usec but more than 0) then the behaviour has changed in Python 3.5+. In earlier versions, this would be treated as 0 and you get the fast no-op sleep. In 3.5+ you get the slower minimum real sleep.

So in practice time.sleep(0.0000005) runs about 7x slower on Python 3.5+ than on previous versions. Under Python 2 it was already sleeping for up to 28x longer than you asked for, and now under Python 3.5+ it is sleeping for a total of 200x longer. This means it takes on the order of 0.1 seconds to bit bang all the pixels. This causes the examples to run at half speed because they have time.sleep(0.1) in the main loop, and in my testing the larson.py example does not work at all due to this issue.
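Those ratios follow directly from the Pi Zero per-call costs quoted earlier. As a quick check of the arithmetic (the variable names are just for illustration):

```python
# Per-call costs on the Pi Zero as quoted above, in microseconds.
requested = 0.5      # what time.sleep(0.0000005) asks for
noop_sleep = 14.0    # before 3.5: treated like time.sleep(0)
real_sleep = 100.0   # 3.5+: minimum real sleep

print(noop_sleep / requested)             # 28.0  (already 28x too long)
print(real_sleep / requested)             # 200.0 (now 200x too long)
print(round(real_sleep / noop_sleep, 1))  # 7.1   (about 7x slower than before)
```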

You can get the same behaviour as Python 2 by simply changing to time.sleep(0) or calling some other no-op/busy loop. Perhaps you don't even need the sleeps at all. Switching to the kernel SPI as suggested on #65 would also solve the problem.
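To make the suggestion concrete, here is a minimal sketch of a bit-bang loop with the delay pulled out into a parameter, so that time.sleep(0), a busy loop, or no delay at all can be swapped in. It is not the Blinkt library's actual code: `write_byte`, the `output` callback and the pin numbers are all assumptions for illustration, and `output` stands in for whatever GPIO write function is in use.

```python
import time

def write_byte(byte, output, dat_pin, clk_pin, delay=lambda: time.sleep(0)):
    """Clock one byte out MSB first, calling delay() after each clock edge."""
    for _ in range(8):
        output(dat_pin, (byte >> 7) & 1)   # present the current top bit
        output(clk_pin, 1)                 # rising clock edge
        delay()
        byte = (byte << 1) & 0xFF          # move the next bit into position
        output(clk_pin, 0)                 # falling clock edge
        delay()

if __name__ == "__main__":
    # Record pin writes instead of touching real hardware.
    log = []
    write_byte(0b10100000, lambda pin, value: log.append((pin, value)), 23, 24)
    print(log[:6])
```

Passing `delay=lambda: None` removes the sleeps entirely, which makes it easy to test whether they are needed at all.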

ali1234 commented 5 years ago

Also note that this problem probably affects a bunch of other Pimoroni libs...

Gadgetoid commented 5 years ago

Thank you for detailing this. I suspect if you'd come to me with a surface-level symptom and not this in-depth analysis I'd have spent a whole day scratching my head. It's appreciated!

Interestingly, the sleeps were only added relatively recently, here: https://github.com/pimoroni/blinkt/commit/7a92169bd0b859269e38ec50a001a4f84f027a91

In response to this issue: https://github.com/pimoroni/blinkt/issues/62

It appeared that under certain conditions it was possible to set and clear the GPIO pins before the change had propagated to hardware. The short sleep ensured a pin state change actually resulted in a physical voltage level change that was picked up by the APA102s.

In this case a switch to SPI would, indeed, fix the problem but the non-standard pins used mean using a dtoverlay and adding a lot of complexity that I'd really like to avoid in what's intended to be a simple add-on board for beginners.

From what I understand, time.sleep(0) might be the right approach in this case, since a NOP is effectively what I'm going for in a general sense. A NOP is far more specific on a microprocessor than it is on any Pi, but going by your times for the Pi Zero this would result in a total delay of approximately 7.28 milliseconds. That is still pretty slow given that ~20us should be sufficient to update the whole display, but if I don't have a more granular way of specifying delays reliably to avoid #62 then it'll have to do.

This same issue will apply to Rainbow HAT at the very least: https://github.com/pimoroni/rainbow-hat/commit/860b330cfe04c77b3661c29a941207bc91bd6cd3
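One candidate for a more granular delay is the busy loop mentioned earlier in the thread: spin on a high-resolution clock instead of asking the kernel to sleep. The sketch below is an assumption about how that might look, not something tested against #62; `busy_wait` is a hypothetical name and `time.perf_counter` requires Python 3.3+.

```python
import time

def busy_wait(seconds):
    """Spin until at least `seconds` have elapsed, without yielding to the scheduler."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass

if __name__ == "__main__":
    start = time.perf_counter()
    busy_wait(0.0000005)
    print("waited %.2f usec" % ((time.perf_counter() - start) * 1e6))
```

The cost of reading the clock sets a floor on the shortest achievable delay, so on a Pi Zero this may not get anywhere near 0.5 usec either, and it burns CPU for the whole wait.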

ali1234 commented 5 years ago

I am now having trouble getting it to work at all with any delay on any python version. Oddly the sleep(0.1) in the larson example prevents it from working. I'll keep investigating this. Might need to get the scope on it.

Gadgetoid commented 5 years ago

Do you have any references for the code behind this change? I did some digging through CPython on GitHub but couldn't turn up anything like a smoking gun. There were a few changes ~4 years ago, which I believe corresponds to the release of 3.5.

On Linux it looks like it leans on select() for a portable method of high-resolution delays.

ali1234 commented 5 years ago

No, I could not find the specific change. It would require a bisect as there are loads of changes.

Gadgetoid commented 5 years ago

larson.py still works for me with python 3.5.3 but has a noticeably lower framerate using time.sleep(0.0000005). Using time.sleep(0) does appear to drastically improve it.

Actually measuring this - with the main loop time.sleep(0.1) removed - results in:

python 3.5.3 - time.sleep(0) - ~260 FPS
python 3.5.3 - time.sleep(0.0000005) - ~22 FPS
python 2.7.13 - time.sleep(0) - ~320 FPS
python 2.7.13 - time.sleep(0.0000005) - ~320 FPS
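For anyone wanting to repeat the comparison, a frame rate like this can be estimated by counting how many updates complete in a fixed window. This is a sketch of one way to do it, not necessarily how the figures above were obtained; `measure_fps` and `draw_frame` are hypothetical names, and `time.perf_counter` needs Python 3.3+ (use `time.time` under 2.7).

```python
import time

def measure_fps(draw_frame, duration=2.0):
    """Call draw_frame() repeatedly for `duration` seconds and return frames per second."""
    frames = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration:
        draw_frame()
        frames += 1
    return frames / (time.perf_counter() - start)

if __name__ == "__main__":
    # Stand-in workload; on a Pi you would pass the display update call instead.
    print("%.0f FPS" % measure_fps(lambda: time.sleep(0.001)))
```

On real hardware the callable would be whatever pushes a frame to the LEDs, for example `blinkt.show` after setting the pixels.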

ali1234 commented 5 years ago

That is what I was seeing before, but now everything has gone weird. I'm making a fresh image and re-testing.

ali1234 commented 5 years ago

The weirdness was caused by having them on a 20cm cable. Everything works fine with no cable. The slow sleep just causes slow framerate. Also noticed some other weird stuff not really related to this bug, see discord.

pimoroni / blinkt

Slow bitbanging with python 3.5+ #72