Closed joukos closed 4 years ago
Tested with PaperTTY & VNC on Waveshare 9.7" with IT8951 :
Video Waveshare 9.7" eink displaying Midori browser with PaperTTY
Wow, that's beautiful, thanks!
@math85360, is it okay to add the image (and link to the video) to the README?
Yes, of course !
Very nice, hope the 7.8 inch works well also
I'm getting a 7.8 inch screen soon, I'll report back if everything works fine. It should arrive in about a month.
@jmi2k cool, thanks :) Good luck!
Here is a terminal on the 7.8 using a raspberry pi 4 running archlinuxarm.
The CPU usage goes up to 100% periodically and even partial refreshes only happen every two seconds or so. Is that to be expected? The only thing I changed was the VCOM to match my display (-1.38V).
Right now the only SD card I have is 2GB so I'll have to wait until next week.
Thanks for letting us know @jeLee6gi !
The CPU usage goes up to 100% periodically and even partial refreshes only happen every two seconds or so. Is that to be expected?
Well, I haven't actually tried PaperTTY on a Pi4 and it's very likely there's room for optimization to at least make better use of the multiple cores. Do you use it in TTY or VNC mode? I don't think it should take very long to process an updated frame, but with a display that big, simply sending the data to it might take a while. With partial refresh there shouldn't be too much data to send though...
I've done some profiling, because I use a Pi Zero (with a 6" display) and the problem is even more pronounced there. I only use TTY mode. It seems pack_image in driver_it8951.py is rather slow. I've done some optimization, but still need to test. I'll share it when I'm confident that it still works.
@chi-lambda thanks, performance improvement is always nice, hope it works. I wish we had some unit tests though so it would be easier to avoid anything breaking... ;)
Well, I haven't actually tried PaperTTY on a Pi4 and it's very likely there's room for optimization to at least make better use of the multiple cores. Do you use it in TTY or VNC mode? I don't think it should take very long to process an updated frame, but with a display that big, simply sending the data to it might take a while. With partial refresh there shouldn't be too much data to send though...
I only used TTY mode so far
I've done some profiling, because I use a Pi Zero (with a 6" display) and the problem is even more pronounced there. I only use TTY mode. It seems pack_image in driver_it8951.py is rather slow.
I also tried to profile:
import time
for i in range(1_000_000):
    print(i)
    time.sleep(1)
I ran this program until it started scrolling the terminal and then profiled with py-spy for two minutes. Whenever spi_write was running, py-spy had trouble getting samples, so realistically it spent even more time in spi_write than it recorded. Anyway, this is how it spends its time, and when it's scrolling like this I get a new image roughly every 8 to 15 seconds.
I used cProfile. Sadly, testing will have to be postponed as my Raspberry Pi seems to have turned into a Raspberry Fry. I'll push the code to my repository later though.
Ultimately, I think it would be ideal to reimplement parts of PIL/Pillow to directly write the output format.
Check out pull request #38.
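For anyone who wants to repeat the cProfile measurement mentioned above, here is a minimal sketch. It assumes nothing about the real driver: pack_image_stub is a hypothetical stand-in for pack_image, and the pixel list is made up. Only cProfile and pstats from the standard library are used.

```python
import cProfile
import io
import pstats


def pack_image_stub(pixels):
    """Hypothetical stand-in for pack_image: some per-pixel Python work."""
    return [0xFF if p else 0x00 for p in pixels]


def profile_call(func, *args):
    """Run func under cProfile and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)
    return out.getvalue()


# 800x600 pixels, alternating black and white (made-up test data).
report = profile_call(pack_image_stub, [0, 255] * 240_000)
print(report)
```

Sorting by cumulative time puts the slowest call chain at the top, which is usually enough to see whether packing or the SPI write dominates.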
Alright, I got out my spare Pi 1B and the effect is even more dramatic.
Configuration: 6" display, 800x600, configured for 72 columns and 27 rows, using Terminus 11x22 as a pil font.
Ran "lorem -p5" twice, essentially triggering a full display refresh every time.
Average run time for pack_image with old code: 14.7 seconds. Average run time for pack_image with new code: 1.4 seconds(!!)
Still not really enough for fluent typing, but it's a start.
That's a great speedup! But will this produce the same output as the old code and will it work with the other displays too?
Purely visually speaking, the output is fine. The three changes I made are:
The code makes the bold assumption that the image has a number of pixels divisible by 4, but I seem to recall that it's always padded to a multiple of 8 anyway.
There's still a lot of potential for further speedup by breaking up updates into smaller pieces, but this would have to be done in papertty.py. I think finding the optimal subdivision is NP-hard though.
The changes only affect IT8951 devices.
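To make the "divisible by 4" assumption concrete, here is a rough sketch of packing black/white pixels four at a time. This is not the code from the pull request: the function name and the byte layout (two 4-bit pixels per byte, each pair of bytes swapped as one 16-bit word) are assumptions chosen for illustration only.

```python
def pack_bw_pixels(pixels):
    """Pack black/white pixels into 4-bit nibbles, two pixels per byte.

    ASSUMPTION (illustrative layout, not taken from driver_it8951.py):
    every group of four pixels becomes one 16-bit word whose two bytes
    are emitted in swapped order.
    """
    if len(pixels) % 4:
        raise ValueError("pixel count must be divisible by 4")
    packed = bytearray(len(pixels) // 2)
    for i in range(0, len(pixels), 4):
        p0, p1, p2, p3 = pixels[i:i + 4]
        # Any non-zero pixel counts as white (nibble 0xF), zero as black.
        packed[i // 2] = (0xF0 if p3 else 0x00) | (0x0F if p2 else 0x00)
        packed[i // 2 + 1] = (0xF0 if p1 else 0x00) | (0x0F if p0 else 0x00)
    return bytes(packed)


print(pack_bw_pixels([255, 0, 0, 0]).hex())
print(pack_bw_pixels([0, 0, 255, 255]).hex())
```

Stepping in groups of four means there is no per-pixel bounds check inside the loop, which is why the pixel count has to be a multiple of 4 (or padded to one) up front.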
Okay, let's hope there's no quirks and I'll merge it since it's such a significant boost. I added a tag for the old code (v0.03_unoptimized) in case it doesn't work for someone.
Thanks a lot for this contribution!
I will soon test also on 9.7 inch, nice to see these efforts to enhance the performance, thank you
I found a little wrinkle in my new code: It assumes that the input is 1bpp. That's true for TTY (which could be reviewed, incidentally), but not for VNC. Maybe we should create two different draw methods or a branch in the existing one.
Hmm, well that's a bit bad for the VNC feature... I can't really test it myself or have time to spend on guesswork right now, so if you or someone else is willing to implement some quick fix so that it uses the optimized one for TTY only, it would be very appreciated :)
Best would of course be that the draw method is optimized for both cases, but that can be done gradually. Again, thanks for your effort on this.
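One way the "branch in the existing one" idea could look, as a sketch only: pick the fast path when the image mode is "1" (Pillow's name for 1bpp) and fall back to a general path otherwise. All function names here are hypothetical, the pixel data is passed in as a plain list so the example runs without Pillow, and the byte layout is an assumption made for illustration.

```python
def pack_gray(pixels):
    """General path: 8-bit gray values, keeping the top 4 bits of each."""
    out = bytearray()
    for i in range(0, len(pixels), 4):
        p0, p1, p2, p3 = (p >> 4 for p in pixels[i:i + 4])
        # Assumed layout: two pixels per byte, byte pairs swapped.
        out.append((p3 << 4) | p2)
        out.append((p1 << 4) | p0)
    return bytes(out)


def pack_bw(pixels):
    """Fast path: every pixel is either black (0) or white (non-zero)."""
    out = bytearray()
    for i in range(0, len(pixels), 4):
        p0, p1, p2, p3 = pixels[i:i + 4]
        out.append((0xF0 if p3 else 0x00) | (0x0F if p2 else 0x00))
        out.append((0xF0 if p1 else 0x00) | (0x0F if p0 else 0x00))
    return bytes(out)


def pack(mode, pixels):
    """Dispatch on the image mode: '1' for TTY frames, anything else for VNC."""
    if len(pixels) % 4:
        raise ValueError("pixel count must be divisible by 4")
    return pack_bw(pixels) if mode == "1" else pack_gray(pixels)


frame = [0, 255, 255, 0, 255, 255, 0, 0]
print(pack("1", frame) == pack("L", frame))
```

For a pure black/white frame both paths should give the same bytes, which makes a cheap regression check until there are proper unit tests.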
Slightly unrelated question, but do you guys know if the IT8951 Driver Hat from Waveshare has special firmware/settings in the SPI flash for the different panel sizes/resolutions?
I've only seen a function that reads the panel size from the controller but none to set it. Maybe it's stored in the flash?
Perhaps some of you have different panels on hand and can try if they work with the same IT8951 board? Or even dump the SPI flash from different boards and compare?
@joukos The packing algorithm can be sped up a lot (about 100x) by implementing it in C. Would you consider that or do you want to keep it pure Python?
The effect is less noticeable on a Pi4 (.2* vs .002 seconds) than on older models and Zeros, where rendering times can be in the range of seconds. Right now, my code spends the most time (.3 seconds) on loading the image into an array, which could potentially be optimized more; and sending the image to the device (usually 1, but sometimes up to 1.5 seconds for 800x600; much shorter for smaller updates), which is pretty much out of our hands.
* .2 seconds on an already optimized version of the code. The one in the linked commit runs much slower.
Can numba do this? Maybe this is easier than having to build C code.
I'd prefer to keep things Python and as simple as possible, unless there's a big enough gain to justify any extra dependencies or complexity. I think the fact that the program is in Python makes it easier for others to quickly get to know the code and improve it in its current early stages. In the end the most limiting factor is simply the speed of transmitting image data to the display via SPI, and yeah, there's not much that can be done about that with the current supported displays.
That said, the processing part can and should be improved where applicable, but I'd say that the gains need to be fairly significant to justify spending too much time on them at this time. If a whole second out of a 3 second refresh time can be shaved off by simply better code, that's great, but for example needing to add Fortran and NumPy as dependencies to shave 100 ms off the same 3 seconds is not worth it in my opinion. Lean and mean is preferred.
I'm not saying it's bad to optimize the bit twiddling if one is willing to put effort into it, but I think the usability issues are more of a priority, and if we want to go very low level with this, might as well turn it into a kernel module and at that point, we'd be working for Waveshare maybe ;)
Thank you both for the code you've provided, I'll try to get them merged as soon as possible, personal life is just getting seriously in the way right now (though in a good way).
@jeLee6gi How did you install py-spy? I tried and it doesn't work:
(papertty) pi@raspberrypi:~ $ pip install py-spy
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
ERROR: Could not find a version that satisfies the requirement py-spy (from versions: none)
ERROR: No matching distribution found for py-spy
@gdkrmr Good question! I was going to compile it first (py-spy is mostly written in rust) but then I got tired of satisfying compilation dependencies and just downloaded the armv7 binary from their github releases page. :grin:
That py-spy looks pretty neat, I should try it out too. Side note about the compilation issues: shouldn't cargo install py-spy handle all the (Rust) dependencies automatically or are there some problems building it on arm? For what it's worth, I got it installed on Ubuntu 18.04 which has a Rust environment installed with Rustup by doing:
sudo apt install libunwind-dev # at the very end, the linker complained about not finding this
cargo install py-spy
I can confirm the 10.3 inch screen works fine. Still setting it up, but I want to use it as a daylight-visible chart plotter on a sail boat.
@C-Rothnie thanks for letting us know (and for the nice image)!
@C-Rothnie
What's the update speed like on that huge display?
I want it slow in fact - I am happy with a 5 second update for my sailing application. Things don't change much faster than that on the water. The testing I have done is with a 1 second sleep and even so, it does a partial refresh on the changed area after a second or two. I am using a Raspberry Pi 4. I don't intend to use the e-ink screen in an interactive mode very much - just display marine charts, boat position, other nearby boats etc in the chart plotting application OpenCPN. I will give further feedback after I have finished setting it up.
Cool! I seem to remember taking a look at OpenCPN a few years ago and thought it would be ideal to have an e-ink with it, but back then I didn't have such a display (and they weren't really available anyway). I'm interested in knowing how your project turns out!
@gdkrmr Good question! I was going to compile it first (py-spy is mostly written in rust) but then I got tired of satisfying compilation dependencies and just downloaded the armv7 binary from their github releases page.
Thanks, works now! I didn't realize that this was a standalone program, I just assumed that this worked somehow like the debugger, python -m pdb ... .
Just for the record: I had to run sudo py-spy record -o profile.svg --pid xxx, because running python as a child process, py-spy -o profile.svg -- ~/.virtualenvs/..., crashed, left the child process alive, and the display unusable until killing the python process manually.
Right now, the SPI uses a speed of 2 MHz (search for self.SPI.max_speed_hz in driver_it8951.py), which I guess was chosen rather arbitrarily. I've managed to raise it to 18 MHz (20 doesn't work), and hoo boy does that speed up the transfer step. Could you other IT8951 owners test what works for you?
@C-Rothnie @jeLee6gi @math85360
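As a back-of-the-envelope check on why the clock matters so much, here is a small sketch that estimates the minimum time to push one frame over SPI. It assumes 4 bits per pixel and ignores all command and protocol overhead, so real transfers will be slower; the function name is made up for this example.

```python
def min_transfer_seconds(width, height, clock_hz, bits_per_pixel=4):
    """Lower bound: one bit per SPI clock cycle, no overhead at all."""
    return width * height * bits_per_pixel / clock_hz


for mhz in (2, 8, 18):
    seconds = min_transfer_seconds(800, 600, mhz * 1_000_000)
    print(f"{mhz:>2} MHz: {seconds:.3f} s for a full 800x600 frame")
```

Under these assumptions a full 800x600 frame needs 1,920,000 clock cycles, so the time drops in inverse proportion to the clock frequency.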
I did some testing with the 4.2 inch monochrome display and times seemed the same, https://github.com/joukos/PaperTTY/pull/40#issuecomment-578510436, maybe I did something wrong. I could crank it up to 40MHz on a Pi4.
I was also wondering if perhaps there was something wrong with either the measurement or actually setting the speed, since the flamegraphs seemed peculiarly near-identical with the 4.2"...
In any case, great to hear that at least the IT8951 may benefit from that! @chi-lambda, which Pi version (assuming a RPi) did you use and can you give some rough numbers on the speed-up?
I was also wondering if perhaps there was something wrong with either the measurement or actually setting the speed, since the flamegraphs seemed peculiarly near-identical with the 4.2"...
It did "something", because it actually started failing at 50MHz, maybe it's the display.
I tested on a Pi Zero W, speed-up from 1–1.5 seconds to about .2–.3 seconds for a full (800x600) update. The initializing update is ridiculously slow (over 10 seconds at 2 MHz) for some reason, but also scales about (inverse) linearly with frequency. Pi Zero has so far been about 50 percent slower sending data than my Pi4, which is also just about the ratio of the CPU frequency. Haven't checked the higher frequency on the Pi4 yet.
In unrelated news, there's a new IT8951 display: https://www.waveshare.com/6inch-hd-e-paper-hat.htm
Well, that's certainly a huge speedup, especially for a measly Zero. We've come quite far from the early video in the first comment in this issue, where a Zero W seems to struggle quite a bit to update the display (though it's probably bogged down by the browser too)...
In unrelated news, there's a new IT8951 display: https://www.waveshare.com/6inch-hd-e-paper-hat.htm
I ordered one a week ago ;)
I've been meaning to mess with the clock speed and other SPI related things because in my measurements it looked like most time was spent in the SPI library. I was going to follow this blog post which has lots of technical stuff about how to efficiently use SPI.
If I remember correctly, the only thing I tried back then was to remove the max_transfer_size and the loop that calls SPI.writebytes on each chunk and replace it with a big numpy array containing the frame and SPI.writebytes2 which resulted in ~60% fewer writes. It seemed to help but I didn't test it too much.
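To illustrate the chunked-write loop being discussed, here is a hardware-free sketch. The write argument is any callable that takes a bytes-like chunk; on a real Pi it could be something like a spidev writebytes method, but that wiring is an assumption and not shown here. The 4096-byte default is also just an assumed value for the example.

```python
def write_chunked(write, data, max_transfer_size=4096):
    """Call write() once per chunk of data and return the number of calls."""
    calls = 0
    for start in range(0, len(data), max_transfer_size):
        write(data[start:start + max_transfer_size])
        calls += 1
    return calls


# Stand-in for the SPI device: just record what would have been sent.
sent = []
frame = bytes(240_000)  # one 800x600 frame at an assumed 4 bits per pixel
n_calls = write_chunked(sent.append, frame)
print(n_calls, "write calls for", len(frame), "bytes")
```

Each call into the SPI library has a fixed cost, so fewer, larger writes (or a single call that does the splitting internally) can cut the overhead even when the clock speed stays the same.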
After letting it run for about half a day, I wasn't able to restart it at higher frequencies. 8 MHz would work, but not 12 or more. Restarting the Pi seems to have fixed it. Just something to keep in mind.
I finally got an IT8951 display of my own (6" HD) and it works too. For anyone interested, a couple of poor quality pics until I have a chance to try it out some more:
(The Blake Stone window there is 640x400...)
Since the IT8951 support seems to be working pretty good and the boxes are ticked, I'll close this issue now. Thanks all!
Hello, this project is amazing! Trying to use my 7.8 inch 1872x1404 Waveshare display as a sunlight-readable dashboard, logger, and interface to retune the motor controller on my DIY electric bike.
Haven't been able to get it to work so far. I'm using a raspberry pi 4B, raspbian lite (will eventually move to piCore), and x11vnc.
However, even sudo papertty --driver IT8951 scrub results in no change on the new display, but it seems to identify it, returning:
width = 1872 height = 1404 img_addr = 00122520 firmware = SWv_0.2. lut = M841_TFA2812 VCOM = -2.00V
Any idea where to start debugging?
There should only be one application controlling the display at a time. What's this other driver you are running?
I had installed BCM2835 libraries and IT8951 drivers as described in Waveshare's wiki. I uninstalled them, rebooted, and tried again-- get the same CLI output but nothing on the display. I'll flash a new raspbian image and try again, maybe something is left behind.
Comment on this issue for your experiences with the IT8951 support.
Tested: