david-res / ILI948x_t41_p

A basic display driver for ILI948X series on a Teensy 4.1
Speed compared to SPI #2

Open rjonkman opened 1 year ago

rjonkman commented 1 year ago

I'm looking for a ballpark estimate of the throughput for a 16-bit parallel configuration. I have an 800x480 RA8875 board. Drawing an entire image (pushing 800x480 uint16_t values) takes about 0.5 seconds using SPI, and I'm trying to figure out whether it's worth adapting this driver for that controller. Do you have any insights?

david-res commented 1 year ago

A single SPI write cycle sends 1 bit, while a single write cycle on a 16-bit parallel bus sends 16 bits. So in theory it's 16 times faster: a full-screen write would take about 30 ms (0.5 s / 16), which is around 30 FPS.

If I read the RA8875 datasheet correctly, it supports a maximum bus speed of 50 MHz, and the library can do 40 MHz right now. Quick math: 40000000 Hz / (800*480 px) = 104.1 fps. At a bus speed of 20 MHz it can do 52 fps.

so it should be much much faster than SPI, I can guarantee that
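
The quick math above can be sketched as a small helper. This is only an upper bound: it assumes one 16-bit pixel per write cycle and ignores command and setup overhead. The function names are hypothetical and are not part of the library.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on full-frame refreshes per second, assuming the bus moves
// one 16-bit pixel per write cycle and there is no command overhead.
inline double theoreticalFps(double busHz, uint32_t width, uint32_t height) {
    assert(busHz > 0.0 && width > 0 && height > 0);
    return busHz / (static_cast<double>(width) * height);
}

// Milliseconds needed to push one full frame under the same assumptions.
inline double frameTimeMs(double busHz, uint32_t width, uint32_t height) {
    assert(busHz > 0.0);
    return 1000.0 * static_cast<double>(width) * height / busHz;
}
```

For example, theoreticalFps(40e6, 800, 480) reproduces the 104 fps figure, and frameTimeMs(40e6, 800, 480) gives 9.6 ms per frame.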

rjonkman commented 1 year ago

I don't think anything is guaranteed, but I guess it's worth trying out. I configured my RA8875 for 16-bit parallel, hooked it up to a Teensy 4.1 and started editing. No luck so far. Quick question: for the RA8875 initialization, I often need to write to a register. Should I use SglBeatWR_nPrm_8() to do that? Something like...

uint8_t reg = 0x10;   // register address
uint8_t data = 0x01;  // value to write to that register
SglBeatWR_nPrm_8(reg, &data, 1);

david-res commented 1 year ago

Correct, that’s how you would send a command.

rjonkman commented 1 year ago

Cheers, thanks for your input and help. It took me about 3 hours to hook everything up, but I suspect it'll be 3 days (hopefully not weeks) until the first pixel is lit.

david-res commented 1 year ago

Took me about 3 weeks to get FlexIO running (with help from the forum, thankfully)! Just make sure all the pins are hooked up correctly and you should be good.

rjonkman commented 1 year ago

I noticed that the library doesn't set the READ pin high in the setup method.

david-res commented 1 year ago

You can add a

pinMode(PIN_X,OUTPUT);
digitalWrite(PIN_X, HIGH);

right after FlexIO pin config for the RD PIN number

rjonkman commented 1 year ago

Amazingly, I managed to draw a small rectangle on the screen. It was supposed to be a fullscreen rectangle, but at least I was able to draw something. I spent a lot of time scratching my head, looking at the signals with my scope because there are lots of errors in the RA8875 datasheet with regards to how to implement its communication protocol. Hopefully tomorrow I'll be able to send fullscreen images and I can see what kind of fps I can get out of this thing.

rjonkman commented 1 year ago

So my rough guess is that you can push an 800x480 frame buffer from EXTMEM to the RA8875 in 16-bit parallel mode in about 65 ms, running at 8 MHz. That works out to about 15 frames / second.

Trying to push beyond 12MHZ has resulted in display errors. I know there are some limitations to how fast the RA8875 can receive data, but I'll spend some extra time seeing if I can get it going a bit faster. 12 MHZ would be nice.
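
As a rough cross-check of these numbers, the sketch below converts a measured frame time into fps and compares it with the ideal transfer time. It assumes one pixel per bus cycle, which the thread does not confirm for the RA8875, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Frames per second for a measured per-frame time in milliseconds.
inline double fpsFromFrameTime(double frameMs) {
    assert(frameMs > 0.0);
    return 1000.0 / frameMs;
}

// Fraction of the measured frame time that pure pixel transfers would need,
// assuming (hypothetically) one 16-bit pixel per bus cycle.
inline double busUtilization(double busHz, uint32_t width, uint32_t height,
                             double measuredMs) {
    assert(busHz > 0.0 && measuredMs > 0.0);
    const double idealMs = 1000.0 * static_cast<double>(width) * height / busHz;
    return idealMs / measuredMs;
}
```

With 65 ms at 8 MHz, this gives about 15.4 fps and a utilization of roughly 0.74, which suggests that around a quarter of the frame time goes to overhead rather than pixel data under that assumption.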

rjonkman commented 1 year ago

After removing various unneeded microsecondDelay calls (at least unneeded at 8MHZ), I've managed to get over 16.8 fps.

david-res commented 1 year ago

You can try increasing the PSRAM bus speed to 133 MHz, if you haven't already, to squeeze out 2-3 more fps. But that's your bottleneck. You can also try fitting two partial frame buffers in RAM2 and filling one while the other is written to the display.
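
A minimal sketch of the two-buffer idea, assuming a hypothetical FakeDisplay stand-in instead of the real driver. On hardware the write of one buffer would run asynchronously while the CPU fills the other; here both steps run one after the other, so only the buffer rotation is illustrated.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the display: records pixels in arrival order.
struct FakeDisplay {
    std::vector<uint16_t> received;
    void write(const uint16_t* px, size_t count) {
        received.insert(received.end(), px, px + count);
    }
};

// Push a frame in strips, alternating between two strip-sized buffers.
inline void pushFramePingPong(const std::vector<uint16_t>& frame,
                              size_t stripPixels, FakeDisplay& display) {
    assert(stripPixels > 0);
    std::vector<uint16_t> buffers[2] = {
        std::vector<uint16_t>(stripPixels),
        std::vector<uint16_t>(stripPixels)};
    size_t active = 0;
    for (size_t offset = 0; offset < frame.size(); offset += stripPixels) {
        const size_t count = std::min(stripPixels, frame.size() - offset);
        const auto first = frame.begin() + static_cast<std::ptrdiff_t>(offset);
        // "Fill" step: copy (or render) the next strip into the active buffer.
        std::copy(first, first + static_cast<std::ptrdiff_t>(count),
                  buffers[active].begin());
        // "Write" step: hand the filled buffer to the display.
        display.write(buffers[active].data(), count);
        active ^= 1;  // the other buffer is filled next
    }
}
```

In a real driver the write call would return immediately and the loop would wait for the previous transfer to finish before reusing a buffer.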

rjonkman commented 1 year ago

I'll see if speeding up PSRAM helps.

I just tested using the pushPixels16bitAsync(). I only managed to get 11 fps, but the CPU was idling 99.9% of the time, so that will be handy for getting the next frame ready.

Is there any way to speed that up? It's also not quite drawing correctly, so I'll need to investigate some more.

rjonkman commented 1 year ago

Placing my framebuffer in DMAMEM didn't change the fps, not in the blocking method or the async one, so there's no point in trying to increase the PSRAM clock frequency.

rjonkman commented 1 year ago

I think the reason the async method is taking so long is that it only works properly for an 8-bit bus, but I'm using 16 bits, so it's actually sending way too much data. Is that possible?
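
To illustrate the suspicion, here is a sketch of how the number of bus transfers depends on bus width for RGB565 pixels. It is not the library's code, and busBeats is a hypothetical name. If a transfer count computed for an 8-bit bus is used on a 16-bit bus, twice as many beats are sent as needed.

```cpp
#include <cassert>
#include <cstdint>

// Number of bus write cycles ("beats") needed to send RGB565 pixels
// over a parallel bus of the given width.
inline uint32_t busBeats(uint32_t pixelCount, uint8_t busWidthBits) {
    assert(busWidthBits == 8 || busWidthBits == 16);
    const uint32_t bytes = pixelCount * 2u;  // RGB565: 2 bytes per pixel
    return bytes / (busWidthBits / 8u);
}
```

For a full 800x480 frame this is 768000 beats on an 8-bit bus but only 384000 on a 16-bit bus.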

rjonkman commented 1 year ago

So after fixing the async issue, I'm up to 21.18 fps with the MCU idling 99.98% of the time. Awesome! That gives me plenty of resources to render frames and even implement double buffering and differential updating. I've been looking for so long for a way to get a high-performance UI on a good-sized LCD. Thank you so much for making this driver.

david-res commented 1 year ago

Feel free to merge some of your fixes, or even publish a new library for the RA8875+T4.1

rjonkman commented 1 year ago

I'm going to clean things up and then separate the FlexIO implementation from the rest of the driver. I'll post a library when I'm done.

On Sat., Sep. 23, 2023, 16:37 david-res, @.***> wrote:

Feel free to merge some of your fixes, or even publish a new library for the RA8875+T4.1

rjonkman commented 1 year ago

So I managed to combine the driver with the TGX graphics library, and the results aren't good. Having TGX draw to a screen buffer in PSRAM is just way too slow, even with the PSRAM overclocked to 132 MHz, bringing my effective frame rate down to about 5 fps. It's good to learn a bit about FlexIO and parallel ports, but ultimately I should have tested the speed of TGX when writing to PSRAM to start with.

Unfortunately, without a way to place the entire framebuffer in RAM or RAM1, there's not much that can be done to speed things up. Very disappointing. I guess I'll need to wait for Teensy 4.2.

david-res commented 1 year ago

Ditch TGX and use LVGL to render your UI. Use two partial screen sized frame buffers in DMAMEM - you’ll get much better results. On an ILI9488 with DMA on an 8 bit bus I easily get 45 FPS with heavy animations

If you’re up to it, make a custom teensy PCB and expose all of the FlexIO2 pins - they happen to also be the eLCDIF pins (this is what I have done)

rjonkman commented 1 year ago

I'm not sure how I feel about switching to LVGL.

You're saying it will allow me to combine two separate frame buffers into one larger one?

Another approach would be to stick with TGX and break my screen up into sections, each with its own buffer and instance of tgx.

rjonkman commented 1 year ago

If you’re up to it, make a custom teensy PCB and expose all of the FlexIO2 pins - they happen to also be the eLCDIF pins (this is what I have done)

I'm quite capable of making custom Teensys. I'm not familiar with eLCDIF. Is this something that would allow me to communicate directly with 40-pin displays, without needing a separate display controller?

david-res commented 1 year ago

LVGL can use one or two partial or full screen sized frame buffers. It handles rotating them so you just give it a callback to write to the display. It will write to the display in chunks when a full screen refresh is needed and if small areas only need changing it will update them only.

yes the 1062 has its own lcd controller on board so you could use it to directly drive a dumb RGB display without a controller in between

rjonkman commented 1 year ago

yes the 1062 has its own lcd controller on board so you could use it to directly drive a dumb RGB display without a controller in between

Interesting. Probably won't be useful for 800x480 until the amount of RAM increases a bit.
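
The RAM concern can be checked with simple arithmetic. The sketch below assumes 16-bit (2-byte) pixels and a 512 KiB RAM2 (DMAMEM) region on the Teensy 4.1. The names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: Teensy 4.1 RAM2 (DMAMEM) region size in bytes.
constexpr uint32_t kRam2Bytes = 512u * 1024u;

// Bytes needed for a frame buffer; defaults to RGB565 (2 bytes per pixel).
inline uint32_t framebufferBytes(uint32_t width, uint32_t height,
                                 uint32_t bytesPerPixel = 2) {
    assert(width > 0 && height > 0 && bytesPerPixel > 0);
    return width * height * bytesPerPixel;
}

// True when a buffer of the given size fits in the given memory region.
inline bool fitsIn(uint32_t bytes, uint32_t regionBytes) {
    return bytes <= regionBytes;
}
```

A full 800x480 frame needs 768000 bytes, which is more than the 524288 bytes available in that region.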

I briefly looked at LVGL and unfortunately it's missing a few things I need (drawing Bezier curves, for example), so I will stick with TGX. I have a lot built up on top of TGX, and it works in a similar fashion to LVGL, only updating the parts of the screen that need refreshing. I will borrow their idea, however, and render the screen in sections. As long as my buffer is bigger than the largest section, it should work quite nicely.
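
Rendering in sections could look something like the sketch below, which splits the screen into full-width horizontal bands that each fit a fixed render buffer. This is one possible layout and not what TGX or LVGL actually do. Band and splitIntoBands are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One full-width horizontal section of the screen.
struct Band {
    uint32_t y;     // first row of the band
    uint32_t rows;  // number of rows in the band
};

// Split a screen into bands so that each band fits in a render buffer
// holding bufferPixels pixels.
inline std::vector<Band> splitIntoBands(uint32_t width, uint32_t height,
                                        uint32_t bufferPixels) {
    assert(width > 0 && bufferPixels >= width);
    const uint32_t rowsPerBand = bufferPixels / width;
    std::vector<Band> bands;
    for (uint32_t y = 0; y < height; y += rowsPerBand) {
        bands.push_back({y, std::min(rowsPerBand, height - y)});
    }
    return bands;
}
```

With an 800x480 screen and a buffer that holds 200 full rows, this yields three bands of 200, 200 and 80 rows.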

rjonkman commented 1 year ago

I have things running very snappy now, using a 700x320 render buffer in DMAMEM, and 800x480 buffer in EXTMEM for figuring out which portions of the screen need updating. With heavy animation I'm pushing about 20 updates / second, and light animation around 60 fps. Pretty pleased with that.
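
One way to work out which portion of the screen needs updating is to compare the newly rendered pixels against a copy of what is currently shown and take the bounding box of the differences. The sketch below assumes both buffers cover the same area. The thread does not show how the comparison is actually done, and the names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct DirtyRect {
    int x0, y0, x1, y1;  // inclusive bounds; meaningful only when changed
    bool changed;
};

// Bounding box of all pixels that differ between the displayed frame
// and the newly rendered frame (both row-major, width * height pixels).
inline DirtyRect dirtyBounds(const std::vector<uint16_t>& shown,
                             const std::vector<uint16_t>& rendered,
                             int width, int height) {
    assert(width > 0 && height > 0);
    assert(shown.size() == static_cast<size_t>(width) * height);
    assert(rendered.size() == shown.size());
    DirtyRect r{width, height, -1, -1, false};
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const size_t i = static_cast<size_t>(y) * width + x;
            if (shown[i] == rendered[i]) continue;
            r.x0 = std::min(r.x0, x);
            r.y0 = std::min(r.y0, y);
            r.x1 = std::max(r.x1, x);
            r.y1 = std::max(r.y1, y);
            r.changed = true;
        }
    }
    return r;
}
```

Only the pixels inside the returned rectangle would then need to be pushed to the display.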

Next, I will borrow that idea from LVGL and make it so I can render parts of the screen at a time.

rjonkman commented 1 year ago

One last post from me on this matter. I've since switched to an SSD1963-based TFT. It's still a 5" display, 800x480. It runs more than 2x as fast as the RA8875.