ivmarkov / rust-esp32-std-demo

Rust on ESP32 STD demo app. A demo STD binary crate for the ESP32[XX] and ESP-IDF, which connects to WiFi, Ethernet, drives a small HTTP server and draws on a LED screen.
Apache License 2.0

DMA SPI for LCD example #140

Open mortiy opened 1 year ago

mortiy commented 1 year ago

Hi, I can't get enough FPS on a 320x240 screen. Could you please add a DMA example for driving an SPI LCD?

I've started a Reddit thread about it, but no luck there so far: https://www.reddit.com/r/rust/comments/zthwzj/low_fps_on_esp32_lcd

georgik commented 1 year ago

Hi @mortiy. I recommend the mipidsi crate (https://github.com/almindor/mipidsi) for the LCD driver. Also make sure to boost the CPU to 240 MHz and SPI to 60 MHz. There is also DMA support, but we didn't notice gain in frame rate due to some problem in timing of signal on SPI. Here is a bare-metal example that has a decent frame rate for full-screen redraws: https://github.com/georgik/esp32-spooky-maze-game

Another option is to use a parallel transfer to the display, but we do not have a Rust example yet.

mortiy commented 1 year ago

Hi @georgik,

Thanks, but my clock is already maxed at 240 MHz and SPI at 60 MHz. spooky-maze is the exact codebase I'm playing with. For that kind of gameplay the frame rate may be acceptable, but as soon as I try to add some smooth animation it becomes noticeably slow. Just clearing the full background takes too much time, so I believe I need to find a way to make DMA work here.

I'll keep struggling with it for a while longer.

rcloran commented 9 months ago

> There is also DMA support, but we didn't notice gain in frame rate due to some problem in timing of signal on SPI.

As far as I can tell this is a problem in the interfaces between display-interface and mipidsi. mipidsi turns every write into an iterator, which cannot be DMA'd -- it always ends up calling the send_data in display-interface(-spi) with U16BEIter.
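A rough sketch of why the iterator path is hard to DMA, under stated assumptions: `CountingBus`, `send_iter`, and `send_slice` are hypothetical names, not part of display-interface or mipidsi, and the 64-byte staging buffer is an illustrative size rather than the value any real driver uses. The point is only that an iterator has to be drained pixel by pixel into small chunks, while a contiguous slice can be handed over in one transfer.

```rust
/// Hypothetical stand-in for an SPI bus that records each transfer it is asked to make.
#[derive(Default)]
struct CountingBus {
    transfers: usize,
    bytes: usize,
}

impl CountingBus {
    fn write(&mut self, chunk: &[u8]) {
        self.transfers += 1;
        self.bytes += chunk.len();
    }
}

/// Iterator path: pixels arrive one at a time, so they are staged into a
/// small buffer and flushed chunk by chunk.
fn send_iter(bus: &mut CountingBus, pixels: impl Iterator<Item = u16>) {
    let mut staging = [0u8; 64]; // chunk size is an assumption for illustration
    let mut used = 0;
    for pixel in pixels {
        // Big-endian byte order, as the U16BEIter name suggests.
        staging[used..used + 2].copy_from_slice(&pixel.to_be_bytes());
        used += 2;
        if used == staging.len() {
            bus.write(&staging);
            used = 0;
        }
    }
    // Flush whatever is left over after the last full chunk.
    if used > 0 {
        bus.write(&staging[..used]);
    }
}

/// Slice path: one contiguous buffer, handed over in a single call.
fn send_slice(bus: &mut CountingBus, bytes: &[u8]) {
    bus.write(bytes);
}
```

With these assumed sizes, a full 320x240 frame goes out as 2400 separate 64-byte writes on the iterator path and as a single 153,600-byte write on the slice path.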

I've worked around this by initializing my display with mipidsi, and then getting the di back from it:

let mut lcd = mipidsi::Builder::ili9342c_rgb565(...)...;
lcd.clear(Rgb565::BLACK).unwrap();  // Call once to ensure the window is set to the full screen

let (di, _, _) = lcd.release();
let mut dcs = Dcs::write_only(di);

Then I set up an embedded-graphics-framebuf, and can DMA from that:

// Allocate an initialized framebuffer on the heap, so no uninitialized memory is ever read.
let mut data: Vec<Rgb565> = alloc::vec![Rgb565::BLACK; 320 * 240];
let data: &mut [Rgb565; 320 * 240] = data.as_mut_slice().try_into().unwrap();
// Byte view of the same memory for the SPI transfer. Note that it aliases the mutable buffer handed to FrameBuf.
let data_bytes: &[u8] = unsafe { core::slice::from_raw_parts(data.as_ptr() as *const u8, data.len() * 2) };
let mut fbuf = FrameBuf::new(data, 320, 240);

loop {
    // draw to fbuf

    // Start a memory write, then push the whole frame as one contiguous byte slice.
    dcs.write_command(WriteMemoryStart).unwrap();
    dcs.di.send_data(DataFormat::U8(data_bytes)).unwrap();
}

I probably need to be more careful about how that fbuf memory is allocated.
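One way to be more careful with the byte view, sketched under the assumption that the framebuffer is kept as raw `u16` RGB565 values rather than `Rgb565` (whose in-memory layout is not assumed here). `as_bytes` is a hypothetical helper. Tying the byte view to an immutable borrow lets the borrow checker rule out drawing into the buffer while the view is alive.

```rust
/// View a slice of raw RGB565 pixels (one u16 each) as bytes for a bulk transfer.
fn as_bytes(pixels: &[u16]) -> &[u8] {
    // SAFETY: u8 has alignment 1 and no invalid bit patterns, the length is
    // the exact byte size of `pixels`, and the returned borrow keeps `pixels`
    // immutably borrowed for as long as the byte view is alive.
    unsafe {
        core::slice::from_raw_parts(
            pixels.as_ptr().cast::<u8>(),
            core::mem::size_of_val(pixels),
        )
    }
}
```

The bytes come out in the CPU's native order. Whether the panel then needs a byte swap depends on the display configuration and is outside this sketch.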

On an m5stack core 2 (ESP32, 320x240 ili9342c) I'm able to redraw the whole screen in a little under 50ms, including clearing the entire fbuf and redrawing it. Still not great, but a little bouncy ball animation looks somewhat smooth on it.

I'm not really sure where a real fix should be made. Probably mipidsi? The embedded-graphics interfaces don't lend themselves well to this -- even fill_contiguous can't be implemented for an existing buffer -- but mipidsi could probably implement some extra functions alongside the embedded-graphics traits.

rcloran commented 9 months ago

> On an m5stack core 2 (ESP32, 320x240 ili9342c) I'm able to redraw the whole screen in a little under 50ms, including clearing the entire fbuf and redrawing it.

And by changing from fbuf.clear(...) to unsafe { std::ptr::write_bytes(fbuf.data.as_mut_ptr(), 0u8, 320 * 240); }, I save another ~12ms per frame, bringing me to just under 35ms/frame. It's surprising how slow fbuf.clear(...) is.
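For comparison, a minimal sketch of the two clearing styles on a raw `u16` framebuffer (an assumption, the thread uses `Rgb565`). `clear_zeroed` and `clear_to` are hypothetical helper names. Note that the count passed to `write_bytes` is in elements, not bytes. Whether `fill` matches `write_bytes` in speed on the ESP32 has not been measured here.

```rust
/// Zero the whole framebuffer with a raw byte fill; only works for colors
/// whose bytes are all identical (black here).
fn clear_zeroed(frame: &mut [u16]) {
    // SAFETY: the pointer and element count come from a valid mutable slice,
    // and all-zero bytes are a valid u16.
    unsafe { core::ptr::write_bytes(frame.as_mut_ptr(), 0u8, frame.len()) };
}

/// Safe alternative that works for any color value.
fn clear_to(frame: &mut [u16], color: u16) {
    frame.fill(color);
}
```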

Assuming no overhead, the fastest I could hope for on this display on a 40MHz SPI bus is around 31ms/frame: (320*240*16) bits/frame / 40M bits/s ≈ 30.7ms/frame.
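The same estimate as a small helper, in case other resolutions or clock rates need checking. `min_frame_time_ms` is a hypothetical name, and the result ignores command bytes, chip-select gaps, and any driver overhead.

```rust
/// Lower bound on the time needed to push one full frame over SPI, in milliseconds.
fn min_frame_time_ms(width: u32, height: u32, bits_per_pixel: u32, spi_hz: u32) -> f64 {
    let bits_per_frame = f64::from(width) * f64::from(height) * f64::from(bits_per_pixel);
    // One bit per SPI clock cycle, converted from seconds to milliseconds.
    bits_per_frame / f64::from(spi_hz) * 1000.0
}
```

For 320x240 at 16 bits per pixel this gives about 30.7 ms at 40 MHz and about 20.5 ms at the 60 MHz clock mentioned earlier in the thread.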

ivmarkov commented 9 months ago

You need double buffering for smooth animation anyway, or else your display will flicker. Which is to say that the fix should be completely outside of embedded-graphics and inside the driver. Basically you need a DMA-powered, as-fast-as-possible blit operation of a memory region (i.e. a frame buffer or a subset of it) onto the display.
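A minimal sketch of the double-buffering structure described above, assuming raw `u16` RGB565 pixels. `DoubleBuffer` and `render_frame` are hypothetical names, and `blit` is a plain closure standing in for the driver's DMA transfer. In a real driver the blit would start the transfer and return, so the next frame could be drawn while the previous one is still going out.

```rust
/// Two equally sized frame buffers: one is drawn into while the other is sent.
struct DoubleBuffer {
    buffers: [Vec<u16>; 2],
    back: usize, // index of the buffer currently being drawn into
}

impl DoubleBuffer {
    fn new(width: usize, height: usize) -> Self {
        let len = width * height;
        Self { buffers: [vec![0; len], vec![0; len]], back: 0 }
    }

    /// Buffer to draw the next frame into.
    fn back_mut(&mut self) -> &mut [u16] {
        &mut self.buffers[self.back]
    }

    /// Most recently completed frame.
    fn front(&self) -> &[u16] {
        &self.buffers[1 - self.back]
    }

    /// Exchange the roles of the two buffers.
    fn swap(&mut self) {
        self.back = 1 - self.back;
    }
}

/// Draw one frame into the back buffer, swap, then hand the finished frame to `blit`.
fn render_frame(
    db: &mut DoubleBuffer,
    draw: impl FnOnce(&mut [u16]),
    blit: impl FnOnce(&[u16]),
) {
    draw(db.back_mut());
    db.swap();
    blit(db.front());
}
```

At 320x240 and 16 bits per pixel each buffer takes 153,600 bytes, so two of them need roughly 300 KiB. Whether that fits in DMA-capable memory on a given board would need checking.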