almindor / mipidsi

MIPI Display Serial Interface unified driver
MIT License

Make it faster #82

Open svelterust opened 1 year ago

svelterust commented 1 year ago

Currently it is slow.

almindor commented 1 year ago

Could you provide more information please? Slow in which operation exactly and how would you propose to improve it?

svelterust commented 1 year ago

It seems like the draw_continious trait implementations delegate to drawing pixel by pixel. Is it possible to use hardware-specific drawing methods instead? There's also quite a bit of tearing currently; is there any way to implement double buffering? In my case I use the ILI9341, based on this code: https://github.com/almindor/mipidsi/blob/master/mipidsi/examples/spi-ili9486-esp32-c3/src/main.rs.

This library seems to provide super fast drawing for the same screen: https://github.com/vindar/ILI9341_T4
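To put rough numbers on the pixel-by-pixel concern, here is a back-of-the-envelope sketch in Rust. It assumes RGB565 pixels and the standard MIPI DCS addressing commands (0x2A, 0x2B, 0x2C), and it assumes the worst case where every pixel re-sends its own 1x1 address window; I have not verified that the crate actually does this. The function names are made up for illustration, and D/C and chip-select toggling are ignored.

```rust
/// Bytes per pixel for RGB565 (assumed pixel format).
const BYTES_PER_PIXEL: usize = 2;

/// MIPI DCS overhead to address a window and start a memory write:
/// set_column_address (0x2A) + 4 parameter bytes,
/// set_page_address (0x2B) + 4 parameter bytes,
/// write_memory_start (0x2C).
const WINDOW_SETUP_BYTES: usize = (1 + 4) + (1 + 4) + 1;

/// Bus bytes if every pixel gets its own 1x1 address window (assumed worst case).
pub fn bytes_per_pixel_writes(pixels: usize) -> usize {
    pixels * (WINDOW_SETUP_BYTES + BYTES_PER_PIXEL)
}

/// Bus bytes if the whole rectangle is addressed once and the pixels are streamed.
pub fn bytes_windowed_write(pixels: usize) -> usize {
    WINDOW_SETUP_BYTES + pixels * BYTES_PER_PIXEL
}
```

For a full 240x320 frame, `bytes_per_pixel_writes(240 * 320)` is 998,400 bytes, while `bytes_windowed_write(240 * 320)` is 153,611 bytes, so under these assumptions a single windowed write moves roughly 6.5 times less data over the bus.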

svelterust commented 1 year ago

I can definitely help out writing the code, interested to see what we can achieve with this library.

svelterust commented 1 year ago

I can implement the following:

Not sure exactly how these work:

svelterust commented 1 year ago

Hardware-specific methods, such as drawing rectangles using hardware commands, are also something I could look into.

almindor commented 1 year ago

Feel free of course! Any help is most welcome.

From what I can tell, these things are doable without major rewrites or changes to other crates:

  1. double buffering
  2. differential redraws
  3. vsync/tearing

However, async DMA would require adding support for the transport channel in display-interface and is, I guess, very MCU-specific.

Also, things like differential redraws need a large enough amount of RAM. All such changes need to be feature-gated (they could be enabled by default) because small-RAM MCUs won't support them (e.g. my 16 KB RAM one :D)
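For a sense of scale on the RAM point, a small Rust sketch follows. It assumes a 240x320 panel (the ILI9341's resolution) with RGB565 pixels, and the helper names are hypothetical rather than anything in the crate.

```rust
/// Size in bytes of one full framebuffer.
pub const fn framebuffer_bytes(width: usize, height: usize, bytes_per_pixel: usize) -> usize {
    width * height * bytes_per_pixel
}

/// Whether `buffers` full framebuffers fit in `ram_bytes` of RAM.
/// This ignores stack, heap and everything else the firmware needs,
/// so a `true` result is only a necessary condition, not a sufficient one.
pub const fn buffers_fit(
    buffers: usize,
    width: usize,
    height: usize,
    bytes_per_pixel: usize,
    ram_bytes: usize,
) -> bool {
    buffers * framebuffer_bytes(width, height, bytes_per_pixel) <= ram_bytes
}
```

One RGB565 framebuffer at 240x320 is 153,600 bytes (150 KiB), so even a single buffer for differential redraws is far beyond a 16 KB part, and double buffering would need twice that.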

fu5ha commented 1 year ago

VSync and screen tearing prevention by positioning scanline

I'm not familiar with all supported driver ICs but they're probably pretty similar to the ST7789. There are generally two methods. Both use a separate input pin on the main MCU, driven by an output of the display driver IC. That pin is usually called "TE" ("Tearing Effect") or sometimes VSYNC. It can be configured in one of two modes.

The first is traditional "vsync" mode. In this mode, the TE line is low while the display driver IC is reading its internal framebuffer and writing those values to the display. The TE line is brought high for the time the driver IC is idle between frames and not reading its own memory. Thus our main controller MCU must write the full framebuffer into the display driver IC while the TE line is high to guarantee no tearing. This is usually a small fraction of a frame time.
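A minimal Rust sketch of the waiting part of this mode, assuming the TE level can be polled through a small trait. `TeLine` and `wait_for_vblank_start` are hypothetical names, not part of mipidsi or embedded-hal, and real code would more likely use a GPIO interrupt than a busy loop.

```rust
/// Hypothetical abstraction over the MCU input pin wired to TE.
pub trait TeLine {
    /// Returns true while the TE line is high (driver IC idle between frames).
    fn is_high(&mut self) -> bool;
}

/// Busy-waits until TE has just gone from low to high.
/// If TE is already high we first wait for it to drop, so the caller
/// gets a whole blanking interval rather than the tail end of one.
pub fn wait_for_vblank_start<T: TeLine>(te: &mut T) {
    // Skip the remainder of a blanking interval that is already in progress.
    while te.is_high() {}
    // Wait while the driver IC is scanning out its framebuffer (TE low).
    while !te.is_high() {}
}
```

The caller would start the full-frame write immediately after this returns; whether the write fits inside the blanking interval depends on bus speed and frame size, which this sketch does not check.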

The second is, I believe, what the library you linked is describing. In this mode, the TE line is brought high for the same vertical-blank period, but it is also pulsed high each time the driver IC finishes reading a scanline (horizontal row of pixels) from its memory and delivering it to the display. In the case of the ST7789 there will always be 320 hblank pulses + 1 vblank pulse per frame. If we keep track of which pulse we're at, we know which memory locations in the driver IC our MCU can safely write at any given time. We can therefore extend the time we have to write a full frame by "trailing" our memory writes behind where the IC is reading.
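The pulse bookkeeping could look roughly like the Rust sketch below. It assumes the caller can already tell hblank pulses from the vblank pulse and reports them through two methods; `ScanlineTracker` and its methods are hypothetical names. Resetting the count at the vblank pulse is a conservative simplification: nothing is reported as writable until the next scan-out has passed it.

```rust
use core::ops::Range;

/// Tracks how many scanlines the driver IC has read in the current frame.
pub struct ScanlineTracker {
    rows: u16,
    scanned: u16,
}

impl ScanlineTracker {
    pub fn new(rows: u16) -> Self {
        Self { rows, scanned: 0 }
    }

    /// Call on the vblank pulse: a new scan-out pass is about to begin.
    pub fn on_vblank(&mut self) {
        self.scanned = 0;
    }

    /// Call on each hblank pulse: one more row has been read by the IC.
    pub fn on_hblank(&mut self) {
        if self.scanned < self.rows {
            self.scanned += 1;
        }
    }

    /// Rows the MCU may write now, given the next row it wants to write.
    /// Only rows the IC has already read in this pass are returned,
    /// so the writer always trails the reader.
    pub fn writable_rows(&self, next_row: u16) -> Range<u16> {
        next_row.min(self.scanned)..self.scanned
    }
}
```

After ten hblank pulses, `writable_rows(0)` is `0..10`; if the writer has already sent rows 0 to 3, `writable_rows(4)` is `4..10`.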

See the "Tearing Effect" section of the ST7789VW datasheet for more (starting at page 132): https://www.waveshare.com/w/upload/a/ad/ST7789VW.pdf

fu5ha commented 1 year ago

I'm also interested in helping to contribute some or all of these, fwiw :)

Doing it in a generic way sounds like it might be pretty tough though :P

fu5ha commented 1 year ago

Well, looking closer at the driver you linked and the ILI9341, the actual synchronization works differently there, but the idea of what's being accomplished is still as described.

fu5ha commented 1 year ago

Another note: IMO buffering and differential redraws are best left to another crate like embedded-graphics. This crate should just provide the ability to efficiently upload subregions of the display in a way that something like embedded-graphics can then leverage... This is already the idea of the embedded-graphics-core DrawTarget trait which this crate implements, I believe.

... that being said, trying to implement those externally and also take advantage of the trailing writes to the driver IC as described above sounds like it might not be possible, so maybe we do need to implement it in the driver.
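To make "upload subregions" concrete, here is a minimal Rust sketch of the kind of differential-redraw step a layer above the driver might run. It assumes two full RGB565 framebuffers in RAM stored row by row; `DirtyRect` and `dirty_rect` are hypothetical names and not part of embedded-graphics or this crate.

```rust
/// Inclusive bounding box of changed pixels, in pixel coordinates.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct DirtyRect {
    pub x0: usize,
    pub y0: usize,
    pub x1: usize,
    pub y1: usize,
}

/// Compares two row-major framebuffers of equal size and returns the smallest
/// rectangle containing every pixel that differs, or None if they are identical.
pub fn dirty_rect(prev: &[u16], next: &[u16], width: usize) -> Option<DirtyRect> {
    assert_eq!(prev.len(), next.len());
    assert!(width > 0 && prev.len() % width == 0);

    let mut rect: Option<DirtyRect> = None;
    for (i, (a, b)) in prev.iter().zip(next).enumerate() {
        if a != b {
            let (x, y) = (i % width, i / width);
            rect = Some(match rect {
                None => DirtyRect { x0: x, y0: y, x1: x, y1: y },
                Some(r) => DirtyRect {
                    x0: r.x0.min(x),
                    y0: r.y0.min(y),
                    x1: r.x1.max(x),
                    y1: r.y1.max(y),
                },
            });
        }
    }
    rect
}
```

The resulting rectangle is what would be handed to the driver as a single subregion upload. A single bounding box is the simplest choice; tracking several smaller rectangles would send less data when changes are scattered across the screen.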

almindor commented 1 year ago

> Another note: IMO buffering and differential redraws are best left to another crate like embedded-graphics. This crate should just provide the ability to efficiently upload subregions of the display in a way that something like embedded-graphics can then leverage... This is already the idea of the embedded-graphics-core DrawTarget trait which this crate implements, I believe.
>
> ... that being said, trying to implement those externally and also take advantage of the trailing writes to the driver IC as described above sounds like it might not be possible, so maybe we do need to implement it in the driver.

That's a good point and I agree since those are more "on top of driver" operations.

almindor commented 1 year ago

I also wanted to ask: could you please join the #rust-embedded-graphics:matrix.org channel on Matrix? There are folks from embedded-graphics as well as display-interface there. I feel like it'd be good to discuss some of this stuff in a more general way.

almindor commented 1 month ago

Related issue: #142, although this one has a different optimization in mind, so I'm keeping both.