rust-embedded-community / ssd1306

SSD1306 OLED driver
Apache License 2.0
283 stars 69 forks source link

Different GraphicsMode optimization options #124

Closed bugadani closed 4 years ago

bugadani commented 4 years ago

Hi! Thank you for helping out with SSD1306 development! Please:

PR description

In my use case, I'm using the display on an 8MHz SPI connection, which means a flush is around 1ms. This allows me to switch the partial update to a full update, and basically eliminate tracking of the dirty area.

I still need to benchmark this to see if there is any point in adding the extra complexity, but I wanted to open this draft to hear some of your thoughts on the matter.

By default, the implementation uses the old dirty area tracking method with the partial screen update, so it's hopefully backwards compatible.

bugadani commented 4 years ago

Oh, also this is based on #123, hence the duplicate commits.

bugadani commented 4 years ago

Just in case, please don't waste time on reviewing the code yet (since this was thrown together extremely quickly and has a ton of potential issues, including the crash of #123), only the general idea :)

bugadani commented 4 years ago

A speed bump of around 8% (117813 cycles instead of 128348 while drawing a piece of text) when using FastSetPixel. Tested on a nRF52832 @ 64MHz

bugadani commented 4 years ago

Since it's possible to draw to an otherwise unexpected location, I'm thinking about making it possible in FastSetPixel (or a different UncheckedFastSetPixel) to write to an unchecked address. It's already up to the user to draw to the right area, so I think an explicitly unsafe option is not that big deal to add.

I'm also considering moving display size, and possibly rotation to the type level, since it looks like the rust compiler isn't smart enough to optimize them and set_pixel is full of honestly unnecessary jumps. However, this will be a bigger change and I would like to know if you are comfortable with accepting changes like that.

bugadani commented 4 years ago

Closes #122

bugadani commented 4 years ago

I'm done modifying this. There is a single compare that could be optimized away iff it can be ensured that every write is inside the display area, but I don't think it would be worth the extra complexity and the risk.

jamwaffles commented 4 years ago

I like the idea of moving the dirty tracker to a separate struct for cleanliness, as well as the fix for #122, but I'm not sure I see a benefit to making it customisable with another associated type, particularly if we add this one on top of those added in #125. Is 1ms update really too slow for your application?

bugadani commented 4 years ago

My benchmark is my original C code that was able to display a realtime sampled signal at 100fps. While I don't expect the same performance, I'm not even close to that with this patch either :) After #125, most of the set_pixel code is the 4 min/max calls, so yeah, I would like to see those go away.

therealprof commented 4 years ago

100fps? I take it that's not over 400kHz I2C... It seems dubious at least that the min/max would significantly slow down the drawing.

bugadani commented 4 years ago

I'm using 8MHz SPI. Transfering the data takes 1ms, drawing is around 3-4 and there is a lot of processing on top of that - filtering, some calculations to derive data, etc. There must be room for saving to flash memory and bluetooth, so yeah, any bit I can squeeze out of anything helps. That includes drawing as well.

bugadani commented 4 years ago

I don't know why I didn't think of this, but I could just use a tailor-made graphics mode implementation for my use case... Still, it would be nice if the widely available library would be as fast as possible :)

bugadani commented 4 years ago

I'm closing this, because if someone wants their GraphicsMode to be faster, they can reimplement it easy enough and that basically makes the added complexity of this PR not worth breaking everybody's code.