adafruit / Adafruit_ILI9341

Library for Adafruit ILI9341 displays
406 stars 281 forks source link

SPITFT performance on Teensy 3 #43

Closed pljakobs closed 5 years ago

pljakobs commented 5 years ago

I am working on a number of extensions to the Adafruit-GFX library (true color, indexed color canvasses etc) and I have finally updated my development system to the new style of display driver libraries utilizing the SPITFT class instead of re-implementing the basic routines for every single display class. All my testing is done on a Teensy 3.2 using the platformio / MSC environment (so not teensyduino directly)

Now with that new class, I see a considerable drop in performance. Drawing a 40x50 canvas went from ~17ms to ~27ms - now, that encompasses more than just drawing on screen, but the major difference is the update of the display driver to the SPITFT model.

It would seem that the optimizations from the original ILI9341_t3 didn't make it to the SPITFT implementation.

pljakobs commented 5 years ago

doing some A-B Testing with 1.1.0 vs. master now. Seems I'm getting consistently worse performance than with my original setup (which I am not 100% sure what version it was pulled from) but, after a clean re-build, slightly better than I initially reported. (24ms instead of 27ms but the benchmark was 17ms)

pljakobs commented 5 years ago

did some testing with the "graphics test" sketch:

test master 1.1.0 master @40000000 1.1.0 @40000000 master @80000000
Screen fill 887404 1143636 716490 716498 716495
Text 57760 70122 49396 49665 49398 49398
Lines 527009 643539 448308 451909 448324
Horiz/Vert Lines 73164 94412 59211 59444 59215
Rectangles (outline) 46984 60447 38076 38245 38075
Rectangles (filled) 1841950 2373788 1487237 1487275 1487229
Circles (filled) 229409 291924 188784 192553 188783
Circles (outline) 228460 280272 193913 195911 193914
Triangles (outline) 118331 145031 100306 101292 100303
Triangles (filled) 613578 788012 497905 500627 497894
Rounded rects (outline) 101894 126938 85226 86066 85222
Rounded rects (filled) 1838097 2367373 1485283 1486339 1485274

so counter my initial testing, the new version is, if anything, marginally quicker than the old version, however. For reasons I can't quite explain though, when calling display.begin() without a paramter, it performs significantly worse than when calling display.begin(40000000) despite the fact that the default should be 80000000 `#elif defined (AVR) || defined(TEENSYDUINO)

define SPI_DEFAULT_FREQ 8000000`

I assume that the "TEENSYDUINO" define isn't set when using the platformio board definition. However, as the table shows, going from 40MHz to 80MHz doesn't make a big change.

So I'll do some more digging why I'm seeing an impact in my original test sketch.

ladyada commented 5 years ago

thanks for checking, the #defines can be a little sneaky - if you can figure out what we got wrong we'll be happy to fix. there was a lot of massive refactoring. the good news is ... in theory we can add DMA for teensy in here, we have it for the SAMD series!

pljakobs commented 5 years ago

@ladyada I'll see about the #define, I think the compiler provides the cpu type / architecture. And DMA would be spectacular, as what I'm doing is always to copy rectangles from memory. Albeit, I'm working with bitmapped canvasses, so I would have to create an intermediate buffer which is a memory issue once again.

ladyada commented 5 years ago

even for filling rect's it can be quite a time saver pinging @PaulStoffregen - merging the teensy speedups back into this library would then automatically implement DMA speeds into all the displays (ST773x, ILI9341, HX8357 and whatever the future may hold)

PaulStoffregen commented 5 years ago

I will look at this again soon.

Last time I looked into this, the GFX API was lacking semantics necessary to leverage the FIFO control over the D/C pin. I'm pretty sure Adafruit isn't going to be interested in making their API more detailed about the meaning of each SPI transfer, since that would requiring minor updates to all libraries using this base class.

ladyada commented 5 years ago

its pretty much fully-abstracted in SPITFT (which is, ironicalyl, not just SPI and not just TFT, should be morelike "MIPIDisplay") - i dont think any base classes would need updating?

PaulStoffregen commented 5 years ago

Ok, very quick look at the code (sorry, other stuff to do today, so only very brief...)

The abstraction is sadly missing key info needed to leverage a FIFO-based SPI port. Two things are missing.

1: The last write into the FIFO, which will later cause CS to be de-asserted, needs to specified at the time of the write. 2: Each write needs to specify whether it's writing a command or data.

For example, the SPI_WRITE16() abstraction would need to at a minimum become SPI_WRITE_DATA16_CONTINUE() and SPI_WRITE_DATA16_FINAL().

When not leveraging a FIFO, these distinctions don't really matter. The final write is done the same as all the others. But with a FIFO, the last write into the FIFO must be done differently than all the others. The problem is the abstractions aren't detailed enough to make this distinction.

I understand Adafruit has invested a lot of engineering time into this code. I don't wish to speak too harshly here. But to properly leverage a FIFO-based SPI port where the FIFO entries control the signals (which is how ILI9341_t3 achieves its amazing speed), a more descriptive abstraction layer is needed.

ladyada commented 5 years ago

i think its worth making any changes to SPITFT - if you're up for it - to re-unify the graphics libraries! as long as customers don't see a change (which i dont think they will)

pljakobs commented 5 years ago

@PaulStoffregen I second what @ladyada said - especially since many of the larger projects move away from pure Arduino to PlatformIO. Not sure how PlatformIO pulls their cores, but since it can build multi-platform, it seems only logical to have the libraries multi-platform as well.

PaintYourDragon commented 5 years ago

it performs significantly worse than when calling display.begin(40000000) despite the fact that the default should be 80000000

elif defined (AVR) || defined(TEENSYDUINO) #define SPI_DEFAULT_FREQ 8000000

The default is 8 MHz, not 80 MHz.

pljakobs commented 5 years ago

upps, counting zeros doesn't seem to be my forté that also explains why 80MHz doesn't make a whole lot of sense.

ladyada commented 5 years ago

hihi haven't forgotten about this - i think this is best served if you join here https://github.com/adafruit/Adafruit-GFX-Library/issues/205 where we're discussing adding optimizations to the underlying hardware interface - that stuff isn't in this library anymore. so, closin' this one.