lexus2k / lcdgfx

Driver for LCD displays running on Arduino/AVR/ESP32/Linux (including Raspberry) platforms
MIT License

CPU utilisation #80

Open ammaree opened 2 years ago

ammaree commented 2 years ago

We are trying to select a graphics library to use with our ESP32-based project, developed using ESP-IDF. Since the display will mainly be used to show text, we have put together a simple test that displays a set of information in 40 columns by 25 rows, updated every 1000 ms.

After some investigation and tests the options came down to LVGL and LCDGFX. Analysis showed us the following:

LVGL: Very sophisticated and powerful, but complex, with a large FW & SRAM footprint. Also, the codebase is currently "unstable" due to the changes from v7.x to v8.x. However, a key factor is speed, and LVGL is very fast, performing the test task with ~2% CPU utilisation. LVGL supports the use of larger buffers and recommends 1 (or 2) buffers of 15360 bytes for our 3" 320x240 ILI9341 display. On the ESP32 it appears to use a highly optimised DMA- and interrupt-driven SPI driver to achieve this performance.

LCDGFX: Less complex, powerful enough for our needs, small FW & SRAM footprint. Codebase and API seem stable, with no/minimal breaking changes. The only problem is speed, with our test task taking ~13% CPU utilisation, or 550% more than LVGL. I guess that the higher utilisation is probably a result of architectural decisions made to keep the SRAM footprint small, with the ESP32 SPI driver updating the screen programmatically on a byte-by-byte basis.

Q1. Can anybody suggest ways to improve the performance of LCDGFX on the ESP32?

Q2. Is it viable to change LCDGFX on some level to use larger buffers together with a DMA-driven SPI bus?

All help and advice much appreciated, since we would really like to use this library, but the current level of CPU overhead might prove too much.

Andre

lexus2k commented 2 years ago

Hi Andre,

It would be great if you could provide a simple code example doing the same for LVGL and LCDGFX to use as a starting point for solving the CPU usage problem.

> I guess that the higher utilisation is probably a result of architectural decisions made to keep the SRAM footprint small, with the ESP32 SPI driver updating the screen programmatically on a byte-by-byte basis.

The architectural decision is not the root cause in this case. Basically, lcdgfx uses the hardware SPI to send bytes to the display, and the library is at least as fast as the Adafruit libraries that do the same. For example, for an ST7789 135x240 color display, lcdgfx is capable of 30 frames per second.

> Can anybody suggest ways to improve the performance using LCDGFX on the ESP32?

According to the datasheet of the ILI9341 controller, it is capable of SPI speeds up to 10 MHz, but no more. I agree that 156 KB of buffer is too much for a small ESP32 microcontroller. If LVGL strictly requires such a large buffer, then I agree that LVGL can be fast. By default, the LCDGFX library uses the direct API, which does not require any in-memory buffers and thus will be slower anyway. But LCDGFX also supports the use of a canvas, and this can greatly improve performance.

> Is it viable to change LCDGFX on some level to use larger buffers together with a DMA-driven SPI bus?

The only requirement for SPI displays is DC pin control. Due to the nature of SPI, the DC pin must be controlled manually: it should be LOW for commands and HIGH for data. So the ESP32 core needs to manage the DC pin state itself. Other things can be changed without any issues, including the use of DMA.
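To illustrate the DC pin rule, here is a minimal, self-contained sketch. It is not lcdgfx code: `MockSpiBus`, `Transfer` and `writeWindow` are hypothetical names that only record what would go over the wire. The command bytes 0x2A, 0x2B and 0x2C are the ILI9341 column address set, page address set and memory write commands.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One SPI transfer as seen on the wire: the DC level plus the bytes clocked out.
struct Transfer {
    bool dcHigh;                      // false = command, true = data
    std::vector<std::uint8_t> bytes;
};

// Hypothetical stand-in for a real SPI driver; it only records transfers.
class MockSpiBus {
public:
    void send(bool dcHigh, const std::uint8_t *data, std::size_t len) {
        log.push_back(Transfer{dcHigh, std::vector<std::uint8_t>(data, data + len)});
    }
    std::vector<Transfer> log;
};

inline std::uint8_t hiByte(std::uint16_t v) { return static_cast<std::uint8_t>(v >> 8); }
inline std::uint8_t loByte(std::uint16_t v) { return static_cast<std::uint8_t>(v & 0xFF); }

// DC LOW: a single command byte.
void sendCommand(MockSpiBus &bus, std::uint8_t cmd) { bus.send(false, &cmd, 1); }

// DC HIGH: parameters or pixel data, sent as one block.
void sendData(MockSpiBus &bus, const std::uint8_t *data, std::size_t len) {
    bus.send(true, data, len);
}

// Select a rectangle, then stream all pixel bytes in a single data transfer.
void writeWindow(MockSpiBus &bus, std::uint16_t x0, std::uint16_t y0,
                 std::uint16_t x1, std::uint16_t y1,
                 const std::vector<std::uint8_t> &pixels) {
    const std::uint8_t cols[4] = {hiByte(x0), loByte(x0), hiByte(x1), loByte(x1)};
    const std::uint8_t rows[4] = {hiByte(y0), loByte(y0), hiByte(y1), loByte(y1)};
    sendCommand(bus, 0x2A); sendData(bus, cols, sizeof cols);   // column range
    sendCommand(bus, 0x2B); sendData(bus, rows, sizeof rows);   // row range
    sendCommand(bus, 0x2C); sendData(bus, pixels.data(), pixels.size());
}
```

In this sketch the DC level changes only between transfers, never inside one, so the last (and by far the largest) data transfer is the natural candidate for a DMA transfer on real hardware.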

Best regards, Alexey

ammaree commented 2 years ago

here is the code used...

/*
 * gui_lcdgfx.cpp
 */

#include    "gui_lcdgfx.h"
#include    "struct_union.h"
#include    "hal_variables.h"
#include    "FreeRTOS_Support.h"

#include    "lcdgfx.h"
#include    "printfx.h"

//      ESP32-WROVER-KIT              Rst busId  cs     dc  freq     scl  sda
DisplayILI9341_240x320x16_SPI display(
    ili9341GPIO_RESET,
    {   0,
        ili9341GPIO_CS,
        ili9341GPIO_D_C_X,
        26000000,
        ili9341GPIO_SCLK,
        ili9341GPIO_MOSI,
    }
);

#define guiBUF_SIZE 3072
char Buffer[guiBUF_SIZE] = { 0 } ;

void    vGuiTasks(void) {
    const flagmask_t sFM = { .u32Val =  makeMASKVALUE(0, 0, 1, 1, 1, 1, 1, 0, 1, 0x0FFFCF) };
    int iRV = xRtosReportTasks(sFM, Buffer, guiBUF_SIZE);
    iRV += snprintfx(Buffer+iRV, guiBUF_SIZE - iRV, "%!R  %Z\n", RunTime, &sTSZ);
    iRV += snprintfx(Buffer+iRV, guiBUF_SIZE - iRV, "Evt=0x%08X\n", xEventGroupGetBits(xEventStatus));
    iRV += snprintfx(Buffer+iRV, guiBUF_SIZE - iRV, "Run=0x%08X\n", xEventGroupGetBits(TaskRunState));
    iRV += snprintfx(Buffer+iRV, guiBUF_SIZE - iRV, "Del=0x%08X\n", xEventGroupGetBits(TaskDeleteState));
    iRV += snprintfx(Buffer+iRV, guiBUF_SIZE - iRV, "Sys=0x%08X\n", SystemFlag);
    display.printFixed(0, 0, (const char *) Buffer, STYLE_NORMAL) ;
}

void    vTaskLCDGFX(void * pVoid) {
    display.begin() ;
    display.getInterface().setRotation(3) ;
    display.setFixedFont(ssd1306xled_font6x8) ;
    display.fill( 0x0000 ) ;
    while(true) {
        vGuiTasks() ;
        vTaskDelay(pdMS_TO_TICKS(1000)) ;
    }
}

lexus2k commented 2 years ago

Is it a complete example? Where are ili9341GPIO_CS, ili9341GPIO_MOSI, TaskRunState, TaskDeleteState, xRtosReportTasks etc. defined? What's inside the gui_lcdgfx.h header file? How can I compare to LVGL? Do you have the same example for the LVGL library?

If you can provide a clear, simple example to compare both libraries, it would be nice.

P.S. Try to use a canvas in your application. That will increase the performance greatly (see the canvas example). I checked LVGL, and yes, they always require a buffer (according to https://docs.lvgl.io/master/porting/display.html).

ammaree commented 2 years ago

Apologies, the CS & MOSI definitions are in a header file.

The task values being displayed come from xRtosReportTasks(), and xEventStatus, TaskRunState and TaskDeleteState are simply variables whose values are being displayed.

Essentially the content of Buffer[] can simply be any text content. The time taken to fill/update Buffer[] and the time taken to display it have been measured separately.

[Screenshot 2022-02-01 at 13 44 18]

GUI0 represents the timing for the xRtosReportTasks() & snprintfx() buffer fill, 4 ms avg.

GUI1 represents the timing for display.printFixed(), 356 ms avg.

[Screenshot 2022-02-01 at 13 42 15]

The image above represents the output from xRtosReportTasks() for the same period as the timing stats above. The lcdgfx entry shows the time spent on filling and displaying the buffer, with filling being 1.11% and display being 98.9% of the combined 12.98% of CPU time.
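The split quoted above follows from the two averages (4 ms fill, 356 ms display). A small sketch of the arithmetic, with hypothetical helper names:

```cpp
// Shares of a combined duration, in percent.
struct Split {
    double fillPercent;     // share spent filling the buffer
    double displayPercent;  // share spent writing the buffer to the display
};

// Split the combined fill + display time into percentage shares.
Split splitShares(double fillMs, double displayMs) {
    const double total = fillMs + displayMs;
    return Split{fillMs / total * 100.0, displayMs / total * 100.0};
}

// Portion of a task's total CPU utilisation attributable to one share.
double cpuPortion(double taskCpuPercent, double sharePercent) {
    return taskCpuPercent * sharePercent / 100.0;
}
```

With `splitShares(4.0, 356.0)` the shares come out at roughly 1.11% and 98.89%, and `cpuPortion(12.98, 98.89)` attributes about 12.8 of the 12.98 percentage points to the display call.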

[Image: IMG_7512]

To simplify your test, you can use a buffer prefilled with any static text; the actual output is around 100 characters, 40(w)x25(h). We use the exact same buffer content for both LCDGFX and LVGL; only the init and display code is changed.

ammaree commented 2 years ago

And the pin definitions....

#define ili9341GPIO_MOSI GPIO_NUM_23
#define ili9341GPIO_MISO GPIO_NUM_25
#define ili9341GPIO_SCLK GPIO_NUM_19
#define ili9341GPIO_RESET GPIO_NUM_18
#define ili9341GPIO_D_C_X GPIO_NUM_21
#define ili9341GPIO_LIGHT GPIO_NUM_5
#define ili9341GPIO_CS GPIO_NUM_22

ammaree commented 2 years ago

Regarding your suggestion to use your canvas library, I have a couple of questions....

Q1. If I am correct, for the purposes of using the complete 320x240 screen to display text, I would declare a single sprite for the full screen?

Q2. If so, using RGB565 based on 320x240x2 = 153k. Is this correct?

Q3. Since I do not need colour for text only, is there a way to use a monochrome palette on the ESP32-WROVER-KIT display, and if so, would that reduce the memory requirement to 320x240/8 = 9600 bytes?

lexus2k commented 2 years ago

Hi

> Q1. If I am correct, for the purposes of using the complete 320x240 screen to display text, I would declare a single sprite for the full screen?

Yes, exactly. You have to declare a single sprite:

typedef NanoCanvas<320,240,1> MyCanvas;

// For allocation on the stack:
MyCanvas canvas;

// For allocation in the heap
MyCanvas *canvas = new MyCanvas();

Then just use display.drawBitmap1. This will save RAM. Don't forget that you can change the color of monochrome sprites with display.setColor.

> Q2. If so, using RGB565 based on 320x240x2 = 153k. Is this correct?

lcdgfx supports 1-bit, 8-bit and 16-bit sprites. If you want a monochrome image, the required size is 320x240x(1/8) = 9.6K; for 16-bit sprites the buffer has to be 153K.
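The sizes above can be checked with a compile-time calculation. This is only a sketch of the arithmetic; `canvasBytes` is a hypothetical helper, not part of lcdgfx, and the library's own rounding for odd dimensions may differ.

```cpp
#include <cstddef>

// Bytes needed for a full canvas, rounding the total bit count up to whole bytes.
constexpr std::size_t canvasBytes(std::size_t width, std::size_t height,
                                  std::size_t bitsPerPixel) {
    return (width * height * bitsPerPixel + 7) / 8;
}

static_assert(canvasBytes(320, 240, 1)  ==   9600, "1-bit monochrome canvas");
static_assert(canvasBytes(320, 240, 8)  ==  76800, "8-bit colour canvas");
static_assert(canvasBytes(320, 240, 16) == 153600, "16-bit RGB565 canvas");
```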

> Q3. Since I do not need colour for text only, is there a way to use a monochrome palette on the ESP32-WROVER-KIT display, and if so, would that reduce the memory requirement to 320x240/8 = 9600 bytes?

The answers are above: monochrome sprite + setColor

ammaree commented 2 years ago

Thanks for the feedback, will give the sprites a try.

Just need to figure out the best/fastest way to get buffer content (text) into the sprite (bitmap), since in the examples I saw it takes a binary bitmap as input...

Regarding the high CPU utilisation query, how can I help moving that one forward?

lexus2k commented 2 years ago

> Just need to figure out the best/fastest way to get buffer content (text) into the sprite (bitmap), since in the examples I saw it takes a binary bitmap as input...

All the display demos include a canvas example:

    NanoCanvas<64,16,1> canvas;
    display.setColor(RGB_COLOR8(0,255,0));
    canvas.setFixedFont(ssd1306xled_font6x8);
    display.clear();
    canvas.clear();
    canvas.fillRect(10, 3, 80, 5);
    display.drawCanvas( (display.width()-64)/2, 1, canvas);
    lcd_delay(500);
    canvas.fillRect(50, 1, 60, 15);
    display.drawCanvas( (display.width()-64)/2, 1, canvas);
    lcd_delay(1500);
    canvas.printFixed(20, 1, " DEMO ", STYLE_BOLD );
    display.drawCanvas( (display.width()-64)/2, 1, canvas);
    lcd_delay(3000);

As I pointed out above, please use the canvas.

> Regarding the high CPU utilisation query, how can I help moving that one forward?

Just check CPU utilization with Canvas first.

ammaree commented 2 years ago

Apologies for the delay. I have done an exact comparison, with all parameters and timing exactly the same.

Display example:

[Screenshot 2022-02-14 at 23 38 19]

Canvas example:

[Screenshot 2022-02-14 at 23 31 57]

Based on the above, the canvas example takes ~40% of the display example time to refresh, hence a lot better. In the short term, whilst we are using the lcd purely for text display, this performance is sufficient.

Our challenge will however return when we start developing the actual device GUI, in which case the 153k SRAM canvas memory requirements will become a showstopper.

Any plans or suggestions for how we handle this when we get to it?

lexus2k commented 2 years ago

> Any plans or suggestions for how we handle this when we get to it?

NanoCanvas<240,320,1> canvas;

This should consume only 240x320x1/8 = 9600 bytes of SRAM. Is this still critical for you? The disadvantage of this method is that you will have only a single color for all pixels. So, you can print all text to the 1-bit canvas and then show it on the screen.
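As a rough illustration of the 1-bit canvas plus single colour idea, here is a self-contained sketch. It is not lcdgfx's implementation: `MonoCanvas` and `expandRow` are hypothetical, and the row-major, MSB-first bit layout is an assumption made for this example (the library's internal layout may differ).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal 1-bit canvas. Assumed layout: row-major, 8 horizontal pixels per byte, MSB first.
class MonoCanvas {
public:
    MonoCanvas(std::size_t w, std::size_t h)
        : width_(w), height_(h), stride_((w + 7) / 8), bits_(stride_ * h, 0) {}

    void setPixel(std::size_t x, std::size_t y, bool on) {
        if (x >= width_ || y >= height_) return;   // ignore out-of-range writes
        const std::uint8_t mask = static_cast<std::uint8_t>(0x80u >> (x & 7));
        std::uint8_t &b = bits_[y * stride_ + x / 8];
        b = on ? static_cast<std::uint8_t>(b | mask)
               : static_cast<std::uint8_t>(b & ~mask);
    }

    bool pixel(std::size_t x, std::size_t y) const {
        return (bits_[y * stride_ + x / 8] & (0x80u >> (x & 7))) != 0;
    }

    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }
    std::size_t sizeBytes() const { return bits_.size(); }

private:
    std::size_t width_, height_, stride_;
    std::vector<std::uint8_t> bits_;
};

// Expand one canvas row to RGB565: set bits become `fg`, clear bits become `bg`.
std::vector<std::uint16_t> expandRow(const MonoCanvas &c, std::size_t y,
                                     std::uint16_t fg, std::uint16_t bg) {
    std::vector<std::uint16_t> row(c.width());
    for (std::size_t x = 0; x < c.width(); ++x) {
        row[x] = c.pixel(x, y) ? fg : bg;
    }
    return row;
}
```

Expanding one row at a time keeps the extra scratch memory at width x 2 bytes (640 bytes for a 320-pixel row) on top of the 9600-byte canvas, which is the trade-off described above: very little RAM, but a single foreground colour.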

By the way, the ST7789 controller is much faster on SPI compared to the ILI9341 (10 MHz). The ST7789 can work at 40 MHz SPI, so you can significantly improve the speed just by changing to a different display controller. Of course, that depends on the capabilities of your MCU.

PS. If you need support for a 240x320 display based on the ST7789, let me know.
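To put the clock rates in perspective, here is a back-of-the-envelope sketch of the minimum time needed to push one full 320x240 frame over SPI. It assumes 16 bits per pixel on the wire and one bit per clock, and it ignores command overhead and gaps between transfers, so real figures will be higher. `spiTransferMs` and `kFrameRgb565` are hypothetical names.

```cpp
#include <cstdint>

// Lower bound, in milliseconds, for clocking `bytes` out over SPI at `clockHz`.
constexpr double spiTransferMs(std::uint64_t bytes, double clockHz) {
    return static_cast<double>(bytes) * 8.0 / clockHz * 1000.0;
}

// One full 320x240 frame at 2 bytes per pixel (RGB565).
constexpr std::uint64_t kFrameRgb565 = 320ull * 240ull * 2ull;  // 153600 bytes
```

Under these assumptions a full frame takes about 122.9 ms at 10 MHz, about 47.3 ms at the 26 MHz used in the code earlier in this thread, and about 30.7 ms at 40 MHz.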