Port to LVGL v7.0 - GitHub Issues

C47D commented 4 years ago

List of necessary changes:

- [x] Update lvgl submodule.
- [x] Update lvgl example submodule.
- [x] Update lvgl configuration file.

kisvegabor commented 4 years ago

> Update lvgl submodule.

It's in the dev-7.0 branch.

> Update lvgl example submodule.

It's in the rework-7 branch.

> Update lvgl configuration file.

It'd be best to create a new one based on lv_conf_templ.h in dev-7.0.

Could you make a build test to see if there are any serious issues?

C47D commented 4 years ago

I will do the tests here: https://github.com/C47D/lv_port_esp32_v7. Once we get it working, I will update this repo.

C47D commented 4 years ago

First teaser; still some work to do, I think.

[image: lvgl_v7]

kisvegabor commented 4 years ago

What is the size and resolution of the display and LV_DPI?

C47D commented 4 years ago

The screen resolution is 320*240 and LV_DPI is 100. It now has the proper configuration (display size and orientation).

[image: lvgl_v7_2]

kisvegabor commented 4 years ago

The demo recognizes whether it's a small, medium, large or extra large display and sets the layouts accordingly. This display should have been recognized as small. What is the size of the display (in inches)? We can calculate the real DPI from it.

Besides, please modify LV_DISP_SMALL_LIMIT to 25 in lv_conf.h.

C47D commented 4 years ago

The LCD size is 3.2" (inches). OK, I will set LV_DISP_SMALL_LIMIT to 25 and upload a picture.

kisvegabor commented 4 years ago

So the actual DPI is `sqrt(320^2 + 240^2) / 3.2 = 125`, and the width of the display is `320 / 125 = 2.56"`.

If you set LV_DISP_SMALL_LIMIT to 25, lvgl will consider a display wider than 2.5" as medium-sized and create this ugly layout. So please set it to 30 (3.0") instead of 25.
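The arithmetic above can be checked with a short sketch. The helper names (`isqrt_u32`, `real_dpi`, `width_tenths_inch`, `is_small_display`) are made up for illustration and are not LVGL API. The sketch also assumes that the limit is expressed in tenths of an inch and compared with `<`, which is consistent with the 25 vs. 30 behaviour described here but is not confirmed by the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Floor of the square root, done with integers so no libm is needed. */
static uint32_t isqrt_u32(uint32_t v)
{
    uint32_t r = 0;
    while ((uint64_t)(r + 1) * (r + 1) <= v) r++;
    return r;
}

/* Real DPI from the resolution and the diagonal given in tenths of an inch. */
static uint32_t real_dpi(uint32_t hor_res, uint32_t ver_res, uint32_t diag_tenths_inch)
{
    assert(diag_tenths_inch > 0);
    uint32_t diag_px = isqrt_u32(hor_res * hor_res + ver_res * ver_res);
    return diag_px * 10 / diag_tenths_inch;
}

/* Physical width in tenths of an inch (integer division truncates). */
static uint32_t width_tenths_inch(uint32_t hor_res, uint32_t dpi)
{
    assert(dpi > 0);
    return hor_res * 10 / dpi;
}

/* Assumed rule: a display counts as "small" when its width is below the limit. */
static int is_small_display(uint32_t width_tenths, uint32_t small_limit)
{
    return width_tenths < small_limit;
}
```

For the 320x240, 3.2" panel, `real_dpi(320, 240, 32)` gives 125 and `width_tenths_inch(320, 125)` gives 25 (2.56" truncated). Under the assumed rule, a limit of 25 does not classify the panel as small, while a limit of 30 does.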

C47D commented 4 years ago

Much better

[image: lvgl_v7_3_landscape]

kisvegabor commented 4 years ago

Awesome, thank you! I'll recalculate the default size limits.

You can enable LV_USE_PERF_MONITOR in lv_conf.h to see the current FPS and CPU usage.

C47D commented 4 years ago

I'm getting 33 FPS, with CPU usage from 18% to 28% and peaks of 34%.

embeddedt commented 4 years ago

@C47D I'm curious; what FPS do you get at the moment when you scroll past the gauges in the Values tab? I'd like to compare with what I see on STM32F7 (I get 14-18 FPS while scrolling; it increases after that).

C47D commented 4 years ago

Hi @embeddedt, this particular display doesn't have a touch controller. I will try to set up another display to measure the FPS values you've requested.

kisvegabor commented 4 years ago

@C47D I've modified the demo in the test/no-tp branch. It switches between the tabs automatically and shows the gauges on the second tab.

C47D commented 4 years ago

Hi, I took a video of the demo using the test/no-tp branch.

https://youtu.be/BJv-vr03RsM

kisvegabor commented 4 years ago

Thank you!

There is still a huge FPS drop when the gauge is fully redrawn. At least refreshing only the needle is not that bad.

I have a few questions:

- What is the size of the display buffer?
- Do you use 2 buffers with DMA in flush_cb?
- Have you enabled -Os or -O3 optimization?
- What is the speed of the SPI?

C47D commented 4 years ago

Hi, I'm not at home right now. I will update my reply with the data you requested.

I had an issue with the demo: it stopped working after some minutes and the screen got stuck. I have tried to replicate the issue, but it stopped at a different time on each run. I will try to add the log functions to lvgl so I can find out where it stops.

embeddedt commented 4 years ago

> the demo stopped working after some minutes and the screen is stuck

I also saw this happen once on my STM32F7, but I didn't have time to investigate further. It was a few days ago.

C47D commented 4 years ago

I added some printf debugging points to see where the display gets stuck, but now I can't replicate the issue: the demo has been running for almost an hour. I did notice that lv_tick_task is still running.

There's also a new issue on the lvgl repo, "Dev-7.0 performance experiments", whose author uses the same dev board as me and doesn't report the freeze.

kisvegabor commented 4 years ago

I also saw the freeze, but I thought it was a bug in my hacky display driver. I've already debugged that it stops on a while(vdb->flushing); loop. I still don't know whether lv_disp_flush_ready is not called at all, or it is called but, because it is called from an interrupt (for all of us), it messes things up somehow. I mean some assembly-level you-never-find-it bug.

I suspect the latter, because while(vdb->flushing); reads a bit field, which can be quite complicated at assembly level.

kisvegabor commented 4 years ago

I suggest running lv_demo_stress(), which does much more drawing.

embeddedt commented 4 years ago

I call lv_disp_flush_ready directly from the flush_cb function, so it can't be entirely an interrupt-related issue.

kisvegabor commented 4 years ago

> I call lv_disp_flush_ready directly from the flush_cb function, so it can't be entirely an interrupt-related issue.

Ah, good to know. I thought you were using the DMA-based flush_cb.

C47D commented 4 years ago

I call lv_disp_flush_ready from a callback when the SPI finished transferring the data via DMA.

C47D commented 4 years ago

@kisvegabor Do you know if it's possible to get a backtrace when the application is stuck? I'm looking for tutorials on debugging the ESP32 chip with gdb.

embeddedt commented 4 years ago

If you have debug symbols available and can attach to the device with GDB after it's already running, you should be able to interrupt execution, type bt, and get a decently usable backtrace.

kisvegabor commented 4 years ago

It seems @embeddedt knows it much better than me :slightly_frowning_face:

I added some debug code and will run the stress demo all night, and hopefully it'll freeze.

C47D commented 4 years ago

@embeddedt Thanks. I do have debug symbols available, but it's a pain to set up the ESP32 tools on Windows. I was using WSL but can't run openocd there, so I have to set up the toolchain in Windows directly :/.

kisvegabor commented 4 years ago

I think I found the issue but the picture is not perfect yet.

These are the interesting parts of lv_disp_buf_t. So there are 4 bits next to each other. When going to the next part to refresh, lvgl might write the last 3 fields. So what I suspect is:

1. When e.g. last_area is set, first the whole bit field is read into a register.
2. An interrupt comes and sets flushing = 0.
3. In the register read in step 1, last_area is now set, but flushing is still 1 too, because it was cleared only in the "real" variable.
4. The register is written back, and it overwrites the flushing bit.

So it's a typical Read-Modify-Write issue. It all makes sense, but @embeddedt said he doesn't use an interrupt in the flush_cb. So @embeddedt, are you sure you weren't using an interrupt-based driver when it froze for you?

I've pushed a trivial fix. Let's see if it helps.
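The read-modify-write sequence described above can be replayed deterministically in a small sketch. Nothing here is LVGL code: the bit positions, the `fake_flush_ready_irq` stand-in for the interrupt, and the `split_flags_t` type are all hypothetical. The fix itself is not shown in the thread; the second half only assumes that it moved `flushing` out of the bit field into its own variable, which a later comment about `flushing` being an int suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions; the real layout of lv_disp_buf_t may differ. */
#define FLAG_FLUSHING  (1u << 0)
#define FLAG_LAST_AREA (1u << 2)

/* Stand-in for the interrupt handler: clears the flushing bit in memory. */
static void fake_flush_ready_irq(volatile uint8_t *flags)
{
    *flags &= (uint8_t)~FLAG_FLUSHING;
}

/* Sets last_area with an explicit read-modify-write on the packed flags.
 * irq_in_between simulates the interrupt firing between read and write-back.
 * Returns the final value of the packed flags. */
static uint8_t set_last_area_packed(volatile uint8_t *flags, int irq_in_between)
{
    assert(flags != NULL);
    uint8_t reg = *flags;                            /* 1) read whole bit field */
    if (irq_in_between) fake_flush_ready_irq(flags); /* 2) IRQ clears flushing  */
    reg |= FLAG_LAST_AREA;                           /* 3) modify the stale copy */
    *flags = reg;                                    /* 4) write back stale bit  */
    return *flags;
}

/* Assumed shape of the fix: flushing lives in its own variable. */
typedef struct {
    volatile int flushing;
    volatile uint8_t other_flags;
} split_flags_t;

/* Same sequence, but the interrupt and the main code touch different words.
 * Returns the final value of flushing. */
static int set_last_area_split(split_flags_t *f, int irq_in_between)
{
    assert(f != NULL);
    uint8_t reg = f->other_flags;
    if (irq_in_between) f->flushing = 0;             /* IRQ clears flushing */
    reg |= FLAG_LAST_AREA;
    f->other_flags = reg;                            /* cannot touch flushing */
    return f->flushing;
}
```

Starting from `flags = FLAG_FLUSHING`, `set_last_area_packed(&flags, 1)` leaves the flushing bit set even though the simulated interrupt cleared it, so a `while(flushing);` loop would never exit. With the split layout, `set_last_area_split` returns 0 in the same scenario.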

C47D commented 4 years ago

Thanks for the explanation @kisvegabor. I've set up the tools so I can debug the demo application (I'm still using the demo-widgets app), and I'm running it under gdb so I can get a backtrace when the application gets stuck.

The "problem" I see is that we don't know how long it takes for the application to stall, so how long should we run the application with the fix to be sure it worked as expected?

embeddedt commented 4 years ago

> are you sure you weren't using an interrupt based driver when it froze for you?

100% sure. I don't have access to interrupts in the context that LittlevGL runs in. My setup is a bit unique so it's quite possible that the issue isn't within LittlevGL (although I didn't experience this with 6.1 or earlier snapshots of 7.0).

> how much time should we run the application with the fix to be sure it worked as expected?

We have several weeks before release, so I'm pretty sure one of us will run into the issue if it's still present. I'd say let it run for a few hours and, if nothing goes wrong, we can consider it fixed for now.

C47D commented 4 years ago

@kisvegabor I've fixed an error on the lv_examples repo that I found when trying to run the stress demo, I sent a pull request to that repo.

Is this the expected behavior of the demo? https://youtu.be/7goD6lRqTLc

embeddedt commented 4 years ago

I think that's what it's supposed to look like - the idea is that it randomly creates and moves a bunch of objects around to try and trigger weird bugs like this one.

I've had lv_demo_widgets running for a few hours now - it hasn't crashed. I think I can safely say the freeze is gone for me.

C47D commented 4 years ago

I also haven't hit the bug again, I will leave the demo running overnight.

You guys know lvgl better than I do, is there any particular reason why flushing and flushing_last are ints instead of uint32_t?

kisvegabor commented 4 years ago

> @kisvegabor I've fixed an error on the lv_examples repo that I found when trying to run the stress demo, I sent a pull request to that repo.
>
> Is this the expected behavior of the demo? https://youtu.be/7goD6lRqTLc

Thank you for the PR. Yes, it should look something like this. You can increase TIME_STEP in lv_demo_stress.c (e.g. to 200) to see more.

It was running for me for more than 8 hours. So it really seems to be solved. :tada:

Okay, let's turn back to the original topic: v7 on ESP. @C47D seemingly it's running well on ESP. It could be faster though... On STM32F7, in general, I measured higher FPS with v7 compared to v6, but on ESP for a 320x240 TFT, I'd expect higher FPS. Could you answer these questions, please, to exclude some potential issues?

barbiani commented 4 years ago

@C47D

> I do have debug symbols available, but it's a pain to setup the ESP32 tools on Windows, I was using the WSL but can't run openocd there, so i have to setup the toolchain in Windows directly :/.

Actually, you can, or use https://visualgdb.com/

C47D commented 4 years ago

Hi @kisvegabor,

I hope this information can help, if you want me to do some tests please let me know.

> What is the size of the display buffer?

For this particular display it is LV_HOR_RES_MAX * 40, so 320 * 40 bytes, 12.5 KBytes.

> Do you use 2 buffers with DMA in flush_cb?

This is the flush_cb of the display driver available on my dev kit: https://github.com/littlevgl/lv_port_esp32/blob/d124fe22a99580c47b7e45f449faf028390d4353/components/lvgl_esp32_drivers/lvgl_tft/ili9341.c#L150-L177

But we configure the display buffer with 2 buffers: https://github.com/littlevgl/lv_port_esp32/blob/d124fe22a99580c47b7e45f449faf028390d4353/main/main.c#L71-L74

> Have you enabled -Os or -O3 optimization?

No, the default project is compiled without optimizations and with debug symbol generation enabled. Here are the optimization levels available in esp-idf:

[image]

> What is the speed of the SPI?

The speed of the SPI is 40 MHz for this particular driver.

embeddedt commented 4 years ago

Can you try it with -Og and see what happens? That should already be an improvement over no optimization.

C47D commented 4 years ago

@embeddedt the project is being compiled with -Og. I can try the other options though, if you want me to.

kisvegabor commented 4 years ago

@C47D Just a minor correction: one buffer has 320*40*2 bytes = 25 kB (because there are 2 bytes/pixel).

Could you try -O2 and -Os too?
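The buffer-size correction above comes down to multiplying the pixel count by the bytes per pixel. A minimal sketch, with `draw_buf_bytes` as a made-up helper name and 16-bit colour assumed from the "2 bytes/pixel" remark:

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of one draw buffer covering `lines` full rows of the display. */
static uint32_t draw_buf_bytes(uint32_t hor_res, uint32_t lines, uint32_t bits_per_px)
{
    assert(bits_per_px % 8 == 0); /* whole bytes per pixel only */
    return hor_res * lines * (bits_per_px / 8);
}
```

`draw_buf_bytes(320, 40, 16)` is 25600 bytes, i.e. 25 KiB per buffer; the earlier 12.5 KB figure is 320 * 40 = 12800, which is the pixel count rather than the byte count. With two such buffers the total is 51200 bytes.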

kisvegabor commented 4 years ago

I usually don't use -Og so tested how it compares to others on my STM32F7 dev board:

O0: 7 FPS
Og: 18 FPS
O2: 25 FPS
Os: 24 FPS

So hopefully you also will see ~50% performance boost.

barbiani commented 4 years ago

@C47D

Did you do anything to add the -O2 setting? Mine shows only debug and release.

C47D commented 4 years ago

@barbiani I'm using the master branch of esp-idf; maybe that's why.

C47D commented 4 years ago

@kisvegabor this is the stress demo compiled with -O2:

https://youtu.be/lTgXsJjYHk8

C47D commented 4 years ago

> actually you can or use https://visualgdb.com/

Thanks for the information @barbiani. I ended up running openocd from the Windows prompt and gdb on WSL, but I had to change the drivers for the FTDI chip on Windows to be able to connect openocd to it. I debugged the application and it ran for about 5 hours with no issues, so I think @kisvegabor's patch worked.

barbiani commented 4 years ago

While writing a touch driver I have found that lvgl crashes with touches outside of the screen. Meaning coordinates bigger than x and y resolutions.

kisvegabor commented 4 years ago

> I debugged the application and it ran for about 5 hours with no issues, so I think @kisvegabor's patch worked.

Awesome! :slightly_smiling_face:

> While writing a touch driver I have found that lvgl crashes with touches outside of the screen. Meaning coordinates bigger than x and y resolutions.

I'll check it!

kisvegabor commented 4 years ago

> > While writing a touch driver I have found that lvgl crashes with touches outside of the screen. Meaning coordinates bigger than x and y resolutions.
>
> I'll check it!

Should work now.

kisvegabor commented 4 years ago

@C47D

> this is the stress demo compiled with -O2

It seems to me it's about the same FPS as with -Og.

Sending 320 * 240 * 16 bit at 40 MHz takes 30 ms (33 FPS), so it should not be a limiting factor (as it happens in parallel with rendering). However, a 200 MHz MCU should be much faster with a 320x240 TFT. It'd be awesome to see how it looks on an oscilloscope when an empty screen is sent to the driver (CLK would be enough). I know it needs more effort to do, so I'm just writing it down in case you or @barbiani have the time, the interest, and an oscilloscope to measure it. :slightly_smiling_face:
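The 30 ms / 33 FPS estimate above is the raw bit count divided by the SPI clock. A minimal sketch of that calculation, with made-up helper names; it deliberately ignores command bytes, chip-select gaps and any per-transaction overhead, so real figures would be somewhat lower:

```c
#include <assert.h>
#include <stdint.h>

/* Microseconds needed to clock one full frame out over SPI (payload only). */
static uint32_t frame_time_us(uint32_t hor_res, uint32_t ver_res,
                              uint32_t bits_per_px, uint32_t spi_hz)
{
    assert(spi_hz > 0);
    uint64_t bits = (uint64_t)hor_res * ver_res * bits_per_px;
    return (uint32_t)(bits * 1000000u / spi_hz);
}

/* Upper bound on full-frame refreshes per second, scaled by 100
 * to keep two decimals without floating point. */
static uint32_t max_fps_x100(uint32_t hor_res, uint32_t ver_res,
                             uint32_t bits_per_px, uint32_t spi_hz)
{
    uint64_t bits = (uint64_t)hor_res * ver_res * bits_per_px;
    assert(bits > 0);
    return (uint32_t)((uint64_t)spi_hz * 100u / bits);
}
```

For 320x240 at 16 bits per pixel and 40 MHz, `frame_time_us` gives 30720 µs and `max_fps_x100` gives 3255, i.e. about 32.5 full frames per second as the ceiling set by the bus alone.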

C47D commented 4 years ago

@kisvegabor, sorry if I'm saying something silly, but aren't we limited by the SPI clock and not the CPU clock?

I don't have an oscilloscope at home to test it :/. I'm not going to the office until next Wednesday, so I can't test it until then. Seems like I need a better logic analyzer :)

Regards

lvgl / lv_port_esp32

Port to LVGL v7.0 #101