Pause/Resume screen and widget refresh?

lvgl / lv_binding_micropython

LVGL binding for MicroPython

MIT License

Pause/Resume screen and widget refresh? #193

Closed jdtsmith closed 2 years ago

jdtsmith commented 2 years ago

I've found that the timer-based synchronous hybrid micropython driver performs better on ESP32 than the async version. But when setting many display settings for various elements on screen, the update is relatively slow and clunky as elements are moved, text/colors/alignments/opacity/etc. are changed, etc. — all visibly. This is likely because the time to perform the updates is long compared to the interrupt-driven timer tick interval. Changes are not "coalesced".

I was surprised to find that LVGL didn't itself offer a pause/resume feature to pause refreshes (unless I'm missing it) while various on-screen elements are reconfigured.

So I hacked one together by abusing the scheduled and max_scheduled event loop parameters:

def pause():
    global paused
    if disp and not paused:
        disp.event_loop.scheduled = disp.event_loop.max_scheduled # Stop lv.task_handler() calls
        paused = True
        return True
    return False

def resume():
    global paused
    if disp:
        disp.event_loop.scheduled = 0 # Resume them
        paused = False

So I can:

display.pause()
#many LVGL widget updates
display.resume()

This really smooths out changes to widgets on screen, and improves speed. But is there a cleaner way to do this? If not, would it be sensible to implement some pause/resume logic in the hybrid drivers?

embeddedt commented 2 years ago

LVGL doesn't offer that feature because in C it's usually not needed. Normally you'd be doing all of your update work inside blocking event handlers, which have to finish before the display is redrawn.

(The display is also only redrawn once every 15-30ms or so, based on configuration, and usually user update logic doesn't take more than a frame.)

(Sent by email in reply to JD Smith's post of Sun., Oct. 17, 2021, 3:25 p.m.)

jdtsmith commented 2 years ago

Thanks, that makes sense. Letting asyncio handle the event loop would also have this characteristic. It's easy to overwhelm the one asyncio event loop in MP though, so I've found timer-interrupt performs better.

By default micropython hybrid drivers update every 40ms. With slow displays and some interface niceties for calling back into the C-code, it seems updates can span ticks. Once one tick gets "caught in traffic", the slowness of the display update likely means many will. But I believe an option/config in the hybrid driver to avoid calling task_handler() during some period would do the trick. That's effectively what my hack does.

amirgon commented 2 years ago

> But when setting many display settings for various elements on screen, the update is relatively slow and clunky as elements are moved, text/colors/alignments/opacity/etc. are changed, etc. — all visibly. This is likely because the time to perform the updates is long compared to the interrupt-driven timer tick interval.

If you are experiencing slowness, it may be worth checking why.
The LVGL Python API is relatively efficient, so just calling LVGL functions should be pretty fast.

Does it happen when the Python module is frozen?
Does it happen when you decorate your functions as native?

Specifically, beware of high frequency Python callbacks.
For example, avoid passing lv.EVENT.ALL to add_event_cb since LVGL generates lots of events for each object.

> So I hacked one together by abusing the scheduled and max_scheduled event loop parameters:

I think you have a bug there - when you set disp.event_loop.scheduled to a fixed value you overwrite the value already there and as a result the event loop may lose track of the number of events currently scheduled. It's better to increase/decrease it instead. This would also allow nesting multiple pause/resume calls.
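
A minimal sketch of the increment/decrement idea described above. `FakeEventLoop` is a hypothetical stand-in, not the driver's real class, and the rule that a tick only queues a call while `scheduled < max_scheduled` is an assumption made for illustration. Adding and subtracting the cap keeps any already-queued calls counted and lets pause/resume pairs nest.

```python
class FakeEventLoop:
    """Hypothetical stand-in for the driver's event loop object."""

    def __init__(self, max_scheduled=2):
        self.max_scheduled = max_scheduled
        self.scheduled = 0  # task_handler calls currently queued

    def can_schedule(self):
        # Assumed rule: a tick only queues a new call while below the cap.
        return self.scheduled < self.max_scheduled


def pause(loop):
    # Add the cap instead of overwriting, so queued calls stay counted.
    loop.scheduled += loop.max_scheduled


def resume(loop):
    # Undo exactly one pause().
    loop.scheduled -= loop.max_scheduled


loop = FakeEventLoop()
loop.scheduled = 1      # one call is already queued when we pause
pause(loop)
pause(loop)             # nested pause
resume(loop)
still_paused = not loop.can_schedule()  # inner resume must not re-enable
resume(loop)            # outer resume restores the original count
```

With the overwrite approach, the one call that was already queued would be forgotten; here the counter returns to its value from before the first pause.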

Instead of a workaround, I've added these enable/disable functions to the Event Loop here: https://github.com/lvgl/lv_binding_micropython/commit/bd1a12c8575083dd333373b4e46d6487988d4424

https://github.com/lvgl/lv_binding_micropython/blob/bd1a12c8575083dd333373b4e46d6487988d4424/lib/lv_utils.py#L102-L106

jdtsmith commented 2 years ago

Thanks. I have a flag which, when set, refuses to "disable again". But great idea for nesting calls! That will simplify things. It might also then be possible to wrap this in a context manager (with lv_utils.disable_refresh: or similar) and not worry about nesting.
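
One way the context-manager idea could look, as a sketch only. `RefreshDisabled` and `FakeEventLoop` are hypothetical names; the only thing assumed about the real event loop is that it exposes the `enable()`/`disable()` pair mentioned above and that calls to them nest.

```python
class FakeEventLoop:
    """Hypothetical stand-in exposing an enable()/disable() pair."""

    def __init__(self):
        self.disabled_depth = 0

    def disable(self):
        self.disabled_depth += 1

    def enable(self):
        self.disabled_depth -= 1


class RefreshDisabled:
    """Suspend refresh for the duration of a `with` block."""

    def __init__(self, event_loop):
        self.event_loop = event_loop

    def __enter__(self):
        self.event_loop.disable()
        return self.event_loop

    def __exit__(self, exc_type, exc, tb):
        # Always re-enable, even if a widget update raised.
        self.event_loop.enable()
        return False  # do not swallow exceptions


loop = FakeEventLoop()
depths = []
with RefreshDisabled(loop):
    depths.append(loop.disabled_depth)      # inside the outer block
    with RefreshDisabled(loop):
        depths.append(loop.disabled_depth)  # inside the nested block
depths.append(loop.disabled_depth)          # after both blocks
```

Because `__exit__` runs on every exit path, an exception in the middle of a scene change would not leave refresh disabled.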

RE speed, I just assumed ESP32/uPy and my slow SPI display (ST7789) were the issue. I do tend to update maybe 15-25 attributes for a few widgets on screen in one go, including positions, font size, colors, alignment, recolor, text, bg image source (which reads a .bin from slow flash). Would you expect this to happen all within one tick on an ESP32? To me it wasn't surprising that this takes some time; loading one 240x135 .bin takes maybe 300-400ms. And since I'm not double-buffering, it's also perhaps not surprising that once one tick gets passed, LVGL tries to sync the display to the current "half-finished" layout, additional ticks arrive, and so on. I haven't tried freezing or native code (but freezing commonly updated code does defeat one of the main strengths of MPY IMO!).

I have overloaded the normal API with custom __getattr__ and __setattr__ which translate some values and make it much simpler to work with LVGL from micropython (see this gist if interested). I wonder if that's adding much overhead to the pure set_style_xxx_yyy() calls.

I'm not using any event callbacks, it's really time-based and button-driven "scene" changes that were quite disjointed without a disable/enable wrap, and work pretty well with them.

If you are interested I could attempt a small demo case that illustrates changes spanning a tick.

amirgon commented 2 years ago

> I do tend to update maybe 15-25 attributes for a few widgets on screen in one go, including positions, font size, colors, alignment, recolor, text, bg image source (which reads a .bin from slow flash). Would you expect this to happen all within one tick on an ESP32?

ESP32 is actually pretty fast. It runs at 160 or 240 MHz, so even when running an interpreted language like MicroPython I would expect it to finish updating 15-25 attributes probably in less than a millisecond.
Reading from flash is another story. Regardless of MicroPython or C, it can definitely take time. But I would expect the image to load completely before being displayed, so I don't see why it would cause a "half-finished" layout.

> I have overloaded the normal API with custom __getattr__ and __setattr__ which translate some values and make it much simpler to work with LVGL from micropython (see this gist if interested). I wonder if that's adding much overhead to the pure set_style_xxx_yyy() calls.

It would be interesting to try updating these attributes without your overloaded getters/setters.
From your gist another thing stands out - there are lots of string operations. You are composing strings and passing them to hasattr and getattr, and you are using f-strings, which are very useful, but I'm not sure at all about their performance. So I suspect that at least some of the problem is related to your wrapper code.
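
To illustrate the concern about per-call string work, here is a sketch of a wrapper that composes the setter name and calls `hasattr`/`getattr` only the first time an attribute is set, then reuses the resolved setter. Everything here is hypothetical: `FakeLabel`, `Wrapper` and `_resolve` are not taken from the gist, the code is written for CPython-style Python, and whether this actually helps on MicroPython has not been measured.

```python
class FakeLabel:
    """Hypothetical stand-in for an LVGL object with raw setters."""

    def __init__(self):
        self.calls = []

    def set_text(self, value):
        self.calls.append(("set_text", value))

    def set_style_opa(self, value, selector):
        self.calls.append(("set_style_opa", value, selector))


def _resolve(cls, name):
    # String composition and hasattr() happen here, once per (class, name).
    for prefix, is_style in (("set_", False), ("set_style_", True)):
        if hasattr(cls, prefix + name):
            method = getattr(cls, prefix + name)
            if is_style:
                # Selector 0, as in the raw calls shown in this thread.
                return lambda obj, value: method(obj, value, 0)
            return lambda obj, value: method(obj, value)
    raise AttributeError(name)


class Wrapper:
    """Attribute-style facade that caches resolved setters."""

    _setters = {}  # (widget class, attribute name) -> setter function

    def __init__(self, obj):
        # Bypass our own __setattr__ when storing the wrapped object.
        object.__setattr__(self, "_obj", obj)

    def __setattr__(self, name, value):
        obj = self._obj
        key = (type(obj), name)
        setter = Wrapper._setters.get(key)
        if setter is None:
            setter = Wrapper._setters[key] = _resolve(type(obj), name)
        setter(obj, value)


raw = FakeLabel()
label = Wrapper(raw)
label.text = "Testing"   # resolved and cached on first use
label.opa = 128
label.text = "Again"     # served from the cache, no string work
```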

jdtsmith commented 2 years ago

Good idea, I can try some timing with/without the convenience wrapper to see if there's much of a difference. Probably worth making a simple test framework. f-strings are straight translated to .format codes by the interpreter, so in fact should be faster than allocating and concatenating separate sub-strings. They've only just shown up in MP.

It's a good point about reading from flash: if everything can get updated in a few ms or so, you should rarely see the "herky-jerky" stuff I get. I mean I guess since the timer tick is interrupt driven and independent, no matter how fast you can do the updates, sometimes a tick will occur in the middle of things, and the display will start updating when you didn't want it to. Hence the disable/enable are pretty useful no matter what the source of my slowdown, so thanks again for including them.

jdtsmith commented 2 years ago

OK I did some investigating. Because I run everything through the __setattr__, it was actually trivial to count the number of widget attributes set during the time elapsed between pausing and resuming lvgl. This shouldn't involve the display at all. I get an average of 5.5 ± 1.6 ms per attribute using this. So clearly I'm going to hit 40ms pretty quickly.
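
A sketch of how such a measurement could be set up: count the sets inside `__setattr__` and divide the elapsed time by the count. `CountingWidget` and `time_per_attribute` are hypothetical names, and `time.perf_counter()` is the desktop Python clock; on MicroPython, `time.ticks_us()` with `time.ticks_diff()` would be the analogue.

```python
import time


class CountingWidget:
    """Hypothetical wrapper that counts every attribute set it handles."""

    set_count = 0  # class-wide counter

    def __init__(self):
        object.__setattr__(self, "_values", {})

    def __setattr__(self, name, value):
        CountingWidget.set_count += 1
        self._values[name] = value  # a real wrapper would call into LVGL here


def time_per_attribute(update):
    """Run update() and return (attributes set, milliseconds per attribute)."""
    before = CountingWidget.set_count
    start = time.perf_counter()
    update()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    count = CountingWidget.set_count - before
    return count, (elapsed_ms / count if count else 0.0)


label = CountingWidget()


def scene_change():
    label.text = "Testing"
    label.align = "CENTER"
    label.color = "BLACK"
    label.opa = 0.5


count, ms_per_attr = time_per_attribute(scene_change)
```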

This does include some other Python overheads like other method calls, etc., so I also tried an even simpler approach: using one single LVGL label widget, I set text, alignment, text color, and opacity repeatedly. Using my simplified API this looks like:

    l.text = "Testing"
    l.align = "CENTER"
    l.color = "BLACK"
    l.opa = 0.5

This yielded 3.3ms/attr. In the "raw" API:

    lo.set_text("Testing")
    lo.set_align(lv.ALIGN.CENTER)
    lo.set_style_text_color(lv.color_black(), 0)
    lo.set_style_opa(128, 0)

this results in about 0.8-0.9 ms/attr. So clearly my API is adding a good bit of overhead (3-4x). But if you have 50 attributes to set, even the raw API will exceed the 40ms tick time and get you in trouble. My biggest "block" of updates altered 53 attributes. I'm going to see if I can tune my API to reduce overhead, but will still rely on the disable/enable functionality to avoid the tick firing at an inopportune time.
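
The arithmetic behind that claim, as a small worked example. It uses the 40 ms tick mentioned earlier in the thread and takes 0.85 ms as the midpoint of the raw 0.8-0.9 ms/attr figure (the midpoint is a simplifying assumption). At that rate 50 attributes take 42.5 ms, just over one tick, and the 53-attribute block through the wrapper at 3.3 ms/attr takes roughly 175 ms.

```python
TICK_MS = 40.0  # default hybrid-driver update interval quoted in the thread


def block_time_ms(n_attrs, ms_per_attr):
    """Total time for a block of attribute updates."""
    return n_attrs * ms_per_attr


def max_attrs_per_tick(ms_per_attr, tick_ms=TICK_MS):
    """Largest whole number of attribute sets that fits inside one tick."""
    return int(tick_ms // ms_per_attr)


raw_50 = block_time_ms(50, 0.85)       # raw API, 50 attributes
wrapped_53 = block_time_ms(53, 3.3)    # wrapper API, largest block
raw_budget = max_attrs_per_tick(0.85)      # sets per tick with the raw API
wrapped_budget = max_attrs_per_tick(3.3)   # sets per tick with the wrapper
```

Under these figures, the budget per tick is 47 sets with the raw API and 12 with the wrapper, which is why larger scene changes benefit from disabling refresh regardless of per-call speed.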

[Update: @micropython.native only managed to trim the time of my API modestly down to 2.9ms/attribute.]

amirgon commented 2 years ago

> I haven't tried freezing or native code (but freezing commonly updated code does defeat one of the main strengths of MPY IMO!).

Usually you freeze only code that doesn't change much (such as libraries, infrastructure code, etc.). For example, the ili9341 driver is a frozen module. But I agree it does not make sense to freeze everything.

> this results in about 0.8-0.9 ms/attr.

Actually it varies a lot. I just measured 0.75ms for set_text but only 0.16ms for set_style_opa on an ESP32 at 240 MHz.
That's probably because set_text performs more memory allocations and accesses than set_style_opa. gc-ram allocation and access is expensive, especially on an ESP32 with PSRAM, because all gc-ram is allocated on the SPI-RAM and cache misses are very expensive.

jdtsmith commented 2 years ago

I don't have any PSRAM on my unit, but I also noticed set_text was the slowest of the bunch (and slower the longer the text). But in any case right now I'm dominated by the overhead of my "convenience API". Given that I'm often changing BG images, and that takes ~500ms to read from flash and shuffle over SPI, I don't really mind the extra 20-50ms or so delay; I just needed to keep it from happening during refresh. enable/disable do that nicely.

amirgon commented 2 years ago

enable/disable functions of the event loop solve this problem - I'm closing this issue. Feel free to reopen if there are any problems!