lvgl / lv_binding_micropython

LVGL binding for MicroPython
MIT License

Crash in `get_native_obj` after many event / callback interactions #336

Closed: bwhitman closed this issue 3 months ago

bwhitman commented 3 months ago

I have lv_binding_micropython running on Tulip on both desktop (macOS/SDL) and MCU (ESP32S3). Users write their own UIs in MicroPython. We're using this library's main branch from last month, so LVGL 9.0.0.

If many events are fired (hundreds; for example, by sliding a slider up and down dozens of times), we always eventually crash with the following stack trace (this is on macOS):

* thread #9, stop reason = EXC_BAD_ACCESS (code=1, address=0x4330338c)
    frame #0: 0x0000000100435648 tulip`get_native_obj(mp_obj=0x000000013006bb80) at lv_mpy.c:150:9
   147      const mp_obj_type_t *native_type = ((mp_obj_base_t*)mp_obj)->type;
   148      if (native_type == NULL)
   149          return NULL;
-> 150      if (MP_OBJ_TYPE_GET_SLOT_OR_NULL(native_type, parent) == NULL ||
   151          (MP_OBJ_TYPE_GET_SLOT_OR_NULL(native_type, buffer) == mp_blob_get_buffer) ||
   152          (MP_OBJ_TYPE_GET_SLOT_OR_NULL(native_type, buffer) == mp_lv_obj_get_buffer))
   153         return mp_obj;
Target 0: (tulip) stopped.
(lldb) bt
* thread #9, stop reason = EXC_BAD_ACCESS (code=1, address=0x4330338c)
  * frame #0: 0x0000000100435648 tulip`get_native_obj(mp_obj=0x000000013006bb80) at lv_mpy.c:150:9
    frame #1: 0x000000010044ca48 tulip`mp_get_callbacks(mp_obj=0x000000013006bb80) at lv_mpy.c:218:30
    frame #2: 0x000000010044c9d4 tulip`get_callback_dict_from_user_data(user_data=0x000000013006bb80) at lv_mpy.c:738:13
    frame #3: 0x0000000100454554 tulip`lv_obj_add_event_cb_event_cb_callback(arg0=0x00000001702593e8) at lv_mpy.c:11571:26
    frame #4: 0x000000010033eae0 tulip`lv_event_send(list=0x000000010067bfb8, e=0x00000001702593e8, preprocess=false) at lv_event.c:76:13
    frame #5: 0x0000000100355a24 tulip`event_send_core(e=0x00000001702593e8) at lv_obj_event.c:342:11
    frame #6: 0x00000001003558a0 tulip`lv_obj_send_event(obj=0x000000010067be28, event_code=LV_EVENT_VALUE_CHANGED, param=0x0000000000000000) at lv_obj_event.c:64:23
    frame #7: 0x00000001003c8bfc tulip`update_knob_pos(obj=0x000000010067be28, check_drag=true) at lv_slider.c:484:27
    frame #8: 0x00000001003c7e50 tulip`lv_slider_event(class_p=0x000000010064a538, e=0x00000001702596d8) at lv_slider.c:141:9
    frame #9: 0x0000000100355bd4 tulip`lv_obj_event_base(class_p=0x0000000000000000, e=0x00000001702596d8) at lv_obj_event.c:86:5
    frame #10: 0x00000001003559ec tulip`event_send_core(e=0x00000001702596d8) at lv_obj_event.c:339:11
    frame #11: 0x00000001003558a0 tulip`lv_obj_send_event(obj=0x000000010067be28, event_code=LV_EVENT_PRESSING, param=0x0000000100671ac8) at lv_obj_event.c:64:23
    frame #12: 0x000000010037ce24 tulip`send_event(code=LV_EVENT_PRESSING, param=0x0000000100671ac8) at lv_indev.c:1541:5
    frame #13: 0x000000010037c4e8 tulip`indev_proc_press(indev=0x0000000100671ac8) at lv_indev.c:1194:16
    frame #14: 0x0000000100379a54 tulip`indev_pointer_proc(i=0x0000000100671ac8, data=0x0000000170259808) at lv_indev.c:664:9
    frame #15: 0x0000000100379410 tulip`lv_indev_read(indev_p=0x0000000100671ac8) at lv_indev.c:235:13
    frame #16: 0x0000000100379044 tulip`lv_indev_read_timer_cb(timer=0x0000000100671be8) at lv_indev.c:190:5
    frame #17: 0x0000000100345eac tulip`lv_timer_exec(timer=0x0000000100671be8) at lv_timer.c:300:59
    frame #18: 0x0000000100345bdc tulip`lv_timer_handler at lv_timer.c:105:16
    frame #19: 0x00000001004977a8 tulip`lv_task_handler at lv_api_map.h:60:12
    frame #20: 0x000000010049778c tulip`mp_lv_task_handler(arg=0x0000000000000006) at modtulip.c:445:5

We are calling lv_task_handler as suggested, via mp_sched_schedule, on every frame of the display. We're not sure how to debug this. Is there any advice, or more detail on what's happening in get_native_obj, that we can look into? I should add that we've turned on WARN logging and nothing appears there.

bwhitman commented 3 months ago

We think we fixed this. We were not using LV_STDLIB_MICROPYTHON. It was the garbage collector (gc collect) running that crashed everything: it was trying to free RAM it hadn't allocated.
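
For anyone hitting the same crash, here is a minimal sketch of the setting involved. It assumes the LVGL 9 `lv_conf.h` option name `LV_USE_STDLIB_MALLOC`; check the `lv_conf.h` in your own tree, since the exact option names and the other stdlib settings are not shown in this thread.

```c
/* lv_conf.h fragment (assumed LVGL 9.x option name; verify against your copy).
 * Routes LVGL's memory allocation through MicroPython's allocator so that
 * LVGL objects and the MicroPython garbage collector share one heap. */
#define LV_USE_STDLIB_MALLOC    LV_STDLIB_MICROPYTHON
```

With a different allocator selected, LVGL memory is allocated outside the heap that the MicroPython garbage collector manages, which matches the symptom above: everything works until a collection runs, then a callback lookup such as `get_native_obj` dereferences a bad pointer.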