hoffstadt / DearPyGui

Dear PyGui: A fast and powerful Graphical User Interface Toolkit for Python with minimal dependencies
https://dearpygui.readthedocs.io/en
MIT License
13k stars 678 forks source link

set_frame_callback is racy and can lose callbacks #2269

Open v-ein opened 8 months ago

v-ein commented 8 months ago

Version of Dear PyGui

Version: 1.10.1 Operating System: Windows 10

My Issue/Question

set_frame_callback can be used at runtime to set a callback to run at the nearest frame. In fact, it's probably the easiest way to schedule something to run in the handlers thread (e.g. from within another user thread).

Here's a piece of code that is supposed to do that:

    with dpg.mutex():
        dpg.set_frame_callback(dpg.get_frame_count() + 1, next_frame_callback)

Notice the use of dpg.mutex() - it guarantess that no frame gets rendered while we're still computing the frame number for set_frame_callback. Without the mutex, this code may run into a problem if the main thread starts rendering a frame between the calls to get_frame_count() and set_frame_callback(): the frame number we'll get will be outdated, and the callback won't get called.

With the mutex, the code seemingly should run without issues 100% of the time. From the API point of view, at least. In practice, due to the way set_frame_callback is implemented, there's a slight chance that the callback won't get called and will be lost. The root cause is that to store next_frame_callback and the frame number, set_frame_callback() schedules one more callback to the handlers thread. If it happens on the edge of a new frame, there's a chance that the "internal" callback gets called too late, past the point where frame callback for this new frame is to be scheduled. Here's exactly how it happens:

  1. dpg.mutex() locks the mutex and blocks the rendering thread.
  2. set_frame_callback() gets called and parses its arguments. Let's say the last frame rendered was X, then set_frame_callback() gets called with the frame X+1.
  3. set_frame_callback() calls mvSubmitCallback(), which adds the internal callback to the queue. set_frame_callback() then exits back to Python.
  4. dpg.mutex() releases the mutex.
  5. The rendering thread gets the mutex and starts rendering a new frame. It increments the frame number - the current frame is X+1.
  6. The rendering thread calls mvFrameCallback() for frame X+1.
  7. mvFrameCallback() doesn't find any callbacks for this frame, and does nothing.
  8. In the handlers thread, the internal callback of set_frame_callback() finally gets called. It adds a frame callback for frame X+1 to callbackRegistry->frameCallbacks, but it's too late. That element will never be fetched again.

To Reproduce

Steps to reproduce the behavior:

  1. Run the minimal example
  2. Click the "Click me" button to schedule a frame callback to run at next frame.
  3. When the frame callback is called, it prints a message to the console. If it gets lost, the "missed" counter is incremented.
  4. Make sure no callbacks are lost when both checkboxes are off. In this case, the callback is scheduled almost immediately after rendering the last frame, and there's plenty of time for set_frame_callback() to do its job.
  5. Set the first checkbox. This offsets set_frame_callback() to the dangerous point when it runs immediately before rendering of the next frame starts.
  6. Click the button several dozen (or hundreds) times. You'll see some callbacks are lost. Since it's a race condition, the actual loss depends on your luck - expect losses between 1% and 10%.
  7. Set both checkboxes. In this case, the internal callback of set_frame_callback() is offset even further and has no chance to complete in time. All 100% of callbacks are lost.

Expected behavior

No callback loss if upon return from set_frame_callback the target frame is yet to be rendered.

Screenshots/Video

image

Standalone, minimal, complete and verifiable example

import time
import dearpygui.dearpygui as dpg

dpg.create_context()
dpg.setup_dearpygui()
dpg.create_viewport(title="Test", width=600, height=600)

missed = 0
total = 0

def on_frame(frame):
    global missed
    missed -= 1
    print(f"Frame callback fired at frame {frame}")
    dpg.set_value("missed", f"Missed callbacks: {missed}")

def test_frame_callback():
    global missed, total
    total += 1
    dpg.set_value("total", f"Total callbacks scheduled: {total}")

    missed += 1
    # The mutex guarantees that no frame gets rendered while we're setting the
    # callback.  This way we can always set the callback to fire exactly at the
    # next frame, by deducing frame number from the current frame.
    with dpg.mutex():
        # A long sleep guarantees the frame gets rendered as soon as we exit the mutex.
        if dpg.get_value("delay1"):
            time.sleep(0.1)
        dpg.set_frame_callback(dpg.get_frame_count() + 1, on_frame)

    # A bit of sleep again to postpone the callback used internally by set_frame_callback
    if dpg.get_value("delay2"):
        time.sleep(0.1)
    dpg.set_value("missed", f"Missed callbacks: {missed}")

with dpg.window():
    dpg.set_primary_window(dpg.last_item(), True)
    dpg.add_checkbox(label="Make it racy (run set_frame_callback right before next frame)", tag="delay1")
    dpg.add_checkbox(label="Guarantee callback loss (offset internal callback into next frame)", tag="delay2")
    dpg.add_button(label="Click me", callback=test_frame_callback)
    dpg.add_text(tag="total")
    dpg.add_text(tag="missed")

dpg.show_viewport()
dpg.start_dearpygui()
dpg.destroy_context()
v-ein commented 7 months ago

This issue can be fixed in different ways. First, I don't see any good enough reason for the mvSubmitCallback() call within set_frame_callback(). By adding the frame callback to the map immediately, I've been able to fix the issue, and set_frame_callback() then can be used together with the mutex to schedule a callback for the next frame.

However, thinking more about it... Pretend that we didn't use a mutex but just ran, say, set_frame_callback(2, my_callback), and the frame 2 started rendering concurrently with set_frame_callback. Then it would lose the callback anyway, even though in the caller code, the last frame before set_frame_callback was 1, not 2. That is, what should set_frame_callback do if it's called a bit too late and misses the target frame? It might be better to schedule the frame callback right away if the target frame has already passed.