obsproject / obs-browser

CEF-based OBS Studio browser plugin
GNU General Public License v2.0
776 stars 220 forks source link

A deadlock caused by obs_enter_graphics() from CEF's main thread #333

Closed walker-WSH closed 2 years ago

walker-WSH commented 2 years ago

Operating System Info

Windows 10

Other OS

No response

OBS Studio Version

27.1.3

OBS Studio Version (Other)

No response

OBS Studio Log URL

reproduced sometimes

OBS Studio Crash Log URL

No response

Expected Behavior

Keep same order to request locks in different threads

Current Behavior

  1. obs_graphics_thread: In every render loop, obs_enter_graphics() is firstly requested. Then, sometimes BrowserSource::~BrowserSource() can be called after rendering browser source. In ~BrowserSource(), DestroyBrowser(async=false) is called, and it will not break until task is done in CEF's main thread.

  2. CEF's main thread: In BrowserClient::OnPaint or BrowserClient::OnAcceleratedPaint, obs_leave_graphics() is called from CEF's main thread. However, at this time, obs_leave_graphics() has been gotten by obs_graphics_thread, but obs_graphics_thread is blocked to wait CEF's main thread for doing task.

Wait for each other between the two threads.

Steps to Reproduce

This is mult-thread issue, it can't be reproduced every time. I just meet one time. But if browser source is to be destroied from obs_graphics_thread, this block can be reproduced easy.

Anything else we should know?

Maybe obs_leave_graphics should not be called from CEF's main thread.

jp9000 commented 2 years ago

Thanks for pointing this out. Definitely quite annoying.

walker-WSH commented 2 years ago

I think there are two ways to fix.

  1. Move logic from BrowserClient::OnPaint to browserSource::Tick Lock should be added for variables.

  2. Make sure ~browserSource won't be called during rendering source after getting graphics lock. Pseudo code for obs_graphcis_thread:

    while (1) {
        browser array; // reference ++1
        obs_enum_source(array);
    
        tick_sources();
        render_sources();
        render_displays();
    
        release browser array // reference --1
    }

I select the second method and related PR is https://github.com/obsproject/obs-studio/pull/5600 The reason I use this method is that it simultaneously fix another crash in: static uint64_t tick_sources(uint64_t cur_time, uint64_t last_time)

Later I will describe another crash here...

walker-WSH commented 2 years ago

I think there are two ways to fix. Need be discussed:

  1. Move logic from BrowserClient::OnPaint to browserSource::Tick Lock should be added for variables.
  2. Make sure ~browserSource won't be called during rendering source after getting graphics lock. Pseudo code for obs_graphcis_thread:
  while (1) {
      browser array; // reference ++1
      obs_enum_source(array);

      tick_sources();
      render_sources();
      render_displays();

      release browser array // reference --1
  }

I select the second method and related PR is obsproject/obs-studio#5600 The reason I use this method is that it simultaneously fix another crash in: static uint64_t tick_sources(uint64_t cur_time, uint64_t last_time)

Later I will describe another crash here...

@jp9000 Another crash I meet (I hope you can understand me...) :

1

jp9000 commented 2 years ago

I think these bugs can be addressed separately without necessitating storing incremented references to all sources per frame in a dynamic array. There should be a more optimal and simple way to do this. I'll make some fixes to address both of these issues.