hibiyasleep / OverlayPlugin

no longer (or couldn't be) maintained - please usr fork of ngld's one.
https://github.com/ngld/OverlayPlugin
Other
128 stars 38 forks source link

Crash on startup #17

Closed danakj closed 7 years ago

danakj commented 7 years ago

OverlayPlugin can crash on startup. It does not happen every run, suggesting that there is a race. I think between BeginRender() and OnCreated().

This is the exception:


2017-08-16T09:48:40 Unhandled Exception System.NullReferenceException: Object reference not set to an instance of an object. at Xilium.CefGlue.Interop.cef_browser_t.get_frame_identifiers(cef_browser_t self, UIntPtr identifiersCount, Int64* identifiers) at Xilium.CefGlue.CefBrowser.GetFrameIdentifiers() at RainbowMage.HtmlRenderer.Renderer.<>cDisplayClassb.ba(CefBrowser b) in c:\Users\Dana\s\OverlayPlugin\HtmlRenderer\Renderer.cs:line 332 at System.Collections.Generic.List1.ForEach(Action1 action) at RainbowMage.OverlayPlugin.OverlayBase`1.NotifyOverlayState() in c:\Users\Dana\s\OverlayPlugin\OverlayPlugin.Core\OverlayBase.cs:line 357 at RainbowMage.HtmlRenderer.Renderer.OnLoad(CefBrowser browser, CefFrame frame, Int32 httpStatusCode) in c:\Users\Dana\s\OverlayPlugin\HtmlRenderer\Renderer.cs:line 249


Note that this Renderer.cs:line 332 is: public void ExecuteScript(string script) { this.Browsers.ForEach((b) => { foreach (var frameId in b.GetFrameIdentifiers()) <------------ this line { var frame = b.GetFrame(frameId); frame.ExecuteJavaScript(script, null, 0); } }); }

Which suggests that this.Browsers is being modified /while/ we iterate through it.

I've also seen this exception manifest in my own OverlayPlugin addon as: 2017-08-16 9:37:26 AM: Error: Bars: Exception Object reference not set to an instance of an object. source: CactbotOverlay at Cactbot.CactbotOverlay.DispatchToJS(JSEvent e) at Cactbot.CactbotOverlay.<.ctor>b__61_1(GameExistsEvent e) at Cactbot.CactbotOverlay.SendFastRateEvents()

This crash occurs on a timer thread that checks to see that this.Overlay.Renderer.Browser is not null and Browser.IsLoading is false, then tries to use Browser but it has become null. This most frequently happens with an html page that is more heavy to load/slower to get to OnLoad.

I suspect that https://github.com/hibiyasleep/OverlayPlugin/commit/1f0d4bbc028665a2903ae83ec8205c80f62e9130 and https://github.com/hibiyasleep/OverlayPlugin/commit/aca5b59b8a60d2143b208976eec3261b5aa5fa9b are related.

I've also seen it happen at shutdown, with this stack trace:


2017-08-16T10:18:30 Unhandled Exception System.NullReferenceException: Object reference not set to an instance of an object. at Xilium.CefGlue.Interop.cef_browser_t.is_same(cef_browser_t self, cef_browser_t that) at Xilium.CefGlue.CefBrowser.IsSame(CefBrowser that) at System.Collections.Generic.List1.FindLast(Predicate1 match) at RainbowMage.HtmlRenderer.Renderer.OnBeforeClose(CefBrowser browser) in c:\Users\Dana\s\OverlayPlugin\HtmlRenderer\Renderer.cs:line 225


danakj commented 7 years ago

Renderer::UpdateRender() is called in two places.

  1. When the Windows Form loads from OverlayForm_Load().
  2. When the OverlayForm::Url is set from OverlayForm::Url::set.

When I run things in a debugger here's two sequences I see.

When it doesn't crash is:

  1. OverlayBase::InitializeOverlay() -> OverlayBase::Navigate() -> OverlayForm::Url::set() -> BeginRender()
  2. OverlayForm_Load() -> BeginRender()

When it does crash it is:

  1. OverlayBase::InitializeOverlay() -> OverlayBase::Navigate() -> OverlayForm::Url::set() -> BeginRender()
  2. LifeSpanHandler::OnAfterCreated() -> Renderer::OnCreated()
  3. OverlayForm_Load() -> BeginRender()
  4. BeginRender() calls EndRender() which has a non-empty Browser list from (2)
  5. EndRender() calls browser.GetHost().Dispose() on Main Thread [4516]
  6. No Name Thread [9072] runs OnBeforeClose().

In this case OnBeforeClose sees this.Browsers list as having a browser in it which has been Closed already, and we have a thread race with one thread setting Browsers to null and the other using it.

Do note that this isn't the crash I see in the ACT log in the first comment, when I run in the debugger it triggers an earlier race.

danakj commented 7 years ago

Thread-safety wise here is what I see:

EndRender is called from Main Thread (where OverlayBase runs) OnCreated is called from No Name Thread (Cef thread) OnBeforeClose is called from No Name Thread (Cef thread) ExecuteScript is called from No Name Thread (Cef thread)

So EndRender can not change this.Browsers without creating a thread race. The other 3 appear to share a thread.

Also the UpdateRender() call in OverlayForm_Load() appears to be pointless, and just causes the browser windows to be destroyed and recreated, loading the HTML/JS twice and causing Overlay::Renderer::Browser to become null briefly after first being set to non-null. I think that UpdateRender() in OverlayForm_Load() should be removed to avoid loading the HTML twice, but that doesn't solve the race as when the user hits reload the Url is changed which does UpdateRender() so there may be more than one call to UpdateRender() during the addon's lifetime still.

I am trying to understand the motivation for the list of Browsers instead of just tracking a single browser window. I see that before this fork, it also tracked a LastBrowser for showDevTools, but I can't see any case where more than one Browser will ever exist. The browsers are all destroyed on BeginRender, which then creates a single Browser.

In order to avoid thread races, the Browsers list should only be modified on a single thread, or behind a semaphore. But to greatly simplify things, I suggest removing the Browsers list and only track a single browser window.

If the list of Browsers is kept, then some main-thread structure is needed to track which browsers were disposed to avoid double-dispose.

If an addon wants to use the Renderer.Browser field, it should always overload Navigate(), prevent any use of Renderer.Browser before calling OverlayBase::Navigate(), wait for the BrowserLoad event to trigger (Renderer.Browser should be null), then wait for Renderer.Browser to become non-null. Otherwise it races and the Browser may become null while the addon is trying to use it causing a crash.

danakj commented 7 years ago

I tried to implement the above and realized a few issues, so I have a modified proposal which I will send in a pull request. The modified proposal is:

hibiyasleep commented 7 years ago

Thanks for amazing issue! I don't have much C# knowledge, my patch have many errors.

As my understanding, creating new pop-up (or any) window calls LifeSpanHandler::OnAfterCreated, and this Browser overrides Renderer.Browser - makes main overlay uncontrollable (yes, all click events delivered to last window). However I throught second and later Browser should be disposed on requested, so I created Renderer.Browsers.

ps: as seen by aca5b59b8a60d2143b208976eec3261b5aa5fa9b, non-overlay windows are also needed to receive data events.

danakj commented 7 years ago

Thanks for explaining why there is a list, I'll have a look at a solution that covers window.open()

hibiyasleep commented 7 years ago

Closing with #18.