chromelyapps / Chromely

Build Cross Platform HTML Desktop Apps on .NET using native GUI, HTML5, JavaScript, CSS, Owin, AspNetCore (MVC, RazorPages, Blazor)
MIT License
2.98k stars 279 forks source link

App suspends setTimeout activity for unknown reason #230

Closed donotcodeit closed 3 years ago

donotcodeit commented 4 years ago

I have setInterval based JS code that is pinging server every 10 seconds. Recently we find suspicious log entries where 1 client may send multiple ping requests in a short period of time (1-2 seconds). It was hard to find out reason, but we found that setInterval sometimes stopping to work when window is inactive. And after activation it will go into callback n-times which was missed during inactivity.

Furthermore I was unable to reproduce this problem on a clean Chromely demo app. So I've tried to downgrade app from current state to most clean app until this problem becomes reproducible.

It's take out 5 days for me to make a minimum reproducible demo. Steps described in README.md. auto-app-suspend-demo.zip

I think it's somehow related to CEF, so I need your help guys, because I don't have experience with it. And C++ code is out of my expertise.

mattkol commented 4 years ago

@donotcodeit this is interesting.

Not tried yet. But I understand what you described.

Is there a way for temporary fix like checking if Windows active or not before sending the ping?

I will see if there is anything on CEF between now and the weekend.

Thanks.

donotcodeit commented 4 years ago

@mattkol We, probably, may make a fix for ping problem. But we also have several probably related bugs.

One was at app startup happens nothing until mouse is moved over app window. We have fixed it with workaround by sending DRAW message to window.

Second is some progress, which is implemented as sending from app backend (C#) to client some JS calls, may suddenly stop to be animated on client side. Same symptoms - until you will move mouse cursor over app window. Currently I'm working on minimal reproducible demo for this one.

mattkol commented 4 years ago

@donotcodeit

I locked the screen for about 5 mins @5:57:59. It did not seem to stop logging. From what you attached, it seems your project is based on Angular, React, Vue or similar framework. May be dependency on that has an impact?

Please see what I used - attached.

image

auto-app-suspend-demo.zip

donotcodeit commented 4 years ago

@mattkol we were able to reproduce it on another computer today. But one thing additionally: focus must be removed from Chromely app window. This makes repro steps stable.

donotcodeit commented 4 years ago

@mattkol here is a new demo. I've finished it today. At last. Very easy to reproduce and looks very related. Please take a look.

animation-stucks-until-mouse-over.zip

Update: After googling for a long time I haven't found any proofs of it being CEF defined behavior. So it's either Chromely package problem or misuse in my project. For me it looks like UI loop is waiting for user input messages. Maybe somehow related to custom thread for borderless mode.

GerHobbelt commented 4 years ago

The way I read this issue so far, there's two problems/questions:

  1. why is setInterval() acting up / behaving unexpectedly? (timers)
  2. why is the mouse behaving oddly? (UI responsiveness)

Prelim

Sounds a lot like the usual age-old setInterval() / requestAnimationFrame() issues in any browser to me (been there, etc., + T-shirt 😉 ): setInterval() is not 'dependable' if you want equally spaced intervals between your activities with no surprises, be it in Chromely or in a regular browser like Chrome. Might be best to treat that one as more of an 'advisable' than a 'guaranteed' timer. Anyway, others have written a zillion pages about it.

f.e. in browsers (in that sense, Chromely is another one as it uses the same engine): https://stackoverflow.com/questions/6032429/chrome-timeouts-interval-suspended-in-background-tabs

Note

Extra links for those that like this kind of stuff:

Anyway, lots of opinions about the reliability of setInterval() and all the other JavaScript timers, so let's have a test page:

Testing setInterval in Chromely... (looking at mouse responsiveness comes after)

https://gerhobbelt.github.io/frame-rate-distribution/ (repo at https://github.com/GerHobbelt/frame-rate-distribution)

To get a feel for it ("ground truth"), run that one in browser like Chrome or Chromium and observe the spread.

One of my observations: add some "long running code" (switch the "CPU Eater" ON in that page) to observe rather more "off the mark" intervals happening, including very short intervals. Empirically, on this dev box, today, at least I'm getting one or more very short intervals for every long interval where the "CPU eater" in the page blocked the timer event from firing as it's executing in the same (UI) JS thread. This is why I wrote "... is not dependable".

That benchmark code has been copied into your Chromely project (ZIP file taken from previous comment) and dumped into a repo (way easier to track what's done & by whom than ZIP files, which do not carry history like a git log) and here we observe the same stuff:

(Repo: https://github.com/GerHobbelt/chromely-issue-230 , commit list: https://github.com/GerHobbelt/chromely-issue-230/commits/master)

Framerate / setInterval only - quite minimal JS code in both event handlers

Reduce the JS code in the application to only that of the frame-rate-distribution repo and you see Chromely behaving just like Chrome browser. (commit https://github.com/GerHobbelt/chromely-issue-230/commit/9fd6f13e89dea4e0d9e07ecac36d7f9ecfee967e)

Screenshots of the stats:

image

image

Note the single counts for the long duration intervals: they do happen but that's about the longest time the UI will be unresponsive.

Now with the demo code

Same thing done with the demo code merged back in: commit https://github.com/GerHobbelt/chromely-issue-230/commit/7617e0f101fb913a8623efdc710097e0bb7a16d1, thus returning codetree back to state of commit https://github.com/GerHobbelt/chromely-issue-230/commit/cc392759d476cfe24ff9a61d72631bbe0f0acb66

rHO6H40BmV movie duration: ~1'10"

Note that the addition of your (OP) code results in a larger spread and noticeably slow intervals (high in ms). Of course, this makes sense as there's additional HTML and interaction: these stats were collected by scrolling down (mousewheel scroll) until we hit the bottom of that game list in there, plus one window resize (which was visually slow to rerender).

Here I wondered how the JavaScript code performs: have you run your code through a profiler? (See screenshot later on for a profile snapshot in another Chromely run)

Hm, mouse is behaving... oddly. Some of the time. 🤔

The odd bit is that mousewheel scrolling is mooth while drag-scrolling with the mouse pointer over the vertical scrollbar in the UI is quite unresponsive at times: this MAY be due to the DOM size (are you using a virtual list?) or CEF/Chromely behaviour. To be looked into: addressed further down in this message.

Rechecking who's probable cause...

Given the large intervals in that last run, I suspect the JavaScript code. BUT the oddly quite different responsiveness of scrollwheel scrolling vs mouse pointer drag of scrollbar, I wondered if this is maybe also due to Chromely / CEFGlue / CEF?

Here's another run with the basic setInterval benchmark app again:

Chromely setInterval benchmark clean only test 3 - ODtT0jFDCV movie duration: ~1'10"

When you watch the responsiveness in the animated GIF, you see a couple of mouse moves over the scrollbar and then delay before the UI is updated. That is NOT an animated GIF artifact but also observed IRL. Hence my preliminary conclusion: there's also something going on inside Chromely.

So we better try to find out if it's Chromely itself, CEF-Glue or CEF:

7FpNxujW1O movie duration: ~1'10"

that is the same benchmark webpage running in the Winforms example project that comes with CEF-Glue: it exhibits the same 'jumpy/delayed' responsiveness as Chromely with the mouse+scrollbar, so that would take Chromely itself off the list of prime suspects AFAICT.

NOTE: the WPF example for CEF.Glue disables GPU and a bunch of other CEF/Chrome settings and renders rather fuzzy, so I did not use that one.

While the WPF one has my utmost interest (for another project I'm working on #2+#215+#113), it's better to compare Chromely against the WinForms demo anyway as the amount of Windows Message Loop overhead in that one quite a lot less than for WPF and Chromely advertizes itself as not loading the Message Loop anyway: "on .NET Core using native GUI".

I haven't looked beyond CEF-Glue, so it may be either that one of CEF itself.

Also note that I tested against the bleeding edge master branch code of CEFGlue for that WinForms example, while Chromely uses the CEFGlue NuGet package/release, so there's some disparity there, but the observed behaviours are close enough for me to point an initial finger at CEF-Glue or deeper: CEF/Chromium.

But the Application JavaScript code does not help

As I observed a noticable difference in responsiveness between the bare nuckles benchmark (with a few click handlers) and even decent (or rather: expected hickups) behaviour when the "CPU eater" is turned on, there's good cause to investigate the OP JS performance by running it through the CEF/Chromium provided DevTools profiler.

expected hickups as that bit of JS is running in the JavaScript UI thread, hence you indeed get visited by ~1100ms hickups in UI responsiveness. Nothing weird there.

The first attempt there took out my box as grabbing 60+ seconds of data turned out to be a tad too much: had to hard kill the app. And before you ask: this and subsequent tests were with the OP JS app code included and "CPU eater" OFF all the time, so no intentional 1100ms lockups to be expected!

The next round produced this:

image

Whoa Nelly!

By the way: that's once again obtained by doing this:

As you can see from the screenshot: the profiler reports the JS is maxing out one core.

This matches an earlier MSVC2019-run C# project profile run, which reported nothing fancy, apart from over 70% of the CPU time spent in 'Native' code, i.e. outside the C# code scope, pointing at CEF+website JavaScript loading the core rather than Chromely C# code (no screenshot provided, alas)


In Closing

I hope this investigation helps find what's going here and how the mouse responsiveness can be improved.

From what I've observed, apart from the JS application code in OPs example ZIP, which could be improved when further investigating DevTools Profiler results, there is some odd responsiveness behaviour re Mouse+scrollbar dragging vs. mousewheel scrolling. (my DevTools JavaScript/TypeScript profile run profile dump is included in the repo (commit https://github.com/GerHobbelt/chromely-issue-230/commit/5bfa39b6b75e1dd49194c7668461c3c63a1fde28) -- note that I did kick packer but didn't spend time to produce a non-mangled output. ELTTR. Minification always done way too early in the dev cycle if you ask me, but my excuse is I shared a condo with the Stattler & Waldorf bunch 😉 )

Given that the same 'odd' (though short) mouse responsiveness stutters seem to happen in the CEFGlue winForms demo, my bet right now is it's not due to Chromely but something deeper -- I may be mistaken though, so verify my observations.

This issue peaked my interest as I'm looking at possibly migrating from a WPF environment to X-Platform HTML/CSS/JS UI stack, while preferably keeping the actual work going in C#: Chromely has my attention. However, given these tests I did now, I think it's maybe a a little too tough to obtain a very responsive UI that way. (I haven't checked out Electron yet as that was considered possibly a bridge too far for another reason: C# for the backend code, but given the many Electron apps out there, I wonder what's really wrong here re responsiveness.)

All in all, it looks to be a compound issue:

To be investigated further.

Hope this helps.

mattkol commented 4 years ago

@GerHobbelt thanks for your detailed analysis. Will find time to study it closely.

mattkol commented 4 years ago

@donotcodeit

Your last comment and zipped file, I am not quite sure what I should be looking for. I could run it and I see the Downloading figures increase whether app has focus or not.

But ..the following is what I think.. so far. Not sure if it complements @GerHobbelt findings or not ...

The issue seems related to message not being received and not pumped when Idle. This is because of use of GetMessage instead of PeekMessage... Please see issue 1 and issue 2.

Basically there are 2 main ways to pump messages - What we have and Realtime

Based on explanation from @prasannavl

Yes. RealtimeEventLoop will use 100% of the CPU. 
It's meant for games and the likes that wants complete control over the event loop.

it was planned to do a realtime option at the time, but just never thought about it anymore .. or more or less ..no one brought it up .. may be until now (if this is the issue).

The explanation is here -- do a message loop on idle.

CefSharp approaches: On Idle - This may require PeekMessage that can impact CPU usage. Using a timer - this seems like a more recent recommendation from CEF.

So if you are ready to test and give feedbacks we can look at the second approach together. If this works we can make it configurable. What we have right now remains default.

xplicit commented 4 years ago

@mattkol on the second demo progress bar is increased only when user moves mouse inside chromely window. If user does not move mouse, UI freezes in short time (1-2 seconds) and progress bar stops to increase a value. To unfreeze UI and move progress to current state you have to move mouse again.

I tried to understand what can impact on that behavior and found that UI freeze occurres when chromely creates new process with "--render" option (I don't remember how this option is named exactly, this was couple months ago). If send WM_PAINT to the main window in short time after render process is created UI unfreezes and become responsive like user moves a mouse.

GerHobbelt commented 4 years ago

But ..the following is what I think.. so far. Not sure if it complements @GerHobbelt findings or not ...

Complements, absolutely. I hadn't looked into the Windows Message Loop handling coding itself, was taking it top-down, so taking sample code ZIP as-is and see what happens from there. (Off topic: Thanks for your pointers as they also help me a lot while I investigate technological viability of Chromely or other CEFGlue for my own project!)

My mouse (=~ Windows Message Loop) behaviour observations are not the same as what OP (@donotcodeit) ran into: behaviour here is not severely impacted by going out-of-focus with mouse (as per OP's README) or with Alt-TAB/click switch to other application as attempts to trigger focus-lost messages from the Windows Message Loop. I do see noticeable differences between two modes of scrolling page content: mouse wheel and mouse click+drag over vertical scrollbar at right side. Same Message Loop, different WM_* messages.

Your pointers are very useful for any message pump related work.

On another note, as this happened just now in the early morning: I had more "inexplicable weirdness" happening to me when I executed an earlier very simple Chromely sample app from Microsoft Dev Studio 2019 via [F5] (exec in debugger): that one was first compiled and run back in April. Just recovered from that one, hence:

Extra process to test: nuke CEF et al and rebuild from scratch

@donotcodeit : you may want to try this on your problem machine / problem software combo too and see if anything changes behaviour.

How To Do:

Rationale

Chromely loads CEF binaries from their distro page using https://github.com/chromelyapps/cef-binaries-downloader and 'caches' them locally.

Once they're downloaded, it doesn't do so again as the app is now starting quickly and loading CEF from local cache.

I wanted to make sure that what I saw was not caused by CEF/Chromium bugs (rendering in my case this morning) that might have been resolved between then (first run of my app) and now. Hence 'fetch latest release of everything in the same timeframe' and I've found a total wipe is the quickest way to do this for complex/compound projects.

Might be worth a shot for you (@donotcodeit) as I don't see your mouse/behaviour problem on my machine, but difference there is that I started compiling and running the project only yesterday.

Best of luck!

xplicit commented 3 years ago

We could not reproduce this issue on Chromely 5.1 anymore.