BackburnerJS / backburner.js

A rewrite of the Ember.js run loop as a generic microlibrary
MIT License
392 stars 80 forks source link

drops later execution after sleep #353

Closed makepanic closed 5 years ago

makepanic commented 5 years ago

Hi, we're seeing that some later invocations aren't executed on schedule after a system comes back from sleep.

The system has a long running later (every ~15min = 900000ms). After the system comes back from sleep, it triggers (caused by a failing network request) a 4s later execution which sometimes isn't getting executed.

backburner-later-issue

To debug the issue, we start a regular setTimeout in parallel to the later execution. The setTimeout callback gets called 4s after executing it (see [global]).

The later execution looks like this:

https://github.com/BackburnerJS/backburner.js/blob/master/lib/index.ts#L646

https://github.com/BackburnerJS/backburner.js/blob/master/lib/index.ts#L654

From the looks of it, this means that the scheduled timer isn't registered, which means it won't execute on time. In the screenshot, we see the ember concurrency timeout never yields in the 4000 case (no [concurrency] executing later cb 4000 logged).

My guess is that it isn't scheduled and the next timer that triggers is the longer running one.

rwjblue commented 5 years ago

Thanks for reporting!

Can you share a specific step by step of the events that are occurring vs what you expect to be happening (I'm a tad confused about the order of operations, where the sleep is, etc)? If you can, it would also be helpful to see the backburner.getDebugInfo() at each step (this was a new API which landed in Ember 3.4.5).

It would also be helpful (for both problem isolation purposes and collaborative debugging) to try to make a reproduction in a more vanilla app.

makepanic commented 5 years ago

Sure i'll post the debugInfo once i can trigger the behavior again. Here's the app flow for this issue:

Given:

  1. app has a long running timer (interval = 15min) that checks for new version
  2. app does a sync using long-polling
while(true) {
    try {
        // do request
    } catch (e) {
        // handle error
        yield timeout(4000);
    }
}
  1. user closes the laptop lid / sends the system into sleep
  2. after a variable time, user opens the lid again
  3. browser had failed requests during sleep and opening the lid
  4. the returned timeout promise never resolves, as run.later never calls the callback

When placing a manual setTimeout near the run.later/timeout, it calls the callback as expected:

    ...
} catch (e) {
    // handle error
    setTimeout(() => console.log('setTimeout executed'), 4000);
    yield timeout(4000);
}

During console.log debugging, i found that the initial executeAt calculation points to the correct expected time of timeout execution. When looking at the code, it follow the branch flow where nothing gets scheduled.

My best guess is that nothing is scheduled as the new timer isn't the "first" one:

// we should be the new earliest timer if i == 0
if (i === 0) {
  this._reinstallTimerTimeout();
}

In the screenshot above, the new timeout got i = 6

rwjblue commented 5 years ago

My best guess is that nothing is scheduled as the new timer isn't the "first" one:

Sounds reasonable, but what happened to the first ones timer? FWIW, this sounds similar (or possibly the same) as what @kanongil reported in #354...

kanongil commented 5 years ago

This might be another case than #354, if the browser somehow dropped the timer (which would be a browser bug).

makepanic commented 5 years ago

This might be another case than #354, if the browser somehow dropped the timer (which would be a browser bug).

I don't think it's dropping the execution completely, it's backburner not scheduling a timer that triggers before the longer running one. I'm now trying to debug what the _timers list looks like when inserting the later timer after wakeup

makepanic commented 5 years ago

I think I now know what's the issue is.

browser behavior [1]

  1. upon sleep, the browser pauses timeouts
  2. upon awake, it resumes the timeout (without corrective early execution)

in context of my app

  1. on boot backburner schedules the long timeout (using executeAt)
timers = [(now + 15min)]
  1. on sleep, the timeout is paused causing executeAt to point to a wrong time
  2. after >15min, on resume
    • schedule 4s later
    • backburner determines new timer as the second one and won't register an early setTimeout
timers = [(now + 15min), (now + 15min + sleep + 4s)]
  1. after the resumed timer reaches 15min it'll work correctly again

[1] - i've only checked it in depth on linux chromium

example

screenshot from 2018-10-08 19-35-42

The first timer in the list points to a past executeAt = 19:15:05 when adding a new timer at 19:18:56.

A possible (cheap) way to correct this would be to look at the first item in the timers list and flush if executeAt points to the past.

kanongil commented 5 years ago

I just checked in Chrome and Safari for Mac, and they behave in the same way (pausing timeouts during system sleep). This is unfortunate.

I think the only reliable solution, is to reinstall the timer on each call to _later(). While this might look a bit weird on the already scheduled timers, it will ensure that new timers are handled properly.

kanongil commented 5 years ago

Your cheap solution would not work if it immediately goes to sleep, and wakes up slightly before the executeAt, and install the new timer. In that case it could still have to wait ~15 min to fire.