Closed kainino0x closed 4 years ago
This should be in whatwg/html, I think, as I believe this is true in general. (not just for webgpu)
Definitely. This issue just exists as a discussion thread, but if we end up deciding a change is needed, then we can take it to html.
I agree that Chrome's interpretation seems more immediately useful for me, and honestly matches my expectations of what a sane application framework would do.
A better example is this:
async function frame() {
requestAnimationFrame(frame);
drawScene();
await Promise.resolve(); // no-op await
drawHUD();
}
requestAnimationFrame(frame);
The main point for me is that if you await a resolved promise in rAF, it should still present the results of drawHUD(). As it stands Chrome will return from frame() as soon as it hits the await
and miss presenting the results of drawHUD() even though nothing async really happened. Draining microtasks would fix this as it will then reach the end of drawHUD() before presenting.
As it stands, in Chrome you must not use any await
at any point anywhere in the entire call tree made from a rAF callback - potentially covering a large amount of complex code in a real framework - even if nothing async really happens, otherwise you miss presentation. This is quite a serious limitation since awaiting to pause the render loop is actually a useful feature.
I think it's reasonable to say that all rendering in the rAF callback must finish within the same task, rather than microtask. The important thing is not whether the thing you await resolves as a task or microtask; the important thing is that awaiting resolved promises does not cause it to miss presentation.
Actually, for Kai's initial case, isn't it actually worse than that? Let's assume that the promise is guaranteed to resolve between the current frame and the next frame. Since requestAnimationFrame was called before waiting on updateHUDState(), this would mean that drawHUD would be called twice the next frame. I assume the order is defined, but I'm not enough of a spec guy to know the details.
This is quite a serious limitation since awaiting to pause the render loop is actually a useful feature.
I'm curious how you would use this in practice. Because you don't have any application-level control over microtask ordering (you can't set priorities of things you await, basically), and you can't guarantee or even validate that an arbitrary Promise will complete in a microtask rather than a task, it feels a bit like playing with fire to me. Using await to pause the render loop in an attempt to switch to some other work feels weird, because you don't know what work you're switching to or when the control would come back to requestAnimationFrame.
Here's an example of what I mean:
async function frame()
{
drawScene();
await waitForPauseMenuToResume();
requestAnimationFrame(frame);
}
Normally when the game is running waitForPauseMenuToResume
will just exit and return a resolved promise. Then when you pause the game, it will pause the render loop and only once unpaused will it proceed and call rAF() again.
However as it stands in Chrome, with this code running normally and not paused, it misses presentation every frame solely because it ran an await
. Presenting at the task level solves this.
Actually, for Kai's initial case, isn't it actually worse than that? Let's assume that the promise is guaranteed to resolve between the current frame and the next frame. Since requestAnimationFrame was called before waiting on updateHUDState(), this would mean that drawHUD would be called twice the next frame. I assume the order is defined, but I'm not enough of a spec guy to know the details.
That's correct. If it was already resolved, then the microtask occurs right after the frame is presented, so it ends up before the next rAF callback. If it was not, then it won't happen until an arbitrary point in an arbitrary later frame (as a microtask of whatever task resolves the promise). That's still always between one frame's present and the next frame's rAF callback (the order never swaps).
Normally when the game is running
waitForPauseMenuToResume
will just exit and return a resolved promise. Then when you pause the game, it will pause the render loop and only once unpaused will it proceed and call rAF() again.
I think this case is already OK. The timeline should be (in chrome):
frame
: requestAnimationFrame(frame)
OK, but then this ordering is broken?
async function frame()
{
await waitForPauseMenuToResume();
drawScene();
requestAnimationFrame(frame);
}
It seems like a nasty gotcha.
Yes.
Assuming I'm following the large amount of unspoked-but-assumed code correctly ( ;) ), if the game was paused, then your pause menu would never be drawn, even if Chrome followed the existing spec.
It's hard because I'm sure you have more sensical use cases here and collapsing that down into a shorter snippet to communicate what you mean is tricky.
I was guessing in that example the pause menu is rendered as DOM, not part of the canvas.
There are lots of other reasons to await
in a rAF callback: going to storage to load a savegame, asynchronously loading textures for the next scene, posting to the DOM to use an API not in a Worker when using OffscreenCanvas, and so on. In all these cases it's the intent that these operations suspend the render loop.
We ran in to exactly this issue with our engine in Chrome, and now we have a big pile of unintuitive code to make sure no await
statements appear anywhere in the critical rAF call tree.
Some references to where this is spec'ed:
The HTML5 spec defines the order these things occur in. Step 10.11 is "run the animation frame callbacks." Within that, step 3.3 is "Invoke callback." Within that, step 11 is the actual javascript function call, and step 14.2 is "Clean up after running script." Within cleaning up after running script, step 3 is "perform a microtask checkpoint" which drains the microtask queue.
Yep, it's clear from my read as well that Chrome is out of spec, so paths forward are one of:
Firefox implements the spec here. Interestingly, it seems to used to be the other way around: https://github.com/whatwg/html/issues/2637
For example, this is why it's sort of annoying to implement requestPostAnimationFrame as in https://hackmd.io/lvtOckAtSrmIpZAwgtXptw#Use-requestPostAnimationFrame-not-requestAnimationFrame , rather than just enqueuing a microtask.
Interestingly, it seems to used to be the other way around: whatwg/html#2637
Interesting. Is there any discussion as to why this changed?
Why does it make the requestPostAnimationFrame
polyfill more difficult? setTimeout(task)
should enqueue a non-micro task (it will be after any already-enqueued tasks).
Would Chrome's behavior make it easier to get the timing 'right'* (so you can just Promise.resolve().then(task)
)?
Interestingly, it seems to used to be the other way around: whatwg/html#2637
I'm not sure, but I think this might be a little more nuanced - it's talking about the timing relative to relayout, not just present.
setTimeout's nesting rules make it unpalatable, because you might run afoul of them without realizing.
Chrome's non-spec behavior would make this particular functionality easier, at the expense of needing the painful anti-async changes that @AshleyScirra. As the spec is today, you can make the choice yourself between always-cleaned-up microtasks, and deferred full (non-micro) tasks.
setTimeout's nesting rules make it unpalatable, because you might run afoul of them without realizing.
Shouldn't it be okay in your rPAF polyfill, because setTimeout is always called from rAF, and therefore is not nested?
postMessage-as-setTimeout-zero has been an open secret for ten years: https://dbaron.org/log/20100309-faster-timeouts It's mentioned explicitly in the setTimeout docs on MDN: https://developer.mozilla.org/en-US/docs/Web/API/WindowOrWorkerGlobalScope/setTimeout#Reasons_for_delays_longer_than_specified
So yes, it's "probably fine" if indeed it's called from rAF, but I like reliable primitives. :)
I think that postMessage-as-setTimeout-zero needs to be fixed so it is throttled, and setTimeout-from-rAF needs to be well specced and implemented so it is not throttled.
There was also a proposal for window.setImmediate
which had similar behavior to postMessage, but guaranteed zero... I guess rPAF is that under a new name.
Is there a microtask drain with this proposed rPAF, or with any of the polyfills? If people switched to rPAF, would code work or not?
I think it's confusing enough and overlaps with enough other work inside WHATWG that it's worth an issue in whatwg/html.
rPAF isn't a setImmediate, it's a callback that happens "right" after the rAF-callback, present
sequence.
https://github.com/WICG/requestPostAnimationFrame/blob/master/explainer.md
I think that postMessage-as-setTimeout-zero needs to be fixed so it is throttled, and setTimeout-from-rAF needs to be well specced and implemented so it is not throttled.
You're welcome to file bugs, but the state of the world sort of works for me. FWIW dbaron is Mozilla's TAG rep iirc, though it's no guarantee they stand by the setZeroTimeout thing this many years on.
This may be going off on a tangent somewhat, but if we have queueMicrotask
, why not also add queueTask
? It would avoid the need for having to use timers or postMessage.
Except for the throttling issue, my understanding is that is what setTimeout
does. I think queueTask
would need the same throttling anyway.
Also question: does any of this apply to XRSession.requestAnimationFrame? All of it?
I don't know how that rAF is specced, but we should definitely make sure it has the same ordering between presentation and microtasks.
We got queueMicrotask
even though you can do Promise.resolve().then(...)
- I think there's a good argument that it illustrates the intent better, hence it may well make sense to have queueTask
as well.
Empirical evidence (i.e. me trying it just now) currently suggests that you cannot do the same rPAF trick with setTimeout on an XRSession.requestAnimationFrame. This probably needs to be brought up with whatwg/html then to ensure that we're consistent about timing.
@toji
Opening this as a new issue because I don't want to derail #506. cc @magcius
Originally posted by @litherum in https://github.com/gpuweb/gpuweb/pull/506#issuecomment-590640169
Last I checked, Chromium actually implements this the other way (i.e. incorrectly; I'm assuming your reading of the spec is correct). It does not drain the microtask queue between the rAF callback and the present.
I was deep in an email conversation on this topic several months ago with @AshleyScirra, which I irresponsibly dropped the ball on and haven't replied (sorry, Ash!). Ash, evidently per spec (sorry I didn't check before), your desired behavior is actually the correct one, and Chromium is buggy.
That said... I'm actually very skeptical that the current spec text is how it ought to be. Take some rAF callback like this: