w3ctag / design-reviews

scheduler.postTask() API #338

Closed spanicker closed 4 years ago

spanicker commented 5 years ago

Hello TAG!

I'm requesting a TAG review of:

Further details (optional):

You should also know that... This is a very early proposal and we are seeking broad feedback.

We'd prefer the TAG provide feedback as (please select one):

dbaron commented 5 years ago

After a first read-through, this seems like useful stuff.

Some of this (particularly the comment in the explainer about misuse of Promise.prototype.then for things that shouldn't block the browser's event loop) got me thinking a little about the tie between syntax and the various types of timing options on the platform. As a result of adding new types of timing (e.g., microtasks) for things where that timing seemed like the right thing for the "canonical" use of a new API, we now have a set of syntaxes where people might choose between say, Promises and setTimeout based on API familiarity or usability, even when they have non-obvious differences in semantics that are important. It makes me wonder:

  • if we should have a more flexible relationship between the syntaxes and the timings, e.g., better ability to choose the timings when using a given syntax (at least for some of the important ones)

  • whether the builtin API for task queues proposed here makes that problem better or worse -- which I suspect may depend on how easy/hard it is to hook all the existing sorts of delayed work we have in the platform up to these task queues (in lieu of their current defaults).

slightlyoff commented 5 years ago

The link to the filled-out Self-Review Questionnaire on Security and Privacy doesn't seem to resolve. Can someone post it here?

cynthia commented 5 years ago

Taken up during the Tokyo F2F. @hober @travisleithead @kenchris @dbaron @cynthia discussed this at length.

Interesting problem! Thanks for bringing this to our attention. This is absolutely worth exploring.

That said, we do have concerns about the limited audience that will immediately benefit from this.

Additionally, we are a bit curious about what is supposed to go on this task queue, and how those tasks are expected to behave.

We think a solution in this space will need to account for how legacy scheduling APIs interact as not all authors will be expected to switch to the new model for scheduling (as long as the old model is still around).

We'd love to see this advance further, and would like to see some strawman proposals when you have any.

spanicker commented 5 years ago

people might choose between say, Promises and setTimeout based on API familiarity or usability, even when they have non-obvious differences in semantics that are important.

Yes, that is right. The introduction of async / await has compounded this by steering developers towards microtask timing, when they are really making a syntax choice without considering the semantics.

  • if we should have a more flexible relationship between the syntaxes and the timings, e.g., better ability to choose the timings when using a given syntax (at least for some of the important ones)

Would love to hear your ideas. I'd floated the idea of additional syntax for promises, but we also really should address async / await. Examples: Promise.resolve().thenYield(bar); OR myPromise.then(Priority.Default).then(value => { ... }); OR myPromise.thenAtPriorityDefault(), etc.

  • whether the builtin API for task queues proposed here makes that problem better or worse -- which I suspect may depend on how easy/hard it is to hook all the existing sorts of delayed work we have in the platform up to these task queues (in lieu of their current defaults).

the builtin API certainly aims to make this better with "semantic priority", however we have to really pick appropriate names here (work-in-progress): https://github.com/spanicker/main-thread-scheduling#1-immediate-priority https://github.com/spanicker/main-thread-scheduling#2-render-blocking-priority-render-immediate etc.

spanicker commented 5 years ago

The link to the filled-out Self-Review Questionnaire on Security and Privacy doesn't seem to resolve. Can someone post it here?

This is not a specification yet and there will be separate specifications and TAG requests for each API. Example: https://github.com/WICG/is-input-pending#privacy-and-security

That said, we do have concerns about the limited audience that will immediately benefit from this.

Could you elaborate on the concern about a limited audience? Why do you think it is limited? For example, if we can get a few popular frameworks (e.g. React, Vue) and widely used apps (Facebook, Maps) to adopt the APIs, that in turn results in relatively wide adoption, both in terms of the types of apps and Chromium (page visit) use counters.

Additionally, we are a bit curious about what is supposed to go on this task queue, and how those tasks are expected to behave.

I will link the explainer for this proposal when ready (soon!).

shaseley commented 5 years ago

We've made some progress on the API shape here, and are moving ahead with two APIs—scheduler.postTask() and scheduler.yield(). They are closely related (both part of window.scheduler, and tied together by priority), but are useful on their own and can ship independently.

The explainer has been updated: Overview Prioritized post task Yield and continuation

We also have a design doc which covers both APIs.
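
A minimal sketch of how yielding between chunks of work might look, assuming a `scheduler.yield()` that returns a promise (the shape that later shipped in Chromium-based browsers). `yieldToMain` and `processInChunks` are hypothetical helper names, and the `setTimeout` fallback is an assumption for environments without the API.

```javascript
// Hypothetical helper: use scheduler.yield() when present, else fall back to a timer.
function yieldToMain() {
  const s = globalThis.scheduler;
  if (s && typeof s.yield === 'function') return s.yield();
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Process items in small chunks, giving other work a chance to run between chunks.
async function processInChunks(items, handle, chunkSize = 2) {
  let yields = 0;
  for (let i = 0; i < items.length; i += chunkSize) {
    for (const item of items.slice(i, i + chunkSize)) handle(item);
    if (i + chunkSize < items.length) {
      await yieldToMain();
      yields += 1;
    }
  }
  return yields; // number of times control was handed back
}

const handled = [];
const chunkRun = processInChunks(['a', 'b', 'c', 'd', 'e'], (x) => handled.push(x));
chunkRun.then((yields) => console.log(handled.join(''), yields)); // abcde 2
```

The fallback loses any priority information, which is part of what the proposal aims to improve on.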

bzbarsky commented 5 years ago

The proposal assumes that everything at a given priority is in fact linearized into a single FIFO queue, and that the set of priorities is the fixed one in the proposal. This is at odds with the way the HTML spec defines independent unordered task sources and with browser ability to reorder those wrt each other, as far as I can tell.

shaseley commented 5 years ago

@bzbarsky Thanks for raising this concern.

Integrating the proposed prioritized task queues with the existing unprioritized task queues is certainly a challenge and is an open question at this point. It is not our goal to remove the ability for UAs to reorder independent task sources, but there is an open question around how prioritized task queues should fit in.

There are a few questions to consider here:

  1. How should tasks from prioritized task queues be ordered with respect to each other for task queues created by a specific Scheduler?

    This is all about establishing a priority system for the postTask API, which is a primary goal of the API. There are a lot of options here, many of which are discussed here. We could leave this up to browsers to decide, but this comes at the cost of less developer certainty.

  2. How should tasks from all prioritized task queues be ordered with respect to each other, regardless of which Scheduler creates them?

    This is about scoping (1). Should this be global? You bring up a good point that the current spec preserves order across documents that share an event loop, but I worry about things like high priority script on background pages or in off-screen frames, etc. Maybe throttling will suffice here?

  3. How should tasks from prioritized task queues and other task queues be ordered?

    As mentioned earlier, it isn’t our goal to limit the ability of UAs to prioritize the various task sources. And IMO one of the biggest open and most important questions of this API is how existing task sources and rendering integrate with respect to tasks in prioritized task queues. For example, do we guarantee that the highest priority (called immediate for now) is render-blocking? What about high priority with respect to timers, or other methods developers currently use to schedule tasks?

    Providing developers with some minimum guarantees would lead to more predictable behavior, which is a good thing. At the same time, it’s important to provide flexibility to allow browsers to experiment and optimize performance, and in this case, perhaps alleviate concerns around starvation caused by high priority work.

    Our current thinking here is that speccing the priority of certain task sources, specifically those that are alternative methods of scheduling (setTimeout/setInterval, potentially postMessage), would be very beneficial to both this API and developers. We would evaluate other task sources in response to developer concerns and feedback, with each one, if needed, resulting in a separate spec change proposal.

We’d be happy to hear any further thoughts you have on this.

dbaron commented 5 years ago

It feels like the main use case given in the overview for the prioritized post task API is about coordination within a single site -- although this does include third party context in nested browsing contexts.

I agree with @bzbarsky that it's valuable to allow browser flexibility to schedule differently across third-party boundaries; such boundaries are quite opaque to begin with and it feels like heavily constraining the relative priorities between origins could interfere with useful things that browsers could do to help users. Whereas within a single site, where there's already much closer interaction, it feels more like there's value in having things clearly-specified -- as the HTML spec currently does, since there's only a single task source. So specifying these priorities as being something meaningful within a single task source, but not across task sources, feels like a good approach to me.

And given the comments in the explainer it sounds like this is the approach currently being taken.

kenchris commented 5 years ago

Generally, I think it is great to see work done in this area, but I am also a bit worried that this will make the average developer's life more complicated.

Especially, it might be hard to understand what priority to use (high, low, default) so it is important that frameworks and libraries don't force people to make that decision when not needed. Also, today people use a ton of libraries and all of these might use this API internally and set their own prioritization that might conflict with what I am trying to do. So it needs to be very easy to see what a library is doing and change the priority. If every library ends up creating their own APIs for doing so, it might turn into a mess.

I also find the name "immediate" confusing as it doesn't run immediately, as it runs after microtasks.

Should it be possible to access something like the micro task queue using this API?

dbaron commented 5 years ago

Next steps:

bzbarsky commented 5 years ago

@dbaron There is not a single task source for a single "site". There are a bunch of different task sources, which per HTML spec are unordered with each other. For example, setTimeout, most DOM-related tasks, user-interaction-triggered tasks, postMessage are all different task sources. There's a bunch of other ones as well; those are just the ones I know off the top of my head.

My comments were explicitly about the single-site situation, not the different-site situation; in the different-site case there are no observable ordering guarantees between the two sites apart from ordering within communication channels like postMessage, so there is no problem there; I hadn't even thought about people reading it that way...

dbaron commented 5 years ago

Ah, ok -- I'm curious how much flexibility browsers really have for some of these things -- it feels like some of that might be pretty constrained by web compatibility.

bzbarsky commented 5 years ago

Some of it may be (and if it is, then the spec should change to reflect that), but some of it is definitely not. For example, Firefox made major changes to its setTimeout scheduling recently and it wasn't a problem. And last I checked Firefox and Safari had different scheduling behavior (from each other) for postMessage without running into web compat issues.

kenchris commented 5 years ago

With the .cancel(), would it make sense to use AbortController so that you could bulk cancel things? @domenic

domenic commented 5 years ago

Yes, I strongly recommend replacing the .cancel() with an AbortController-based design. In fact I think the design could be changed to just return promises, moving other manipulation methods to a subclass of the AbortController (similar to our plans for FetchController). This would allow eliminating the Task class altogether, which IMO would be a big simplification for the platform, as right now Task occupies much of the same design space as Promise.
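
A minimal sketch of the bulk-cancel idea, assuming a promise-returning design. `postTaskSketch` is a hypothetical stand-in built on `setTimeout`, not the proposed API; only `AbortController` and `Promise.allSettled` are real platform features here.

```javascript
// Hypothetical stand-in for a promise-returning postTask (scheduling via setTimeout).
function postTaskSketch(callback, { signal } = {}) {
  return new Promise((resolve, reject) => {
    if (signal && signal.aborted) {
      reject(new Error('aborted'));
      return;
    }
    // If the signal fires before the task runs, drop the task and reject.
    const onAbort = () => {
      clearTimeout(timer);
      reject(new Error('aborted'));
    };
    const timer = setTimeout(() => {
      if (signal) signal.removeEventListener('abort', onAbort);
      try {
        resolve(callback());
      } catch (err) {
        reject(err);
      }
    }, 0);
    if (signal) signal.addEventListener('abort', onAbort, { once: true });
  });
}

const bulkController = new AbortController();
const ranTasks = [];
const pendingTasks = [
  postTaskSketch(() => ranTasks.push('a'), { signal: bulkController.signal }),
  postTaskSketch(() => ranTasks.push('b'), { signal: bulkController.signal }),
];
bulkController.abort(); // one call cancels every task that shares the signal

const settled = Promise.allSettled(pendingTasks);
settled.then((results) => {
  console.log(results.map((r) => r.status).join(', '), ranTasks.length);
  // rejected, rejected 0
});
```

Because callers only hold promises and a signal, no separate Task object is needed to cancel work.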

shaseley commented 5 years ago

@kenchris

Generally, I think it is great to see work done in this area, but I am also a bit worried that this will make the average developer's life more complicated.

Especially, it might be hard to understand what priority to use (high, low, default) so it is important that frameworks and libraries don't force people to make that decision when not needed.

We're hoping to change these to more semantically meaningful names to clear up some of the confusion about which priority is appropriate. GCD has something similar now after starting with more generic names.

Also, in the current API shape priority is an optional argument that defaults to "default", which we'd expect to have similar behavior to other unprioritized methods devs currently have of scheduling script (though some of this depends on the other discussions here about integrating with other task sources).

Also, today people use a ton of libraries and all of these might use this API internally and set their own prioritization that might conflict with what I am trying to do. So it needs to be very easy to see what a library is doing and change the priority. If every library ends up creating their own APIs for doing so, it might turn into a mess.

This is on our radar as well, but it’s unclear if there’s something we can do through the postTask API and if it affects the API shape, or if we need a separate solution, such as some type of "priority sandboxing". It feels like the latter to me, especially given that controlling when 3P script runs is a problem today without the postTask API, but this still needs to be fleshed out.

I also find the name "immediate" confusing as it doesn't run immediately, as it runs after microtasks.

I agree that a more semantically meaningful name would be helpful here, which we plan to address.

Should it be possible to access something like the micro task queue using this API?

This might be a useful future addition. There is currently the queueMicrotask() API that does this, but adding a "microtask" priority could be beneficial in terms of creating a cohesive scheduling API.
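
For reference, a small sketch of the timing difference being discussed: a callback passed to `queueMicrotask()` runs before any later task, here represented by a `setTimeout` callback. The variable names are illustrative only.

```javascript
const order = [];

const done = new Promise((resolve) => {
  // A task: runs in a later turn of the event loop.
  setTimeout(() => {
    order.push('task');
    resolve(order);
  }, 0);
  // A microtask: runs as soon as the current script finishes, before the task.
  queueMicrotask(() => order.push('microtask'));
  order.push('sync');
});

done.then((result) => console.log(result.join(' -> ')));
// sync -> microtask -> task
```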

shaseley commented 5 years ago

@kenchris @domenic

Yes, I strongly recommend replacing the .cancel() with an AbortController-based design. In fact I think the design could be changed to just return promises, moving other manipulation methods to a subclass of the AbortController (similar to our plans for FetchController). This would allow eliminating the Task class altogether, which IMO would be a big simplification for the platform, as right now Task occupies much of the same design space as Promise.

Thanks for the suggestion. My fear is that by replacing useful, common scheduling abstractions like Task and TaskQueue with a strict controller/signal/promise design, the ergonomics will suffer and we’ll make the API for a complex area even more complex.

At the same time, being able to combine other signals with this API is powerful and we’re exploring alternate API shapes that support/use signals/controllers, and I’ve filed an issue in the explainer repo to track this. One powerful use case outside of abort is changing priority, if we support that for other async APIs like fetch().

@domenic do you have a reference for FetchController? I couldn’t find a proposal.

shaseley commented 5 years ago

@bzbarsky

Did you have any thoughts about specced prioritization between prioritized postTask tasks and other task sources for the "single-site case" ((3) in this comment)?

bzbarsky commented 5 years ago

My impression, and maybe I misunderstood, is that in the current proposal all existing task sources are lumped into the "default" prioritized task source, with a global FIFO order imposed on them.

If the intent is that this is not the case, and that the prioritized task sources are just adding new task sources, that are unordered wrt each other and with existing task sources, then that eliminates my concerns about the system being too rigid.

That said, it's also not clear to me whether "high priority" means "will run before anything that's low priority" or "will run with higher probability" or something else, or whether this will be up to browsers.

domenic commented 5 years ago

Thanks for the suggestion. My fear is that by replacing useful, common scheduling abstractions like Task and TaskQueue with a strict controller/signal/promise design, the ergonomics will suffer and we’ll make the API for a complex area even more complex.

This feels very similar to the arguments people made in the past to stick with legacy callback-based APIs, and resist moving to promises. I'm hoping to put together something similar to the promises guide soon, which plays a similar role in helping make it clear that the ecosystem has transitioned and that new specs should follow.

I will note that APIs that were designed during the transition period, and chose to stick with callbacks, eventually had to undergo a painful migration to promises. (Notably WebCrypto and WebRTC.)

@domenic do you have a reference for FetchController? I couldn’t find a proposal.

https://github.com/whatwg/fetch/issues/447#issuecomment-281731850 is what I could find, although it doesn't seem super up-to-date, e.g. I would expect there to be inheritance of FetchController from AbortController and FetchSignal from AbortSignal, instead of duplication of the relevant members.

shaseley commented 5 years ago

My impression, and maybe I misunderstood, is that in the current proposal all existing task sources are lumped into the "default" prioritized task source, with a global FIFO order imposed on them.

No, that's not the intention. From my previous reply: "It is not our goal to remove the ability for UAs to reorder independent task sources, but there is an open question around how prioritized task queues should fit in."

I also mentioned that we are also considering proposing to specify the priorities of certain task sources, so that it's clear how alternative methods of scheduling (e.g. setTimeout) fit into this model, or maybe at least specifying a maximum priority (e.g. setTimeout shouldn't be higher priority than default)? This is about giving developers some scheduling guarantees. Do you think this would be too rigid as well?

(I'll also update the explainer to clarify this.)

If the intent is that this is not the case, and that the prioritized task sources are just adding new task sources, that are unordered wrt each other and with existing task sources, then that eliminates my concerns about the system being too rigid.

That said, it's also not clear to me whether "high priority" means "will run before anything that's low priority" or "will run with higher probability" or something else, or whether this will be up to browsers.

Our current thinking is that the order that prioritized tasks run should be specified, rather than left up to browsers. We aren't absolutely certain what that behavior should be yet, but we have some initial thoughts.

The approach we’re experimenting with is that tasks are run in strict priority order, from highest to lowest. This obviously runs the risk of starving lower priority tasks, but task priorities can be dynamically changed by client code so that starvation-prevention can be implemented in userspace, for whatever fits application requirements (which varies, from what we’ve seen).

The appeal of this approach is that it is simple for developers to reason about, both from a correctness and performance standpoint, since (their) high priority work is guaranteed to run before low priority work.
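
A minimal userspace model of strict priority ordering, to make the guarantee concrete. `StrictPriorityQueue` is a hypothetical class, not the browser's implementation, and the priority names are the generic ones used earlier in this thread.

```javascript
// Hypothetical model: one FIFO list per priority, always drained highest first.
const PRIORITIES = ['high', 'default', 'low'];

class StrictPriorityQueue {
  constructor() {
    this.queues = new Map(PRIORITIES.map((p) => [p, []]));
  }

  post(callback, priority = 'default') {
    const queue = this.queues.get(priority);
    if (!queue) throw new RangeError(`unknown priority: ${priority}`);
    queue.push(callback);
  }

  // Remove and return the oldest callback from the highest non-empty priority.
  next() {
    for (const p of PRIORITIES) {
      const queue = this.queues.get(p);
      if (queue.length > 0) return queue.shift();
    }
    return undefined;
  }

  runAll() {
    let task;
    while ((task = this.next()) !== undefined) task();
  }
}

const runLog = [];
const strictQueue = new StrictPriorityQueue();
strictQueue.post(() => runLog.push('low-1'), 'low');
strictQueue.post(() => runLog.push('default-1'));
strictQueue.post(() => runLog.push('high-1'), 'high');
strictQueue.post(() => runLog.push('high-2'), 'high');
strictQueue.runAll();
console.log(runLog.join(', '));
// high-1, high-2, default-1, low-1
```

In this model a steady stream of high priority work would keep `low-1` waiting indefinitely, which is the starvation risk mentioned above.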

We could also add native platform-exposed support for starvation-prevention, like timeouts or aging. There's some concern, however, about deadline interchange accidentally starving actual high priority work and degrading the user experience.

If the motivation for wanting flexibility of ordering these prioritized task sources (and please correct me if I'm wrong) is to prevent starvation of lower priority work, I think there's a question of where starvation-prevention should be implemented: by browsers or by app developers? We've been leaning toward the developer side in this case since they better know their app requirements, but I think this is a difficult question.
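
One possible shape for starvation prevention by aging, sketched as a userspace model. `AgingQueue`, its `maxWait` parameter, and the promote-one-level rule are all assumptions chosen for illustration; the thread does not settle on a policy.

```javascript
const LEVELS = ['high', 'default', 'low']; // index 0 is the highest priority

// Hypothetical model: a task passed over maxWait times is promoted one level.
class AgingQueue {
  constructor(maxWait) {
    this.maxWait = maxWait;
    this.tasks = []; // entries: { name, level, waited }
  }

  post(name, priority = 'default') {
    const level = LEVELS.indexOf(priority);
    if (level === -1) throw new RangeError(`unknown priority: ${priority}`);
    this.tasks.push({ name, level, waited: 0 });
  }

  // Pick the oldest task at the highest level, then age everything left behind.
  next() {
    if (this.tasks.length === 0) return undefined;
    let best = 0;
    for (let i = 1; i < this.tasks.length; i++) {
      if (this.tasks[i].level < this.tasks[best].level) best = i;
    }
    const [picked] = this.tasks.splice(best, 1);
    for (const t of this.tasks) {
      t.waited += 1;
      if (t.waited >= this.maxWait && t.level > 0) {
        t.level -= 1;
        t.waited = 0;
      }
    }
    return picked.name;
  }
}

const ranOrder = [];
const agingQueue = new AgingQueue(2);
agingQueue.post('low-1', 'low');
for (let round = 1; round <= 5; round++) {
  agingQueue.post(`high-${round}`, 'high'); // high priority work keeps arriving
  ranOrder.push(agingQueue.next());
}
console.log(ranOrder.join(', '));
// high-1, high-2, high-3, high-4, low-1
```

Here `low-1` eventually runs, but it does so ahead of `high-5`, which illustrates the concern that automatic promotion can delay genuinely high priority work.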

bzbarsky commented 5 years ago

The approach we’re experimenting with is that tasks are run in strict priority order, from highest to lowest.

So just to make sure I understand the proposal:

  1. This applies to the tasks that userspace (i.e. web page) code has queued via the prioritized task system.
  2. It does not apply to specification-defined tasks (at least existing ones; future specs may want to use the prioritized task setup). These can be scheduled by the browser in any desired way around the explicitly-prioritized tasks, and the browser is responsible for preventing starvation.

If the motivation for wanting flexibility of ordering these prioritized task sources (and please correct me if I'm wrong) is to prevent starvation of lower priority work

Just to be clear, I think we need flexibility in the following:

  1. Ordering various existing task sources against each other. There is ongoing research in browsers on the best way to do that, and I don't think we're at a point where we can say what works best here and pin it down.
  2. Preventing starvation of browser-generated tasks of various sorts. Developers may not know what these all are and while some of them might be low-priority when initially queued that doesn't mean they're infinitely-starvable. Maybe this can be represented as priority changes, or maybe it gets represented by schedulers that prevent indefinite starvation, but fundamentally the client code doesn't have enough information to know whether starving specific non-client tasks is OK or not, and we shouldn't allow it to do that.

I don't have strong views about how starvation-prevention should work for userspace-generated prioritized tasks, but I would like to suggest that writing a userspace starvation-preventer is all fine and dandy for a Google or a Facebook that completely controls all the code on a page, but expecting every single consumer of the API to do it is a bit of a red flag for me, when the typical consumer of the API might just be someone who includes a few different libraries in whatever their website is, then ends up with those libraries all using this system independently and not playing well together. At that point there is no way for them to fix the problem, really, and I'm not sure the "they better know their app requirements" condition holds either.

shaseley commented 5 years ago

So just to make sure I understand the proposal:

  1. This applies to the tasks that userspace (i.e. web page) code has queued via the prioritized task system.
  2. It does not apply to specification-defined tasks (at least existing ones; future specs may want to use the prioritized task setup). These can be scheduled by the browser in any desired way around the explicitly-prioritized tasks, and the browser is responsible for preventing starvation.

Yes, with the addition of maybe proposing spec changes to a few existing task sources.

If the motivation for wanting flexibility of ordering these prioritized task sources (and please correct me if I'm wrong) is to prevent starvation of lower priority work

Just to be clear, I think we need flexibility in the following:

  1. Ordering various existing task sources against each other. There is ongoing research in browsers on the best way to do that, and I don't think we're at a point where we can say what works best here and pin it down.
  2. Preventing starvation of browser-generated tasks of various sorts. Developers may not know what these all are and while some of them might be low-priority when initially queued that doesn't mean they're infinitely-starvable. Maybe this can be represented as priority changes, or maybe it gets represented by schedulers that prevent indefinite starvation, but fundamentally the client code doesn't have enough information to know whether starving specific non-client tasks is OK or not, and we shouldn't allow it to do that.

Agreed, and I’d be excited to see what browsers find w.r.t. (1) :).

I don't have strong views about how starvation-prevention should work for userspace-generated prioritized tasks, but I would like to suggest that writing a userspace starvation-preventer is all fine and dandy for a Google or a Facebook that completely controls all the code on a page, but expecting every single consumer of the API to do it is a bit of a red flag for me, when the typical consumer of the API might just be someone who includes a few different libraries in whatever their website is, then ends up with those libraries all using this system independently and not playing well together. At that point there is no way for them to fix the problem, really, and I'm not sure the "they better know their app requirements" condition holds either.

We expect a small number of libraries (including React) would provide this functionality so not every consumer would need to implement starvation prevention. We've found that different apps have different requirements when it comes to starvation prevention, and want to get more real world experience before baking something into the platform. But, we agree that it will need to be addressed and it's an area that we'll continue to explore.

shaseley commented 4 years ago

Hi folks, an update on the postTask API proposal:

  1. We've overhauled the API shape to use the controller/signal pattern, and the explainer reflects the change. Thanks @domenic for your comments and help along the way, and to @kenchris as well for raising this here. The new API shape was presented at WebPerfWG in November (minutes here).

  2. The language around how we plan to integrate the priorities has been updated in the explainer as well, reflecting the discussions in this thread.

  3. FYI: we're planning to get this to origin trial very soon, and will be sending the I2E shortly.

kenchris commented 4 years ago

Hi there

What are your current priorities? postTask? Where can we (TAG) be most helpful?

Maybe it makes sense to have separate issues for each of these APIs so that we better know what to focus on and better can track the progress?

dbaron commented 4 years ago

Also worth noting that #415 has some discussion that's probably relevant to isInputPending.

shaseley commented 4 years ago

Hi there

What are your current priorities? postTask? Where can we (TAG) be most helpful?

Yes, postTask is our current priority, and a TAG review of postTask would be most helpful.

Maybe it makes sense to have separate issues for each of these APIs so that we better know what to focus on and better can track the progress?

Sorry for the confusion, yes separate issues sounds like the right approach. Should we repurpose this for postTask, or file a new issue? Much of the content of this thread is already related to postTask.

kenchris commented 4 years ago

What happens in this case?

const controller = new TaskController('user-blocking');
scheduler.postTask(doWork, { signal: controller.signal, priority: 'background' });

shaseley commented 4 years ago

What happens in this case?

const controller = new TaskController('user-blocking');
scheduler.postTask(doWork, { signal: controller.signal, priority: 'background' });

signal gets downcast to an AbortSignal, allowing the abort component of the signal to be propagated while supporting the ability to spawn different priority subtasks.

A use case we had in mind is if devs want to post independent low priority subtasks that should still be canceled with the parent task. One example of this might be logging or cleanup tasks.

Note: I'm working on a separate explainer/issue to explore implementing TaskSignal this way (as an extension of AbortSignal) vs. having/supporting separable signals (e.g. PrioritySignal), and potentially making TaskSignal a composite. Where I think things get complicated is if/when more signals get added to the platform and how they interact with the various APIs. I'm planning to loop TAG folks and others in on that as well once it's written.

kenchris commented 4 years ago

But what priority is it going to get?

shaseley commented 4 years ago

Oh sorry, 'background'.

kenchris commented 4 years ago

So the priority option wins over the priority from the TaskController. But I assume that if I change the priority of the TaskController then that wins? Could you add that example and explanation?

shaseley commented 4 years ago

Yes, the priority option acts as an override and wins over the priority from the signal, but the priority remains fixed at that priority, meaning controller.setPriority() won't change the priority for that task. In this example, that means the priority of that task will always be background. The invariant here is that specifying a fixed priority, whether or not a signal is provided, will cause that task to always remain at that priority. That's what I meant by saying the signal is downcast to an AbortSignal if a priority is provided.

This is mentioned in the design doc, but looks like it didn't make it into the explainer; I'll add this information and an example.

There are other options here, e.g. ignoring the priority option, throwing an error. The current approach seemed to enable additional use cases which is why we selected it, but we're open to suggestions/feedback.
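
The override rule can be modeled in a few lines. This is a hypothetical model, not the real API: `ModelTaskController` and `postTaskModel` are illustrative names, and the `'user-visible'` default priority is an assumption.

```javascript
// Hypothetical model of a controller whose signal carries priority and abort state.
class ModelTaskController {
  constructor(priority = 'user-visible') {
    this.signal = { priority, aborted: false };
  }
  setPriority(priority) {
    this.signal.priority = priority;
  }
  abort() {
    this.signal.aborted = true;
  }
}

// Returns a view of the posted task; an explicit priority option pins the priority.
function postTaskModel(callback, { signal, priority } = {}) {
  const pinnedPriority = priority; // undefined means "follow the signal"
  return {
    callback,
    get priority() {
      if (pinnedPriority !== undefined) return pinnedPriority;
      return signal ? signal.priority : 'user-visible';
    },
    // Abort is always taken from the signal, even when the priority is pinned.
    get aborted() {
      return signal ? signal.aborted : false;
    },
  };
}

const modelController = new ModelTaskController('user-blocking');
const followsSignal = postTaskModel(() => {}, { signal: modelController.signal });
const pinned = postTaskModel(() => {}, {
  signal: modelController.signal,
  priority: 'background',
});

modelController.setPriority('user-visible');
console.log(followsSignal.priority, pinned.priority); // user-visible background

modelController.abort();
console.log(followsSignal.aborted, pinned.aborted); // true true
```

The pinned task ignores `setPriority()` but is still cancelled with its parent, which matches the logging and cleanup use case described earlier.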

kenchris commented 4 years ago

Clarification looks good but I found a nit

It should be prioritychange, since I assume that TaskSignal is an EventTarget (it should be), so you can call addEventListener('prioritychange') in addition to .onprioritychange = ...
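
A sketch of why the lowercase event name matters, assuming the signal is an `EventTarget`. `ModelTaskSignal` and `changePriority` are hypothetical; only `EventTarget` and `Event` are real platform classes here.

```javascript
// Hypothetical model of a signal that is an EventTarget with an on* handler property.
class ModelTaskSignal extends EventTarget {
  #handler = null;

  constructor(priority) {
    super();
    this.priority = priority;
  }

  get onprioritychange() {
    return this.#handler;
  }

  // The handler property is backed by an ordinary listener for the same event name.
  set onprioritychange(handler) {
    if (this.#handler) this.removeEventListener('prioritychange', this.#handler);
    this.#handler = typeof handler === 'function' ? handler : null;
    if (this.#handler) this.addEventListener('prioritychange', this.#handler);
  }

  changePriority(priority) {
    if (priority === this.priority) return;
    this.priority = priority;
    this.dispatchEvent(new Event('prioritychange'));
  }
}

const modelSignal = new ModelTaskSignal('user-blocking');
const seen = [];
modelSignal.addEventListener('prioritychange', (e) => seen.push(`listener:${e.target.priority}`));
modelSignal.onprioritychange = (e) => seen.push(`handler:${e.target.priority}`);
modelSignal.changePriority('background');
console.log(seen.join(', '));
// listener:background, handler:background
```

With one lowercase event name, both registration styles observe the same change.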

cynthia commented 4 years ago

I think we're pretty happy with the direction this proposal is taking, so we propose to close this for now. If there is a formal spec that needs to be reviewed later on, we'd be happy to revisit this.