whatwg / fetch

Fetch Standard
https://fetch.spec.whatwg.org/

keepalive: Do we need to restrict the number of requests at a time? #662

Closed: yutakahirano closed this issue 6 years ago

yutakahirano commented 6 years ago

We are planning to implement and ship the keepalive flag. At the intent-to-ship thread, some people want to restrict the number of requests at a time, as well as the total number of inflight bytes. Do you think it's reasonable?
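(For readers without the blink-dev context, a minimal sketch of the feature under discussion: the keepalive flag lets a fetch() outlive its originating document, much like navigator.sendBeacon(). The endpoint URL below is hypothetical.)

```js
// A keepalive fetch may complete even after the page that issued it is
// gone - which is exactly what raises the resource-accounting questions
// discussed in this thread.
addEventListener('unload', () => {
  fetch('https://analytics.example/collect', {
    method: 'POST',
    body: JSON.stringify({ event: 'page-exit' }),
    keepalive: true,
  });
  // The older, functionally similar API:
  navigator.sendBeacon('https://analytics.example/collect', 'event=page-exit');
});
```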

@yoavweiss @annevk @igrigorik

annevk commented 6 years ago

We could, I suppose. Did sendBeacon() originally have that?

Note that we already have a restriction on the total number of bytes.

yoavweiss commented 6 years ago

sendBeacon() doesn't have a limitation on the number of requests in spec (nor in implementations as far as I know).

annevk commented 6 years ago

I wonder if @davidben has suggestions for what the request limit should be.

I suppose we can also somewhat more aggressively send H2 RST_STREAM/H1 close connection once there is a response?

igrigorik commented 6 years ago

We are planning to implement and ship the keepalive flag. At the intent-to-ship thread, some people want to restrict the number of requests at a time, as well as the total number of inflight bytes.

To clarify, we already spec a 64KB limit for inflight bytes.
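(For illustration, and assuming the spec'd behavior of returning a network error once the quota is exceeded, the limit is observable from script; the endpoint is hypothetical:)

```js
// A keepalive request whose body would push the total inflight keepalive
// bytes over 64KB is turned into a network error, which surfaces as a
// rejected promise (a TypeError).
const body = new Uint8Array(65 * 1024); // just over the 64KB quota
fetch('https://example.com/collect', { method: 'POST', body, keepalive: true })
  .catch((err) => console.log('keepalive quota exceeded:', err));
```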

sendBeacon() doesn't have a limitation on the number of requests in spec (nor in implementations as far as I know).

Indeed. Nor, afaik, has this ever come up as a problem based on current sendBeacon deployments. FWIW, while it's tempting to be really aggressive here, it's also important to recognize that if we bend this too far, developers will simply revert to their existing tricks — sync XHRs, spinning in while loops in click handlers, and so on.
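(For illustration, one such trick - a synchronous XHR in an unload handler, which blocks the main thread until the data is sent; the endpoint is hypothetical:)

```js
// The workaround keepalive is meant to replace: a synchronous XHR keeps
// the page alive (and the UI frozen) until the beacon is delivered.
addEventListener('unload', () => {
  const xhr = new XMLHttpRequest();
  xhr.open('POST', 'https://analytics.example/collect', false); // false = synchronous
  xhr.send('event=page-exit');
});
```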

@yutakahirano @davidben what's the motivation for limiting the number of requests, in addition to the existing restrictions?

sleevi commented 6 years ago

@igrigorik I chimed in on the blink-dev@ thread, but one obvious implication is that these connections can keep resources tied up that the user cannot otherwise free (short of restarting the browser), and that consumption is not attributable to the site (instead, it just looks like memory is being leaked).

Much in the way we limit the number of connections per origin (6) or the total number of sockets (256), there's going to be an inherent limit. One of the risks/implications for sendBeacon() is that a slow sendBeacon() that doesn't time out can otherwise interfere with the user's ability to browse, by keeping a connection active and engaged. Closing the renderer no longer serves as a way to free those connections (by freeing the associated request contexts, which are otherwise terminated at the end of the renderer lifecycle).

annevk commented 6 years ago

To be clear, and as I tried to say in the thread, I'm fine with adding a limit. I was just raising the point that you might not have enforced one thus far. Is 64 requests reasonable? 16? Something that's not a power of 2?

sleevi commented 6 years ago

Right, I think the non-enforcement was an oversight we should try to correct. In the best/worst case, we would limit to 256 parallel requests - but I’m not sure if we’d limit total pending.

I think the limit may vary by device, and thus be implementation-defined with advisory (SHOULD) guidance. At some point, these requests are consuming resources that are not associated with a browsing context, so we need both overall limits and per-origin limits. Where do we think the upper bound of resource usage is? 1024KB seems a rather substantial amount of information to allow to be queued - but it depends on device and network, and allows quick consumption by a single origin/navigation context.

annevk commented 6 years ago

Note that they can stuff information in headers and URLs too, so you might be looking at much more than 1 mebibyte.

sleevi commented 6 years ago

Doesn’t that make any limit on sendBeacon() silly if it doesn’t account for that? I agree, though, that we should carefully consider that.

To clarify my concerns: implementations are limited in the strategies they can employ - they can enqueue in memory, thus growing memory usage, or they can write to disk, consuming disk space. In both cases, implementations will be bounded both by what is reasonable for their constraints and by what does not impair user experience or expectations.

However, I also acknowledge that any explicit quota creates issues both for interop and for visibility - interop is affected if developers cannot discover or detect the limits, while exposing them potentially reveals cross-origin information, such as the size of requests other origins are enqueuing. This applies to sendBeacon and keepalive, but I don't think it applies to sync XHR - which is unfortunate in many ways.

igrigorik commented 6 years ago

Ryan, thanks for the additional context — makes sense.

Right, I think the non-enforcement was an oversight we should try to correct. In the best/worst case, we would limit to 256 parallel requests - but I’m not sure if we’d limit total pending.

I think the limit may vary by device, and thus be implementation-defined with advisory (SHOULD) guidance

Seems like a good addition. My hunch is that we could get away with a lower limit. On that note, as an option, we could consider a higher hard ('must') limit, with a "UAs may lower this limit in appropriate circumstances" clause or some such?

annevk commented 6 years ago

I'd prefer to have a per-document/worker lower bound (or maybe per origin is better), if possible, so developers know what they can normally rely on.

davidben commented 6 years ago

Users don't observe workers, just tabs or sites, and documents may spawn lots of workers, so that suggests not giving all workers a separate limit.

yutakahirano commented 6 years ago

SharedWorker and ServiceWorker can be associated with multiple documents so they'd be tricky. By the way, ServiceWorker is much stronger than fetch API + keepalive. Do we have any similar limitations on it?
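(To make "much stronger" concrete: a service worker can extend its own lifetime to do arbitrary follow-up work via waitUntil(), not just finish one bounded request. A sketch, with a hypothetical endpoint:)

```js
// sw.js: the UA keeps this worker alive until the waitUntil() promise
// settles (up to an implementation-defined cap - see w3c/ServiceWorker#527).
self.addEventListener('fetch', (event) => {
  event.respondWith(fetch(event.request));
  event.waitUntil(
    fetch('https://analytics.example/collect', { method: 'POST', body: 'hit' })
  );
});
```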

sleevi commented 6 years ago

Part of it may be implementation specific, but the memory overhead of ServiceWorker is partially bounded by the out-of-process nature. I also thought we handled ServiceWorkers' (implementation-defined) limits in https://github.com/w3c/ServiceWorker/issues/527 ?

yutakahirano commented 6 years ago

A per-document/worker limit is avoidable, as @davidben says. A per-origin limit is avoidable by creating many third-party iframes (www1.example.com/, www2.example.com/, ...). A per-tab limit may be closer to the user's expectation, but it doesn't work well with SharedWorker / ServiceWorker. I can think of two policies that sound somewhat sane to me:

  1. Solely rely on a timeout. When a user closes a tab, all network resources for the tab are released after N seconds. There is no limit on the number of inflight keepalive requests.
  2. Have a per-tab limit (i.e., M keepalive requests per tab, including iframes / dedicated workers). keepalive requests are not supported in ServiceWorker and SharedWorker.

What do you think?
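(For concreteness, a rough sketch of the bookkeeping the two policies imply, as a user agent might implement it - not proposed spec text; the numbers, the tabId bookkeeping, and the request object's done promise and cancel() method are all made up:)

```js
const TIMEOUT_MS = 30000;  // policy 1: N seconds of grace after tab close
const MAX_PER_TAB = 16;    // policy 2: M keepalive requests per tab

const pendingByTab = new Map(); // tabId -> Set of inflight keepalive requests

function startKeepaliveRequest(tabId, request) {
  const pending = pendingByTab.get(tabId) ?? new Set();
  if (pending.size >= MAX_PER_TAB) {
    throw new Error('per-tab keepalive limit reached'); // policy 2
  }
  pending.add(request);
  pendingByTab.set(tabId, pending);
  request.done.then(() => pending.delete(request)); // free the slot on completion
}

function onTabClosed(tabId) {
  // Policy 1: give inflight requests a grace period, then cancel them all.
  setTimeout(() => {
    for (const request of pendingByTab.get(tabId) ?? []) request.cancel();
    pendingByTab.delete(tabId);
  }, TIMEOUT_MS);
}
```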

annevk commented 6 years ago

By per-tab, do you actually mean a unit of related browsing contexts? Otherwise window.open() / <a target=x> would be escapes (albeit somewhat guarded in most situations). It doesn't seem acceptable to not support them in service workers though. How would you pass such a request through otherwise?

yutakahirano commented 6 years ago

Then what do you think about a timeout? Users expect that all network resources for a tab will be released when it is closed. Is it reasonable to modify that expectation a bit - say, all network resources for a tab will be released once N seconds have passed since it was closed?

annevk commented 6 years ago

I think that's reasonable, given that we already embraced that for service workers.

yutakahirano commented 6 years ago

@igrigorik @davidben @sleevi What do you think about the idea? Do you think it's good to add timeout language to the spec instead of restricting the number of requests? As stated above, web developers can escape a restriction on the number of requests if they are determined, so a timeout is a simpler and no less solid solution, I think.

sleevi commented 6 years ago

@yutakahirano I don't think that resolves the interoperability concern. I do not believe a timeout alone would be sufficient in Chrome, for example - we would, and should, as a matter of both security (browser DoS) and stability (for the aforementioned reasons), impose limits on how much we will accept as pending. We may not choose to specify it here, but it will still be web observable, so I don't think a timeout-only solution addresses the fundamental interoperability concerns being raised here.

I do think a timeout helps somewhat with predictability, though; it is a good additional check, and it helps bound how far user expectations can be violated - those are all good things.

igrigorik commented 6 years ago

@sleevi @yutakahirano with your Chrome hats on, what values would you enforce (or propose to enforce) today, both for timeout and for pending?

sleevi commented 6 years ago

@igrigorik

So, with timeout, the primary concern I think is aligning with user expectations: the expectation that, by navigating away from a page (or closing it), any resources or effects it had are terminated, within some defined bounds. Users conditioned to close tabs to make browsing faster (by freeing resources) are an example of that. So I think the discussion of timeout is largely one of how predictable it is for users, and I suspect values there will be somewhat subjective, but let's throw out 15-30 seconds as a possible anchor for discussion (which ignores things like the size of the data versus the user's bandwidth, which could differ between mobile and desktop).

For pending, I think it's equal parts implementation concern and user expectations, which makes it trickier. Let's presume, for the sake of discussion, a Chrome-like multi-process model, and an implementation in which keepalive (and sendBeacon) fully live within the renderer process. In that approach, pending is not really an implementation concern per se (the renderer's memory is bounded, and in an OOM situation the renderer will be first to die - much like attempting to load an overly large SVG file). It is, however, a user-expectation concern: if the renderer process cannot be killed until after processing keepalive/sendBeacon requests, then zombie renderers (or service workers) can create negative user experiences - which is why the timeout exists, to reap them.

However, if an implementation strategy instead places keepalive and sendBeacon in the browser process, and allows the renderer to shut down, then pending is much more important - the ability for content to keep browser-process memory pegged becomes much more problematic, and it's an implementation concern to bound the ability of 'hostile' content to affect other navigations/interactions and overall stability. Further, the amount you can dedicate to this sort of processing will itself be bounded by the platform - 512KB might seem a drop in the bucket for a user with a 2GB desktop, but be unconscionably large on a 512MB device like an Android phone or Apple Watch.

I realize this is a mixture of constituencies - user expectations like David highlighted on blink-dev@ and implementation concerns - and developers fit right in the middle of those two, but it may help explain some of the tension.

yutakahirano commented 6 years ago

@igrigorik We (Chrome) already have a 30-second timeout. It's implemented in multiple places for various reasons, but see content::KeepAliveHandleFactory for example. We also have a restriction, of a sort, on the number of requests, in content::ResourceDispatcherHostImpl::HasSufficientResourcesForRequest.

As for the pending point in @sleevi's comment, it sounds like it is important regardless of the keepalive flag. Chrome already has some protection. I'm open to having such protection in the spec, and changing Chrome's implementation if needed, but I don't think it's keepalive specific. Does this make sense?

sleevi commented 6 years ago

@yutakahirano I'm not sure I understand why you suggest it's not keepalive specific. Can you think of another place in which an origin can continue to consume resources on the user's device, despite there being no navigation context to that domain, and without any ability for the user agent to intervene on the user's behalf?

annevk commented 6 years ago

@sleevi the user agent can always intervene, but service workers is a feature that meets those characteristics as mentioned upthread.

sleevi commented 6 years ago

@annevk Except Service Workers have an implementation-defined escape hatch (as noted in https://github.com/w3c/ServiceWorker/issues/527), whereas this provides no such escape hatch that I can see. The original suggestion on blink-dev had been to ensure such escape hatches existed in keepalive/sendBeacon, but from https://github.com/whatwg/fetch/issues/662#issuecomment-357708240 onwards the focus shifted to an interoperable limit, rather than an implementation-defined escape hatch for interventions.

annevk commented 6 years ago

I see. I think I'd be okay with leaving this implementation-defined for now if that helps (not sure if this is blocking on me at this point). I prefer interoperable limits, interoperable OOM even, but you can't have it all.

igrigorik commented 6 years ago

@sleevi thanks, I think we're on the same page with respect to motivation + tension.

@igrigorik We (Chrome) already have a 30-second timeout. It's implemented in multiple places for various reasons, but see content::KeepAliveHandleFactory for example. We also have a restriction, of a sort, on the number of requests, in content::ResourceDispatcherHostImpl::HasSufficientResourcesForRequest.

@sleevi given the above, what's the implementation delta (if any) between what you're asking for and what's implemented in Chrome? Or is the remaining AI here on the spec side, to document these mechanisms?

sleevi commented 6 years ago

Fairly large, based on the limited data shared so far. Those limits do not address the concerns raised. We are working to better understand how this feature was implemented in Chrome, given the apparent lack of mitigations for both starvation and DoS.

yutakahirano commented 6 years ago

Closing this bug as we decided to have a Chrome-specific limit.

annevk commented 6 years ago

At the very least the specification needs to be changed to allow for that, no?

yutakahirano commented 6 years ago

I think cancelling requests when the UA doesn't have sufficient resources is already allowed, right?

annevk commented 6 years ago

OOM sure, but this is canceling for tracking purposes. I think that's worth calling out on its own so implementers consider it.

sleevi commented 6 years ago

Can you indicate what you mean by canceling for tracking purposes?

The proposal here is to bound resources and cancel when those resources are exhausted.

annevk commented 6 years ago

Sorry, got things mixed up. Maybe this is fine as-is then. Let's close it for now.