Request Limit - Githubissues

mingyc commented 10 months ago

Open questions from last meeting:

How to limit: document-wide limit, tab-wide limit, limit per top-level origin, limit per reporting origin?
Total size limit: 64 KB for each reporting origin, with an overall quota of 640 KB for the entire tab?
Permissions policy: default on or off?

Some concerns:

Total limit should not be easily consumed/abused by a single origin
Origin-specific info should not be leaked via quota consumption
A single request can easily take up 50 KB (e.g. Boomerang)

Relevant Discussions: https://github.com/w3c/beacon/issues/38#issuecomment-1861006131

@noamr @annevk @nicjansma @yoavweiss

nicjansma commented 6 months ago

Agreed - the current proposal seems like the best compromise.

I'm just thinking through the developer ergonomics. I want to have simple code that can utilize fetchLater() without having to think too much. If I have to worry about my code encountering a QuotaExceeded during runtime (after starting out OK), it has to then deal with understanding what that means, and work around it (e.g. switch to sendBeacon() later instead or something). We could provide some sample code that shows how devs may want to deal with this scenario, for example.
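As a rough illustration of the kind of sample code mentioned above, here is a minimal sketch of a reporter that tries fetchLater() first and falls back to buffering for sendBeacon() when the quota is exhausted. The class name `DeferredReporter`, the injected function types, and the assumption that quota exhaustion surfaces as an error named "QuotaExceededError" are illustrative choices, not something settled in this thread.

```typescript
// Minimal stand-ins for the browser APIs, injected so the sketch can run anywhere.
type FetchLaterFn = (url: string, init: { method: string; body: string }) => { activated: boolean };
type SendBeaconFn = (url: string, body: string) => boolean;

// Assumption: running out of quota is reported as an error named "QuotaExceededError".
function isQuotaExceeded(err: unknown): boolean {
  return typeof err === "object" && err !== null &&
    (err as { name?: unknown }).name === "QuotaExceededError";
}

class DeferredReporter {
  private buffered: string[] = [];
  private url: string;
  private fetchLaterFn: FetchLaterFn | undefined;
  private sendBeaconFn: SendBeaconFn;

  constructor(url: string, fetchLaterFn: FetchLaterFn | undefined, sendBeaconFn: SendBeaconFn) {
    this.url = url;
    this.fetchLaterFn = fetchLaterFn;
    this.sendBeaconFn = sendBeaconFn;
  }

  // Try to defer the payload; if the quota is exhausted (or the API is
  // missing), keep it in memory for a later sendBeacon() attempt.
  queue(payload: string): "deferred" | "buffered" {
    if (this.fetchLaterFn) {
      try {
        this.fetchLaterFn(this.url, { method: "POST", body: payload });
        return "deferred";
      } catch (err) {
        if (!isQuotaExceeded(err)) throw err; // unrelated failures still propagate
      }
    }
    this.buffered.push(payload);
    return "buffered";
  }

  // Intended to be called from a pagehide or visibilitychange handler.
  // Returns how many buffered payloads sendBeacon() accepted.
  flush(): number {
    let sent = 0;
    for (const payload of this.buffered) {
      if (this.sendBeaconFn(this.url, payload)) sent++;
    }
    this.buffered = [];
    return sent;
  }
}
```

In a page, the second argument could be the global `fetchLater` when it exists (otherwise `undefined`), and the third could be `(u, b) => navigator.sendBeacon(u, b)`. The calling script then only has to handle the two return values of `queue()`.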

The other option I was thinking of: if a reporting origin ever registers a fetchLater(), it would be assigned a minimum of 64 KB that it could use. Then, as long as QuotaExceeded isn't encountered at the first fetchLater(), scripts wouldn't have to worry about it firing later down the line. But with a 640 KB overall quota, that's essentially a 10-reporting-origins limit instead.
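To make the arithmetic behind the "10 reporting origins" remark concrete, here is a small sketch of a reservation ledger. It assumes the 64 KB and 640 KB figures floated earlier in the thread and simplifies the "minimum 64 KB" idea to a fixed 64 KB slot per reporting origin; `ReservationLedger` and `tryAdd` are hypothetical names, not part of any spec.

```typescript
const KIB = 1024;
const PER_ORIGIN_RESERVATION = 64 * KIB; // assumed guarantee per reporting origin
const TOTAL_QUOTA = 640 * KIB;           // assumed overall quota

class ReservationLedger {
  private usedByOrigin = new Map<string, number>();

  // Returns true if the request fits, false where a quota error would be raised.
  tryAdd(reportingOrigin: string, bytes: number): boolean {
    const used = this.usedByOrigin.get(reportingOrigin);
    if (used === undefined) {
      // A new reporting origin needs a whole reservation of its own.
      if (bytes > PER_ORIGIN_RESERVATION) return false;
      if ((this.usedByOrigin.size + 1) * PER_ORIGIN_RESERVATION > TOTAL_QUOTA) return false;
      this.usedByOrigin.set(reportingOrigin, bytes);
      return true;
    }
    // A known origin can keep adding until its own slot is full.
    if (used + bytes > PER_ORIGIN_RESERVATION) return false;
    this.usedByOrigin.set(reportingOrigin, used + bytes);
    return true;
  }
}
```

Under these assumptions the only point at which a script can be turned away for lack of overall quota is its first call, which is the ergonomic benefit described above. The cost is that the eleventh reporting origin is always rejected, since 640 / 64 = 10.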

Yoav and I were discussing whether or not we could estimate how often 10+ reporting origins might happen in the wild. I was thinking I could look at our RUM ResourceTiming data for beacon initiators, but then there are a lot of scripts that still use fetch() with keepalive, or XHR, or IMG GETs. Still, I might try to see what I can estimate.

noamr commented 6 months ago

If we enforce a 10-origin limit, you could register early by issuing an empty (GET?) request and replacing it with a real payload when you have it. It's not even that wasteful. But I suggest that perhaps we can proceed with what we spec'ed here, and introduce this extra limitation if we see that this becomes a problem?
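A minimal sketch of the "register early, replace later" idea, assuming fetchLater() accepts an AbortSignal in its init and that aborting cancels the pending deferred request. `ReservedBeacon` and the injected `FetchLaterFn` type are hypothetical names used so the example can run outside a browser.

```typescript
type DeferredInit = { method?: string; body?: string; signal?: AbortSignal };
type FetchLaterFn = (url: string, init?: DeferredInit) => { readonly activated: boolean };

class ReservedBeacon {
  private controller: AbortController;
  private url: string;
  private fetchLaterFn: FetchLaterFn;

  constructor(url: string, fetchLaterFn: FetchLaterFn) {
    this.url = url;
    this.fetchLaterFn = fetchLaterFn;
    this.controller = new AbortController();
    // Placeholder: a body-less GET registered as early as possible, so that a
    // quota error (if any) shows up here rather than later in the page's life.
    this.fetchLaterFn(this.url, { method: "GET", signal: this.controller.signal });
  }

  // Swap the pending request for one carrying the real payload.
  replace(body: string): void {
    // Cancel the pending request first so its bytes are not counted twice.
    // Caveat: if the call below throws, nothing is left pending.
    this.controller.abort();
    this.controller = new AbortController();
    this.fetchLaterFn(this.url, { method: "POST", body, signal: this.controller.signal });
  }
}
```

Calling `replace()` repeatedly keeps exactly one pending request per reporter, so the amount of quota in use tracks the latest payload rather than the sum of all updates.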

mingyc commented 5 months ago

@yoavweiss We need more developer feedback before settling on this approach. Is it possible for you to bring this up in the next meeting?

nicjansma commented 5 months ago

We discussed this on the April 25th W3C WebPerfWG call. Meeting minutes are located here.

Summary:

A refresher of the issue was presented, and there were a few questions around how the quota is actually applied (e.g. per-document, per-origin, delegated to cross-origin frames, etc).
Some alternative approaches were discussed (e.g. not quota, but prioritizing smallest payloads at page exit), as well as the concerns around perverse incentives (e.g. race-to-be-first for the quota).
Another suggestion was to have a guaranteed fallback mechanism so that a "small" amount of data could always get out, e.g. a request with no body that just packs some data into the URL. Another was to use FIFO queues, but that has its own drawbacks.

No big decisions were made, but the request was for everyone interested to comment in this issue.
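To compare two of the alternatives from the summary, here is a sketch of how a fixed byte budget might be filled at page exit, either smallest payload first or in registration order. This is one possible reading of those suggestions, not something agreed on the call; `PendingRequest`, `fillBudget`, `smallestFirst`, and `fifo` are hypothetical names.

```typescript
interface PendingRequest {
  reportingOrigin: string;
  bytes: number;
}

// Walk the requests in the given order and keep each one that still fits.
function fillBudget(ordered: PendingRequest[], budgetBytes: number): PendingRequest[] {
  const chosen: PendingRequest[] = [];
  let used = 0;
  for (const request of ordered) {
    if (used + request.bytes > budgetBytes) continue; // does not fit, skip it
    used += request.bytes;
    chosen.push(request);
  }
  return chosen;
}

// Prioritize the smallest payloads, so many small beacons beat one large one.
function smallestFirst(pending: PendingRequest[], budgetBytes: number): PendingRequest[] {
  const sorted = pending.slice().sort((a, b) => a.bytes - b.bytes);
  return fillBudget(sorted, budgetBytes);
}

// Registration order: whoever called first gets served first.
function fifo(pending: PendingRequest[], budgetBytes: number): PendingRequest[] {
  return fillBudget(pending, budgetBytes);
}
```

With a 64 KiB budget and pending requests of 50 KiB, 10 KiB, and 8 KiB registered in that order, `smallestFirst` keeps the 8 KiB and 10 KiB requests and drops the 50 KiB one, while `fifo` keeps the 50 KiB and 10 KiB requests and drops the 8 KiB one. The second outcome is the race-to-be-first incentive raised as a concern.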

noamr commented 5 months ago

In the meantime, I updated the fetch PR so that the quota takes the URL and headers into account.
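For illustration, counting the URL and headers alongside the body could look roughly like the sketch below. The exact accounting in the fetch PR is not described in this thread, so the rule here (UTF-8 byte length of the URL, plus each header name and value, plus the body) is an assumption, and `deferredRequestSize` is a hypothetical helper.

```typescript
// UTF-8 byte length of a string.
function utf8Length(text: string): number {
  return new TextEncoder().encode(text).length;
}

// Assumed accounting: URL + header names + header values + body.
// Separators such as ": " and line breaks are deliberately not counted.
function deferredRequestSize(
  url: string,
  headers: { [headerName: string]: string },
  body: string
): number {
  let total = utf8Length(url);
  for (const headerName of Object.keys(headers)) {
    total += utf8Length(headerName) + utf8Length(headers[headerName]);
  }
  return total + utf8Length(body);
}
```

One consequence of counting the URL is that a request with no body that packs its data into the query string still consumes quota in proportion to the data it carries.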

mingyc commented 5 months ago

https://github.com/WICG/pending-beacon/issues/87#issuecomment-1985358609

"The default permission policy is anyway not something in the spec, but in most/all cases it should be self."

@arturjanc @yoavweiss @noamr We are concerned that setting the default deferred-fetch permissions policy to self may significantly reduce usage of this new API by ads/tracking library partners, as not many users know how to set that up properly for 3p iframes.

Is it possible to set this to * by default?
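For context on what "setting that up" would involve if the default were self, here is a sketch of the two strings an embedder would need: the value of the iframe `allow` attribute and the value of a `Permissions-Policy` response header. It assumes the feature is named deferred-fetch, as in the comment above, and follows general Permissions Policy syntax; the helper names and the `https://ads.example` origin are hypothetical.

```typescript
// Value for <iframe allow="...">: the feature name followed by allowed origins.
function iframeAllowValue(feature: string, allowedOrigins: string[]): string {
  return [feature].concat(allowedOrigins).join(" ");
}

// Value for the Permissions-Policy response header: self plus quoted origins.
function permissionsPolicyHeaderValue(feature: string, allowedOrigins: string[]): string {
  const allowlist = ["self"].concat(allowedOrigins.map((o) => '"' + o + '"'));
  return feature + "=(" + allowlist.join(" ") + ")";
}
```

For a third-party frame served from `https://ads.example`, these produce `deferred-fetch https://ads.example` for the attribute and `deferred-fetch=(self "https://ads.example")` for the header. With a default of *, neither step would be needed for the frame to use the API.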

noamr commented 5 months ago

WICG / pending-beacon

Request Limit #87