WICG / pending-beacon

A better beaconing API

limits on number and size of pending beacons #66

Closed fergald closed 9 months ago

fergald commented 1 year ago

In this issue @marcoscaceres brought up the issue of limits on the size and number of beacons that can be pending:

- The specification should include a size limit on the data to be transmitted (it's alluded to already, but something we should concretely work out).
- The specification should also impose a reasonable limit on the number of pending beacons, such as to 1 or 2.

Data limits

sendBeacon and keep-alive fetches are currently limited to a total of 64k bytes of outstanding data.

It seems like Pending Beacon should also interact with this limit. There is a distinction between in-flight and pending so we have a choice to make about whether to just add them all together or to treat pending and in-flight differently.

This also implies that we need a way to signal that a request to create a beacon has been rejected.

It also implies that setting a new payload could be rejected.
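To make the two rejection points concrete, here is a minimal sketch of one of the options above: pending and in-flight bytes are simply added together and checked against a single 64 KiB quota. `KeepaliveBudget` and all of its methods are hypothetical names made up for illustration. They are not part of Pending Beacon, fetch, or any browser, and returning `null`/`false` is just one possible way to signal a rejection.

```typescript
// Hypothetical model of a shared keep-alive byte budget; not a real browser API.
const KEEPALIVE_LIMIT_BYTES = 64 * 1024; // the 64 KiB quota discussed above

class KeepaliveBudget {
  private inFlightBytes = 0;
  private pendingBytes = 0;
  private sizes: { [id: number]: number } = {}; // pending beacon id -> payload size
  private nextId = 1;

  // Assumption: pending and in-flight bytes count against the same quota.
  used(): number {
    return this.inFlightBytes + this.pendingBytes;
  }

  // Returns a beacon id, or null when creating the beacon would breach the quota.
  createBeacon(payloadBytes: number): number | null {
    if (this.used() + payloadBytes > KEEPALIVE_LIMIT_BYTES) return null;
    const id = this.nextId++;
    this.sizes[id] = payloadBytes;
    this.pendingBytes += payloadBytes;
    return id;
  }

  // Replaces a pending payload; returns false (old payload kept) if the new one does not fit.
  setData(id: number, payloadBytes: number): boolean {
    const old = this.sizes[id];
    if (old === undefined) return false;
    if (this.used() - old + payloadBytes > KEEPALIVE_LIMIT_BYTES) return false;
    this.sizes[id] = payloadBytes;
    this.pendingBytes += payloadBytes - old;
    return true;
  }

  // Moves a pending beacon to in-flight; its bytes still count until complete().
  send(id: number): boolean {
    const size = this.sizes[id];
    if (size === undefined) return false;
    delete this.sizes[id];
    this.pendingBytes -= size;
    this.inFlightBytes += size;
    return true;
  }

  // Called when an in-flight request finishes and its bytes are released.
  complete(bytes: number): void {
    this.inFlightBytes = Math.max(0, this.inFlightBytes - bytes);
  }
}

const budget = new KeepaliveBudget();
const first = budget.createBeacon(40 * 1024);  // fits within the quota
const second = budget.createBeacon(30 * 1024); // rejected: 70 KiB would exceed 64 KiB
```

Treating pending and in-flight differently would amount to keeping two separate counters with separate limits instead of the single `used()` sum.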

Instance limits

Currently, I believe there are no specced limits on the number of outstanding keep-alive requests that a page may have.

Also, it will be difficult/impossible for a page to keep the number below some limit when beacons are likely to be created by 3rd party scripts (e.g. RUM providers, analytics, ads).

So while there may be a need to limit the amount of data that will suddenly appear on the network at end of page, I want to be careful about overly strict limits.

@marcoscaceres could you elaborate on the motivation for the limits you suggested?

General

Any limits we impose that make the API unattractive/difficult will block adoption and leave pages using existing APIs. Existing APIs force devs to send redundant data, e.g. on every pagehide or every time visibility becomes "hidden". Sending data gradually over the lifetime of the page will avoid running into the keep-alive in-flight limit but could result in sending far more data overall.

One strategy could be that a request to create a new beacon that would breach the limit, causes older beacons to be sent sooner than they otherwise would have, making space for the new one.
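A rough sketch of that strategy, under stated assumptions: `EvictingBeaconQueue`, `PendingEntry` and the `sendNow` callback are hypothetical names, and the limit here covers pending bytes only. Whether the beacons that get flushed early should then count against the in-flight quota is exactly the open question from the "Data limits" section, and this sketch does not model it.

```typescript
// Hypothetical illustration of "send older beacons early to make room"; not a real API.
interface PendingEntry {
  id: number;
  bytes: number;
}

class EvictingBeaconQueue {
  private entries: PendingEntry[] = []; // oldest first
  private nextId = 1;
  private limitBytes: number;
  private sendNow: (entry: PendingEntry) => void;

  constructor(limitBytes: number, sendNow: (entry: PendingEntry) => void) {
    this.limitBytes = limitBytes;
    this.sendNow = sendNow;
  }

  pendingBytes(): number {
    return this.entries.reduce((sum, e) => sum + e.bytes, 0);
  }

  // Returns the new beacon's id, or null if it could never fit even in an empty queue.
  add(bytes: number): number | null {
    if (bytes > this.limitBytes) return null;
    // Flush the oldest pending beacons until the new one fits.
    while (this.pendingBytes() + bytes > this.limitBytes) {
      const oldest = this.entries.shift();
      if (oldest === undefined) break;
      this.sendNow(oldest);
    }
    const entry: PendingEntry = { id: this.nextId++, bytes };
    this.entries.push(entry);
    return entry.id;
  }
}

// Usage: with a 100-byte limit, adding a third beacon forces the first one out early.
const flushedEarly: number[] = [];
const queue = new EvictingBeaconQueue(100, (e) => flushedEarly.push(e.id));
queue.add(60);
queue.add(30);
queue.add(50); // 60 + 30 + 50 exceeds 100, so the oldest beacon (id 1) is sent now
```

The trade-off is that creation never fails for a payload below the limit, at the cost of some beacons going out earlier than their owner asked for.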

marcoscaceres commented 1 year ago

> @marcoscaceres could you elaborate on the motivation for the limits you suggested?

even though each beacon is restricted to 64kb, one could just create a thousand pending beacon instances and chunk all the data, right?

the purpose of the limit is to provide the flexibility sites need, but not leave it open to abuse by every script on a page or iframe.

Beacons should only be used on very special occasions by privileged actors. Otherwise sites should just use fetch, no?

fergald commented 1 year ago

I might be misreading it, but section 8.10, about inflightKeepaliveBytes, says that the cumulative total for all outstanding keep-alive requests in the fetch group is 64kb.

> Beacons should only be used on very special occasions by privileged actors.

I don't see why this would be true. Beacons could be used by anyone who has data where they

Additionally, there is the facility to update the data continuously and know that only the final value will be sent. For example, for Core Web Vitals metrics like cumulative layout shift, a beacon would be created on page load and updated every time there is some layout shift; eventually, when the user navigates away (or some timeout expires), the last value is sent. Another example is impression data for ads (number of seconds on-screen).
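The "update continuously, send only the final value" pattern can be sketched as below. This is a plain model rather than the proposed API: `LastValueBeacon`, `setData`, `flush` and the `transport` callback are hypothetical names, and `flush()` stands in for whatever the browser would do on navigation or timeout. The running score is a simplified stand-in for a layout-shift metric.

```typescript
// Hypothetical model of a beacon whose payload can be overwritten until it is sent.
class LastValueBeacon<T> {
  private latest: T;
  private sent = false;
  private transport: (payload: T) => void;

  constructor(initial: T, transport: (payload: T) => void) {
    this.latest = initial;
    this.transport = transport;
  }

  // Overwrites the pending payload; returns false once the beacon has been sent.
  setData(value: T): boolean {
    if (this.sent) return false;
    this.latest = value;
    return true;
  }

  // Stand-in for "user navigates away or the timeout expires": sends at most once.
  flush(): void {
    if (this.sent) return;
    this.sent = true;
    this.transport(this.latest);
  }
}

// Usage: many updates over the page lifetime, a single request at the end.
const delivered: number[] = [];
const scoreBeacon = new LastValueBeacon<number>(0, (v) => delivered.push(v));
let runningScore = 0;
for (const shift of [0.125, 0.25, 0.5]) {
  runningScore += shift;
  scoreBeacon.setData(runningScore); // replaces the previous pending value
}
scoreBeacon.flush(); // only the final value, 0.875, reaches the transport
```

Compared with sending on every visibility change, only one payload ever leaves the page, which is the redundancy saving described above.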

yoavweiss commented 1 year ago

@marcoscaceres it may help to clarify what threat model for abuse you are concerned with.

marcoscaceres commented 1 year ago

Admittedly, I'm (at least for now) not the right person to be responding here as I haven't actually read that part of the fetch spec and have limited knowledge of Beacon 😬 I'm clearly making incorrect assumptions based on the internal feedback I used to put together the WebKit position - my various question marks above were sincere.

I'm meeting with Anne and Alex next week to discuss concerns and respond properly. One of us will be point on this in the coming week(s): hopefully with clarifications, detailed outline of any concerns, and a concrete proposal for moving forward.

mingyc commented 9 months ago

See the algorithm that handles totalScheduledDeferredBytesForOrigin in the Deferred fetching section of the fetchLater() spec.