Early Design Review: Pending Beacon API

mingyc commented 1 year ago

Wotcher TAG!

I'm requesting a TAG review of Pending API.

This is a proposal for Pending Beacon API. It is a system for sending beacons when pages are discarded, that uses a stateful API rather than having developers explicitly send beacons themselves.

Explainer¹ (minimally containing user needs and example code): https://github.com/WICG/pending-beacon/blob/main/README.md
User research: currently only feedback from RUM providers
Security and Privacy self-review²: https://github.com/WICG/pending-beacon/blob/main/security-privacy-questionnaire.md
GitHub repo (if you prefer feedback filed there): https://github.com/WICG/pending-beacon
Primary contacts (and their relationship to the specification):
- Ming-Ying Chung (mingyc), Google - designing/implementing in Chrome
- Ian Clelland (clelland), Google - specifying in Chrome
Organization/project driving the design: Google (BFCache team)
External status/issue trackers for this feature (publicly visible, e.g. Chrome Status): https://chromestatus.com/feature/5690553554436096

Further details:

[✓] I have reviewed the TAG's Web Platform Design Principles
The group where the incubation/design work on this is being done (or is intended to be done in the future): WICG
The group where standardization of this work is intended to be done ("unknown" if not known): WHATWG
Existing major pieces of multi-stakeholder review or discussion of this design: https://w3c.github.io/web-performance/meetings/2022/2022-03-31/index.html
Major unresolved issues with or opposition to this design: https://github.com/WICG/pending-beacon/issues
This work is being funded by: Google

You should also know that...

[please tell us anything you think is relevant to this review]

We'd prefer the TAG provide feedback as (please delete all but the desired option):

💬 leave review feedback as a comment in this issue and @-notify @mingyc and @fergald

mingyc commented 1 year ago

Friendly ping on this request as it has been a while.

ylafon commented 1 year ago

Hi, @maxpassion and I reviewed this during our F2F and we have a few questions:

About the use of Get/PostPendingBeacon, why not using the method in a dictionary instead of having two instances? How about ensuring that a PostPB isn't sent twice (like if there is a crash). Issues about crash recovery, should it be done after a network and/or location change, as it can have privacy implications? after a time limit? Do having a pending beacon registered impact how bfcache work?

What is the point of forcing a PendingBeacon to be sent, if it is supposed to be sent at the end of the interaction, why not using fetch or a normal beacon in that case?

Why existing mechanism is not reliable and how this new proposed mechnisam solve the not reliable issue? Should the HTML unload event be fixed instead? Could this be solved at the Page Lifecycle level instead of having a new specification?

Thanks,

mingyc commented 1 year ago

About the use of Get/PostPendingBeacon, why not using the method in a dictionary instead of having two instances?

The proposal to have two different classes PendingGetBeacon & PendingPostBeacon is to help enforcing type safety. There was an explanation https://github.com/WICG/pending-beacon/issues/59#issuecomment-1325977557.

How about ensuring that a PostPB isn't sent twice (like if there is a crash).

A browser should ensure that a queued PendingBeacon can only be sent once. At the time of crashing, browser should know that whether a given PendingBeacon is sent or not. But it cannot guarantee the "send" action succeed. => This behavior needs to be addressed by maybe a retry mechanism.

Issues about crash recovery, should it be done after a network and/or location change, as it can have privacy implications? after a time limit?

Crash recovery is still under discussion. But we think that network status after a crash does not matter, as developers can already achieve similar effect using local DB, e.g. storing data manually and recover from indexDB after a crash.

Do having a pending beacon registered impact how bfcache work?

Nope. One of the goals of this proposal is to provide a reliable API that developers can use along with bfcache.

What is the point of forcing a PendingBeacon to be sent, if it is supposed to be sent at the end of the interaction, why not using fetch or a normal beacon in that case? Why existing mechanism is not reliable and how this new proposed mechnisam solve the not reliable issue?

Please see this section discussing about an alternative approach Extending Fetch API and this comment arguing that the goal of this proposal is beyond the scope of fetch() API.

Should the HTML unload event be fixed instead? Could this be solved at the Page Lifecycle level instead of having a new specification?

We don't think so. This section from the explainer describes that the page lifecycle events are not sufficient.

Please let us know if you have any other questions. Thanks for your time!

hober commented 1 year ago

torgo commented 1 year ago

One thing we noticed in reviewing the explainer is a lack of user needs. Could you spend a few minutes writing at the beginning how this benefits the user? This is part of our explainer explainer and is intended to get people thinking in terms of user benefit. Thanks! https://tag.w3.org/explainers/

ylafon commented 1 year ago

Looking again at the spec, the goal is to send data/state to a server using HTTP POST... and also GET?

mingyc commented 1 year ago

One thing we noticed in reviewing the explainer is a lack of user needs.

Our goal is to provide web developers with a reliable ‘beaconing’ API.

Looking again at the spec, the goal is to send data/state to a server using HTTP POST... and also GET?

Yes, as stated in the Motivation section, more accurately it is sending a bundle of data to a backend server, without expecting a particular response, ideally at the ‘end’ of a user’s visit to a page.

miketaylr commented 1 year ago

It's a bit hard to be sure, but after reading https://github.com/WICG/pending-beacon/blob/main/README.md#open-discussions and beyond, it seems that sending data/state is only supported for POST, and not GET. Is that correct?

mingyc commented 1 year ago

Ah sorry nope. Our initial intention is to support both POST & GET. But some ongoing discussions in https://github.com/WICG/pending-beacon/issues/72 suggests us to support a general fetch() request, which includes other HTTP methods.

hober commented 1 year ago

It seems like some major changes are in the works (see https://github.com/WICG/pending-beacon/issues/70 and related issues) so I'm not sure if there's much value in us reviewing what will likely soon drastically change.

fergald commented 1 year ago

The changes are all in the API (completely new API vs integrating with fetch). I think the behaviour and goals are unchanged and it would be helpful to review those if you have time.

hober commented 1 year ago

https://github.com/WICG/pending-beacon/issues/70#issuecomment-1525821663

npdoty commented 1 year ago

Just a +1 for including some explanation of how or why this would benefit users.

It seems like the implicit reasoning here (as with previous Beacon APIs) is that websites want to send potentially large amounts of data particularly when a user navigates away from a page. Doing this in the background perhaps makes current abusive behavior (attempts to block navigation altogether until the data is sent) less annoying. But as a side effect, it means that the network traffic happens when the user is doing something else. The page that collected and sent the traffic is more opaque to the user, and any slowdowns that happen won't be attributed back to the page causing them, so more pages may feel comfortable collecting and sending more data.

fergald commented 1 year ago

@npdoty

Just a +1 for including some explanation of how or why this would benefit users.

It seems like the implicit reasoning here (as with previous Beacon APIs) is that websites want to send potentially large amounts of data particularly when a user navigates away from a page. Doing this in the background perhaps makes current abusive behavior (attempts to block navigation altogether until the data is sent) less annoying. But as a side effect, it means that the network traffic happens when the user is doing something else. The page that collected and sent the traffic is more opaque to the user, and any slowdowns that happen won't be attributed back to the page causing them, so more pages may feel comfortable collecting and sending more data.

The criticisms apply to keep-alive fetch and somewhat to sendBeacon.

sendBeacon is explicltly specced not to contend for network or CPU with the next page-load, pending beacon should do something similar, although I don't think we have called that out in the explainer. It is also allowed to delay sending while the page is visible, to save network/battery.

They both impose a quota on how much data there can be outstanding. The quota is shared with the pending beacons too. Discussion is still ongoing on how exactly to make that work but this should be a neutral change in that regard.

keep-alive fetch is eager when called with the request starting immediately.

keep-alive fetch must read the entire response. We don't expect beacon responses to be large (since they cannot be accessed from JS) but if they are, there may be an occasional win there in not reading them.

Pending beacon allows updating the payload (most likely cancel and replace) this means they will send less data overall than sendBeacon.

Also, vs keep-alive fetch, knowing that the request is a beacon, not requiring a response means that browsers have flexibility on how/when to send these. I think we'd want to see it being a real problem before trying any of these but browsers could delay sending beacons for a few seconds if another page has just started loading or just send them rate-limited. I would expect that responses will be minimal since they cannot be accessed from JS but there may also be something to be gained not reading responses past the headers and closing the connection when the payload is confirmed accepted. The same argument is a positive for sendBeacon.

We should add this to the explainer. @mingyc

hober commented 1 year ago

https://github.com/whatwg/fetch/pull/1647

cynthia commented 12 months ago

We're assuming deferred fetch as proposed in https://github.com/whatwg/fetch/pull/1647 will supersede this, should this be closed? @mingyc @fergald

mingyc commented 12 months ago

In that case, should deferred fetch be reviewed here (or in another new issue)?

cynthia commented 11 months ago

For this case I might recommend a new issue.

mingyc commented 11 months ago

Filed a Specification Review https://github.com/w3ctag/design-reviews/issues/887 as I think we have passed the early design stage.

martinthomson commented 6 months ago

See #887 for the review of FetchLater, which is the successor of this.

w3ctag / design-reviews

Early Design Review: Pending Beacon API #776