Closed fergald closed 11 months ago
/cc @nicjansma @cliffcrocker @andydavies @philipwalton
Dealing with rejected calls to setData is tricky, especially if multiple calls are in flight; there is a full example in the explainer.

The call to setData does not block, so there may be multiple outstanding calls to setData. Now their catch code has to be coordinated so that only one replacement beacon is created and the latest data is set on the beacon (and setting that latest data will be async and subject to the same problems).
I still don't understand this. Are you thinking about different parts of the application accessing the same beacon simultaneously? If so, that sounds like an application problem rather than an API problem. PendingBeacon is a relatively precious resource, and if there are many more requests (say, 1M) than beacons, the application needs to deal with the problem by queueing/filtering/merging requests, for example.
Since the answer is long and detailed and might generate more long and detailed answers, I've replied to @yutakahirano in a new issue (#10). I'd like to leave this issue for user feedback on the API.
I've given this a fair amount of thought, and generally speaking I think there are two primary use cases for this API, and to answer the question posed in this issue, it's worth evaluating how well the options address each of these use cases:
(There may be other use cases I'm not thinking about, please respond if so.)
For the "saving user state" case, the goal would be for an app to be able to restore the user's last state the next time they visit. This is often done via client-side storage, but there could be cases where an app wants to preserve this state across devices or browsers for a given user.
For this use case, the "Replace data" high-level API works well, as you only ever want to send the most up-to-date user data; it never matters what the previous state is. Also in the rare case where the state data fails to be sent, it's only a minor inconvenience to the user, so it's well suited for this API.
That being said, the ability to overwrite data without ever having to check isPending and recreate the Beacon() is only slightly more ergonomic than doing that manually.
Most analytics providers operate using an event model, where consumers of their product can report events based on user interactions (or other triggers) and then the analytics code manages sending that data to their backend servers.
Because analytics providers tend to have lots of customers and those customers often have lots of users, the amount of data sent to their backends can be large, so anything that limits both the number of events sent as well as the size of those events can have a big impact.
For this use case, I do not think the "Replace data" high-level APIs work well because there's a tradeoff where you gain API ergonomic convenience at the expense of sometimes over-sending data or sending data more frequently than you would otherwise need.
Let me outline the scenario where that would happen:
This is not ideal as it means that the backend now has to add logic to dedupe these events, and if you're an analytics provider servicing millions or billions of user visits, that can be a ton of processing cost.
The "Append data" high-level API would work well for some analytics use cases (e.g. if the event data never changes), but for use cases like RUM analytics where a performance metric value does change as the user is interacting with the page, the "Append data" API is also not ideal because (again) you'd have to write logic on your backend to dedupe that data, which can be expensive.
The "Low-level" sync API handles both of these cases nicely because it can make the determination for itself whether to add new data or replace existing data, and if it needs to replace data it can use the isPending flag to determine how to replace that data to minimize what is sent.
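As a sketch of what that could look like (MockSyncBeacon below is invented; only the setData/isPending/sendNow shape comes from the proposal's low-level sync API), an analytics script might keep one metrics object and decide on each update whether to rewrite the still-pending payload or start a fresh beacon carrying only data that hasn't already been sent:

```typescript
// Invented in-memory stand-in; only the setData/isPending/sendNow
// shape is taken from the proposal's low-level sync API.
class MockSyncBeacon {
  private pending = true;
  payload = "";
  isPending(): boolean { return this.pending; }
  setData(data: string): void {
    if (!this.pending) throw new Error("beacon already sent");
    this.payload = data;
  }
  sendNow(): void { this.pending = false; }
}

// Analytics-style bookkeeping: keep every metric in one object and,
// on each update, either rewrite the still-pending payload in place
// or start a fresh beacon holding only data not yet sent.
let beacon = new MockSyncBeacon();
let metrics: Record<string, number> = {};

function reportMetric(name: string, value: number): void {
  if (!beacon.isPending()) {
    beacon = new MockSyncBeacon(); // previous payload already went out,
    metrics = {};                  // so don't resend it
  }
  metrics[name] = value;           // replace a stale value, keep the rest
  beacon.setData(JSON.stringify(metrics));
}
```

This is the dedupe-on-the-client behavior described above: the backend never receives the same metric value twice, because the client only resends what a sent beacon didn't carry.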
> This could reduce network traffic (although arguably connection reuse and header compression makes that a small benefit).
I think this argument is valid for an individual user (e.g. the difference will have little impact on them, their experience, or their data usage), but I don't think it's valid for an analytics provider who has to pay the network and processing cost of every beacon it receives.
Appreciate you soliciting feedback on this @fergald, and I agree with @philipwalton's assessment.
Thinking about this question from the perspective of a RUM analytics provider, the low-level sync API feels the most flexible and natural.
In our RUM processing pipeline, the browser (e.g. via our boomerang.js RUM library) will send 1-n RUM beacons to the backend, generally aligned with major "events" that occur from the user. We always want to capture the Page Load's data, but will also send beacons for subsequent in-page interactions, SPA soft navigations, significant errors on the page, etc. We aim to keep each of those beacons as small as possible and the beacon payload "localized" to the most recent event, meaning its payload has data for the current event backwards up to the most recent beacon. In other words, after a beacon goes out, we start with "fresh data" for the next beacon.
This allows our back-end processing pipeline to analyze individual beacons without needing to restore or save context from other possible (but maybe zero) beacons. The data that is duplicated on each beacon is general dimensional data (e.g. what the browser is, location, etc). Timers, Metrics, and Log-style data are generally limited to the most-recent-thing being measured.
As you and @philipwalton mention above, RUM can measure specific data points (that may still change over time), as well as events/logs (that can grow in number of entries over time). Some practical examples of both:
Data points (that may change over time)
Events/logs (that grow in entries over time)
For both of these, being able to replace the current pending beacon's data is critical for us.
The way I've envisioned boomerang.js using this API is along the lines of the proposed low-level sync API. We'd like to continue sending beacons on our existing schedule of major events (Page Load, SPA Soft Nav, etc) with the ability to still "queue" data (for the most recent event) in case the page unloads itself.
For example, on a classic MPA (non-SPA) site:

- At startup, we'd create the PendingBeacon() and our own JavaScript var beaconData with skeleton data so something gets sent even if an abandon happens (e.g. dimensional data plus Page URL)
- We'd set pageHideTimeout to ~1 minute or so
- onload, we may queue additional data like the Page Load Timers, FCP, etc via beacon.setData(beaconData); calls after updating the beaconData object
- As further data comes in, we'd keep making beacon.setData(beaconData); calls
- We'd probably limit how long data sits unsent with .sendNow() by, say, 5 minutes, so if the user keeps their browser open for 6 hours and doesn't interact with it, we can still give our customers a "real time" view of their visitors, logging those interactions within 5 minutes in a PendingBeacon payload
- At pagehide / BFCache restore, we'd probably flush the last beacon with .sendNow() and start a new PendingBeacon for whatever happens next

If you add a SPA site into the mix, on each soft navigation we'd .sendNow() to flush out the last event, re-skeletonize our var beaconData object, and create a new PendingBeacon that will track the SPA route data.

Given our use case, the low-level sync API seems the most natural.
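The flow above can be sketched end-to-end. Everything here is invented for illustration (MockPendingBeacon, the sent array standing in for the backend, the example data); only the setData/sendNow/pageHideTimeout names are taken from the proposal:

```typescript
// Invented stand-in for the proposed API, just enough to run the flow.
class MockPendingBeacon {
  pending = true;
  payload = "";
  constructor(public url: string, public options: { pageHideTimeout: number }) {}
  setData(data: string): void { if (this.pending) this.payload = data; }
  sendNow(): void { this.pending = false; }
}

const sent: string[] = []; // what "the backend" received
let beacon: MockPendingBeacon;
let beaconData: Record<string, unknown>;

function startBeacon(pageUrl: string): void {
  // Skeleton data so *something* goes out even on abandonment.
  beaconData = { url: pageUrl, browser: "example-ua" };
  beacon = new MockPendingBeacon("/beacon", { pageHideTimeout: 60_000 });
  beacon.setData(JSON.stringify(beaconData));
}

function queueData(extra: Record<string, unknown>): void {
  Object.assign(beaconData, extra); // e.g. Page Load Timers, FCP
  beacon.setData(JSON.stringify(beaconData));
}

function flushAndRestart(nextUrl: string): void {
  beacon.sendNow();      // SPA soft nav / BFCache restore
  sent.push(beacon.payload);
  startBeacon(nextUrl);  // re-skeletonized for the next event
}
```

Each flush "closes out" one event's beacon, so every payload stays localized to the most recent event, matching the pipeline described above.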
For the high-level APIs, I think ReplaceableBeacon would satisfy our needs? We would .replaceData(data) and .sendNow() as needed?
I have a couple questions, to make sure I understand though:

- It seems like a lot of the tradeoffs mentioned above by both Fergal and Philip are around needing to see the .isPending state, but I'm actually struggling to think of a case where we'd ever check .isPending. An isPending change would only happen "unexpectedly" (outside of a .sendNow()) after a pagehide, right? I think for our logic, we're always either forcing a send (due to a SPA navigation, restore, etc) or waiting for it to be sent after unload, so I don't think we'd need to check .isPending anywhere... is that right? In Philip's step 4 of coming back from a restore, I think we'd just .sendNow() and start fresh regardless.
- For AppendableBeacon, what does appendData(data) do? If data is an object, is it merging the properties of the last call to appendData(data) with the new properties of this data?

@nicjansma Thanks for the detailed feedback.
For your questions:

1. You might check isPending() after coming back from BFCache, to check whether the pageHideTimeout kicked in or not and whether you need to start a new beacon or just replace.
2. It would join the datas together somehow to create a single final data in some encoding (for a POST beacon maybe it could be multipart form encoding).

The thing that would make the replace/append API insufficient is if there are cases where you would replace only some of the payload with an update, e.g. you'd have a single beacon carrying Page Load Timers, LCP and CLS all in one, because then you would need to know whether something had already been sent; e.g. you don't want to send the PLT a second time if it's already been sent once. To use replace/append for that, you'd probably want to put them all on different beacons: PLT only being set once, but LCP and CLS would just keep getting updated, and now you have 3 beacons instead of one.
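A runnable sketch of that workaround (MockReplaceableBeacon is a guess at the high-level replace-only shape, mocked in memory; the metric values are made up): each metric gets its own beacon so that replacing an updating metric can never accidentally resend the send-once data.

```typescript
// Invented in-memory stand-in for a replace-only high-level beacon.
class MockReplaceableBeacon {
  payload = "";
  replaceData(data: string): void { this.payload = data; }
}

// Workaround: one beacon per metric, so replacing an updating metric
// (LCP, CLS) can never accidentally resend the send-once PLT data.
const pltBeacon = new MockReplaceableBeacon(); // written exactly once
const lcpBeacon = new MockReplaceableBeacon(); // rewritten as LCP grows
const clsBeacon = new MockReplaceableBeacon(); // rewritten as CLS grows

pltBeacon.replaceData(JSON.stringify({ plt: 2300 }));
lcpBeacon.replaceData(JSON.stringify({ lcp: 900 }));
lcpBeacon.replaceData(JSON.stringify({ lcp: 1400 })); // later candidate
clsBeacon.replaceData(JSON.stringify({ cls: 0.08 }));
// Three beacons now carry what one low-level beacon could have merged.
```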
@nicjansma

> It seems like a lot of the tradeoffs mentioned above by both Fergal and Philip are around needing to see the .isPending state, but I'm actually struggling to think of a case where we'd ever check .isPending.

It sounds like the main reason you don't see a use for isPending is that you plan to always send after a bfcache restore. However, if the API were to change from pageHideTimeout to a more general background timeout (as discussed here) then I imagine you would want to check isPending and not send the queued data if you didn't need to, correct?
@philipwalton apologies for the late reply, but that sounds correct!
The API shape has evolved into the fetchLater() API after #70 and https://github.com/whatwg/fetch/pull/1647. Closing this for now.
We are considering 3 different APIs and would appreciate feedback. I will include some pros and cons but in order for us to correctly weight these, please comment even if it has already been called out as a pro/con and it's important to you.
This issue focuses on how to set new data on the beacon and deal with a beacon that has already sent its data.
Low-level APIs
These are 2 versions of the API in the explainer. Sync vs async is about the API shape: a sync API does not mean that operations will block waiting for external events; rather, it means that the API does not use Promises and that state cannot spontaneously change mid-task.
Sync
We have setData and isPending, and if isPending returns true then the beacon has not been sent yet and setData will succeed. We could also remove isPending and have setData throw an exception, but that's not fundamentally different.

Pros
Cons
Async
We have only setData and no isPending. setData returns a Promise that will resolve if the beacon has not been sent yet and the data was successfully set. It will reject if the data could not be set.

The reason we drop isPending is because the result could be invalid by the time we try to act on it.

Pros
Cons
High-level API
There are two straightforward use cases for the beacon that suggest higher-level APIs. Both of these could be implemented using the lower-level API above. The real question is whether these 2 high-level APIs are enough, or do we need to expose the low-level API?
In both of these, there is no isPending or even a way to tell if data has been sent already.

Appending data
The beacon accumulates data and batches it up for sending. Policies like timeouts etc control how batching occurs (some data may be sent before the page is discarded). It guarantees (to the extent possible) that all data appended will eventually be sent.
The page never needs to check if the beacon has already sent some intermediate batch, it just keeps appending data.
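A small sketch of that contract, with an invented in-memory MockAppendableBeacon whose flush() stands in for whatever batching policy (timeouts, page discard) the browser would apply:

```typescript
// Invented in-memory stand-in: appendData() is the only mutation the
// page performs; flush() stands in for the browser's batching policy.
class MockAppendableBeacon {
  batch: string[] = [];
  readonly flushed: string[][] = [];
  appendData(event: string): void { this.batch.push(event); }
  flush(): void { this.flushed.push(this.batch); this.batch = []; }
}

const b = new MockAppendableBeacon();
b.appendData("click:buy-button");
b.appendData("error:img-404");
b.flush();                      // an intermediate batch goes out...
b.appendData("click:checkout"); // ...and the page just keeps appending
```

The page code never branches on whether an intermediate batch was sent; every appended event is guaranteed to go out in some batch eventually.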
Replacing data
The beacon's data is replaced by calls to setData. It doesn't matter whether the beacon has already sent data; it can always be replaced. Again, policies like timeouts etc control when sending occurs, with a guarantee that the last set value will definitely be sent.

Use case, e.g. reporting LCP values. The page just keeps setting the latest observed LCP, perhaps with a policy that says "don't leave data sitting around unsent for more than 5 minutes".
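The LCP use case can be sketched like so (MockReplaceBeacon and the candidate values are invented; only the always-succeeding setData semantics come from the proposal):

```typescript
// Invented stand-in for the replace-style beacon: setData() always
// succeeds, regardless of whether a payload already went out.
class MockReplaceBeacon {
  payload = "";
  setData(data: string): void { this.payload = data; }
}

const beacon = new MockReplaceBeacon();
for (const candidate of [640, 980, 1510]) { // LCP grows over the visit
  beacon.setData(JSON.stringify({ lcp: candidate }));
}
// Whatever is eventually sent carries only the latest observed value.
```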
Discussion
An example of where these APIs might not work well is where the page would like to merge 2 metrics into 1 beacon if possible. With the low-level API, it would check if the beacon has been sent already and if not, replace the data with the combined data. This could reduce network traffic (although arguably connection reuse and header compression makes that a small benefit). It could also reduce processing cost by delivering related data already joined.
It may be that these APIs are capable of doing everything that's needed but impose costs on the backend.
It may also be that there are use-cases that simply cannot be met with these APIs.
Please let us know.