WICG / pending-beacon

A better beaconing API

Fetch-based design #70

Open achristensen07 opened 1 year ago

achristensen07 commented 1 year ago

The design presented today was better than the get-and-post-only design previously proposed, but still not quite as good as I think it can be. I think we still need to think less about designing around the current use case and ask ourselves "what is the minimal addition to fetch that would allow me to do what I need to do?" and I think the answer is just some kind of signal to send now and two new values in the RequestInit dictionary: background timeout seconds and a boolean value saying this is deferred until unload. If you want to change the URL or content, you abort the fetch and start a new one.
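
A rough sketch of what that minimal addition could look like (the option names deferSend and backgroundTimeout below are illustrative placeholders, not a settled API):

// Hypothetical RequestInit additions; `deferSend` and `backgroundTimeout` are placeholder names.
let controller = new AbortController();

function queueBeacon(payload) {
  // To change the URL or content: abort the pending fetch and start a new one.
  controller.abort();
  controller = new AbortController();
  fetch('/collect', {
    method: 'POST',
    body: JSON.stringify(payload),
    signal: controller.signal,
    deferSend: true,        // deferred until document unload
    backgroundTimeout: 60,  // ...or after this many seconds in the background
  });
}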

mingyc commented 1 year ago

What you suggest is similar to an alternative approach we previously proposed: adding deferSend: true into RequestInit.

I think this is a question about what level of abstraction we should provide. One reason for the previous proposal (PendingBeacon) was to encapsulate best practices around sending beacons: the browser takes care of filling in the irrelevant details in Request, and also takes care of sending at document destruction. We also want to preserve this in the fetch-based design.

noamr commented 1 year ago

Yeah, I think the alternative approach works better, handling background timing and the rest of the semantics (including the visibility-related things) in userland. A deferSend option (or similar) that works only on document destruction, plus aborting/re-issuing when you want to update the contents or do something special on visibility/timeout, does the job.

We can provide a "best practices" pure JS wrapper that looks like the PendingRequest proposal if it's necessary, but I sense that this API will be used mainly by RUM providers/CMS/more advanced users at least at the start so perhaps that can wait.

yoavweiss commented 1 year ago

One more question to answer is: how can developers know that a pending fetch was or wasn't yet sent, when they have more data to potentially append to it?

What we could do is resolve the Fetch promise when the request is starting to be sent (rather than when a response arrives), and resolve it without a response. There may be better alternatives to get that signal.

Another question would be around the ergonomics of aborting a fetch() every time the data is updated. It may be interesting to explore code samples to that effect.

noamr commented 1 year ago

Ergonomics-wise I think it's fairly simple, but let's see what we come up with. What we do need, though, is some additional option for the backgrounded timeout for the bfcache case (as you lose your JS context at that point).

Is the response promise not enough for understanding if your beacon was sent? Isn't it actually better, as in, if there was a network issue or 500 error or what not you can still re-send it?

If we do need a resolve-on-request, I think this should have its own call (or sendBeacon arg) rather than fetch with an extra parameter because it changes the semantics of fetch's return value.

fergald commented 1 year ago

@yoavweiss

What we could do is resolve the Fetch promise when the request is starting to be sent (rather than when a response arrives), and resolve it without a response. There may be better alternatives to get that signal.

So something like

let hasBeenSent = false;
let abort = new AbortController();
fetch(...).then(() => { hasBeenSent = true; });

function updateData(data) {
  if (!hasBeenSent) {
    abort.abort();
  }
  hasBeenSent = false;
  abort = new AbortController();
  fetch(...).then(() => { hasBeenSent = true; });
}

We have to worry about the timing of microtasks. Thinking about Chrome's impl, as long as the browser's sending is blocked on a "go ahead" from the renderer (while it's still alive), we can post that microtask when the renderer gives that "go ahead". As long as it runs before any other JS, we're OK. If any other JS could sneak in before it runs, it would see hasBeenSent == false even though the beacon has actually been sent.

I'm not familiar enough with JS microtask queueing to know what happens if e.g. we have a queued JS task and then we post a microtask before that task runs. Does the microtask run immediately or will it run at the end of the already queued JS task?

noamr commented 1 year ago

Imagining userland code that uses it would look something like this:

class UserlandPendingBeacon {
  #abortController = null;

  // Including the {defer} structure, timeouts etc
  #requestInfo = null;

  // Let's imagine it's a list of plain objects.
  #pendingEntries = [];

  constructor(requestInfo, initialEntries) {
    this.#requestInfo = requestInfo;
    this.#pendingEntries = initialEntries;
    this.#send();
  }

  async #send() {
    if (this.#abortController) {
      this.#abortController.abort();
    }

    this.#abortController = new AbortController();
    const sentSignal = new SentSignal();
    fetch(this.#requestInfo, {signal: this.#abortController.signal, sentSignal, body: JSON.stringify(this.#pendingEntries)});
    sentSignal.addEventListener("sent", () => {
      this.#pendingEntries = [];
      this.#abortController = null;
    });
  }

  // This will abort an ongoing fetch, and will schedule a new deferred fetch
  append(entry) { 
    this.#pendingEntries.push(entry);
    this.#send();
  }
}
noamr commented 1 year ago

I'm not familiar enough with JS microtask queueing to know what happens if e.g. we have a queued JS task and then we post a microtask before that task runs. Does the microtask run immediately or will it run at the end of the already queued JS task?

Posted microtasks always run before queued tasks.
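
For illustration, a microtask queued from the currently running task runs before any task that was already sitting in the task queue:

setTimeout(() => console.log('task'), 0);       // a previously queued task
queueMicrotask(() => console.log('microtask')); // queued after the task above
console.log('sync');
// Logs: "sync", "microtask", "task" — the microtask runs at the end of the
// current task, ahead of the already-queued setTimeout task.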

yoavweiss commented 1 year ago

Is the response promise not enough for understanding if your beacon was sent? Isn't it actually better, as in, if there was a network issue or 500 error or what not you can still re-send it?

Waiting for the response promise to resolve would be racy, and can lead to wasted client and server resources.

A few constraints that the original design had in mind:

If we waited for the response, we would run the risk of aborting an in-flight request, either wasting resources if it was terminated halfway through, or reporting the same data twice if we aborted after the response was sent from the server.

If we do need a resolve-on-request, I think this should have its own call (or sendBeacon arg) rather than fetch with an extra parameter because it changes the semantics of fetch's return value.

I think this was one of the considerations that led to the PendingBeacon design. An alternative is to have the fetch() Promise never resolve, and have a separate signal for "this request started sending, so too late to update it".

noamr commented 1 year ago

I think this was one of the considerations that led to the PendingBeacon design. An alternative is to have the fetch() Promise never resolve, and have a separate signal for "this request started sending, so too late to update it".

... or it can still resolve if there was a response. Why not? See https://github.com/WICG/pending-beacon/issues/70#issuecomment-1473367497 for a sample implementation with a "sent" signal.

yoavweiss commented 1 year ago

See #70 (comment) for a sample implementation with a "sent" signal.

Nice! (although, if we're using an event there may be a task queueing race? If so, being able to sync check SentSignal would solve that, I think)

noamr commented 1 year ago

See #70 (comment) for a sample implementation with a "sent" signal.

Nice! (although, if we're using an event there may be a task queueing race? If so, being able to sync check SentSignal would solve that, I think)

Event doesn't imply anything about tasks. But also signal.sent is fine

yoavweiss commented 1 year ago

Event doesn't imply anything about tasks.

That's fair.

fergald commented 1 year ago

I'm not familiar enough with JS microtask queueing to know what happens if e.g. we have a queued JS task and then we post a microtask before that task runs. Does the microtask run immediately or will it run at the end of the already queued JS task?

Posted microtasks always run before queued tasks.

Microtasks posted from a JS task will run before other queued JS tasks, but sending of a beacon in Chrome is an IPC from the browser to the renderer to say "hey I want to send this, if it's not already cancelled, mark it sent and tell me I can go ahead". So I guess as long as that posts an empty JS task that then runs this microtask, and as long as that whole thing happens before any other JS tasks that happened to be enqueued already, we're OK. I don't know if this is simply an implementation problem or if it is impossible spec-wise to say "here's a microtask, it must jump the queue on anything else that's already queued". I don't think the spec has any concept of non-JS tasks, so there is no JS task for this microtask to run after.

noamr commented 1 year ago

You can't "queue a microtask" via IPC. It would have to be a regular task that resolves a promise. If it's the browser's decision when to send the beacon in practice, I think sentSignal would always be somewhat racy.

fergald commented 1 year ago

OK, so then implementation-wise, in that IPC we would want to synchronously run an empty JS task with this microtask. This seems doable at least.

mingyc commented 1 year ago

Offline discussion today:

PendingBeacon Size Limit

The Fetch spec enforces a 64KB size limit on keepalive requests per fetch group, and PendingBeacon is backed by a keepalive fetch under the hood.

Scenario 1: if the user makes a PendingBeacon request of size X (>64KB), it should fail with a network error. The user should switch to a non-keepalive fetch instead.

Scenario 2: if the user makes a PendingBeacon request of size X (<=64KB), it should be queued to send. If, after queuing, the total exceeds the 64KB limit, the browser flushes the queue earlier than page destruction.

Could enabling the BackgroundFetch permission allow bypassing this limit?
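
A userland sketch of the scenario 1 fallback, assuming the over-quota keepalive request rejects with a network error (as keepalive fetches do today):

async function sendBeaconData(url, body) {
  try {
    // Assumption: a keepalive/pending request whose body exceeds the 64KB
    // quota is rejected with a network error (TypeError).
    await fetch(url, { method: 'POST', body, keepalive: true });
  } catch (e) {
    // Fall back to a regular, non-keepalive fetch for large payloads.
    await fetch(url, { method: 'POST', body });
  }
}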

SendSignal

Proposed as a boolean pending on PendingBeacon.

The discussions were around whether to bundle both AbortSignal and SendSignal together, or let users handle them by themselves.

To prevent race conditions, the renderer should be authoritative over the request's send state while it's alive. (Also discussed before for PendingBeacon.)

fetch Promise

Need more user input.

To maintain the same semantics, the browser should resolve the Promise when the request is sent. But that may create misleading expectations about the Promise for this usage: in reality, a PendingBeacon's Promise may never resolve.

backgroundTimeout

Need more user input.

Lifetime of PendingBeacon

Previously discussed in https://github.com/WICG/pending-beacon/issues/30 that a request can be kept if another same partition page is open.

noamr commented 1 year ago

Summarizing an offline conversation with @yoavweiss from today: Perhaps instead of sentSignal, we could have a quota of X bytes per Y fetch origins, e.g. 16kb * 4 origins or so. If we do that, and an origin exceeds its own quota, it would get a network error (like in keepalive). This would eliminate the complex issue of deferred beacons sometimes being sent while the environment is still alive. With origin-based quotas, the beacon is always sent after the environment is gone, and updating/cancelling it is up to userland.

In this case, all that's needed in fetch is the ability to defer, and some rules (e.g. backgroundedTimeout). WDYT?

yoavweiss commented 1 year ago

Perhaps instead of sentSignal, we could have a quota of X bytes per Y fetch origins, e.g. 16kb * 4 origins or so.

Just noting that we didn't discuss specific values. IMO, those values are too low for practical use. They also make me wonder RE building in compression here. Could we pass CompressionStreams here as the request body? If so, would the quota be post-compression?
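
For reference, something along these lines is already expressible with streaming request bodies (a sketch only; Chromium currently requires duplex: 'half' and an HTTP/2+ connection for streaming uploads, and whether quota would be measured post-compression is the open question here):

const payload = JSON.stringify({ events: [] });  // example payload
const compressed = new Blob([payload]).stream()
  .pipeThrough(new CompressionStream('gzip'));

fetch('/collect', {
  method: 'POST',
  body: compressed,
  duplex: 'half',  // required for ReadableStream bodies in Chromium
  headers: { 'Content-Encoding': 'gzip' },
});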

noamr commented 1 year ago

Perhaps instead of sentSignal, we could have a quota of X bytes per Y fetch origins, e.g. 16kb * 4 origins or so.

Just noting that we didn't discuss specific values. IMO, those values are too low for practical use. They also make me wonder RE building in compression here. Could we pass CompressionStreams here as the request body? If so, would the quota be post-compression?

Re compression streams, I don't see why not. What would be practical values here?

yoavweiss commented 1 year ago

What would be practical values here?

It might be interesting to ask @nicjansma for real-life byte sizes of e.g. beaconed ResourceTiming entries. I would imagine a 64K limit per origin to be more reasonable.

noamr commented 1 year ago

AFAIC those constraints can also be implementation-specific, with perhaps a 64kb-per-origin as a max.

yoavweiss commented 1 year ago

We could also count how many different origins use sendBeacon or fetch keepalive, and use that data to inform the number of origins that can do that.

AFAIC those constraints can also be implementation-specific, with perhaps a 64kb-per-origin as a max.

If we're making them implementation specific, why do we need to specify a max?

noamr commented 1 year ago

We could also count how many different origins use sendBeacon or fetch keepalive, and use that data to inform the number of origins that can do that.

All of these options sound reasonable to me.

AFAIC those constraints can also be implementation-specific, with perhaps a 64kb-per-origin as a max.

If we're making them implementation specific, why do we need to specify a max?

Not strictly necessary, but to have some consistency.

noamr commented 1 year ago

OK, since knowing the state is also necessary after restoring from bfcache, we still need something like a sentSignal. However, I wonder if it could be a simple state getter rather than a promise. The idea would be to have a stateful class DeferredFetch that holds the defer-specific params, and also a getter that lets you know whether the deferred fetch was sent.

A userland stateful beacon would look like this:

class UserlandPendingBeacon {
  #abortController = null;
  #deferredFetch = null;

  // Let's imagine it's a list of plain objects.
  #pendingEntries = [];

  // Can also save headers or whatever other fetch params.
  #url;
  #deferParams;

  constructor(url, deferParams) {
    this.#url = url;
    this.#deferParams = deferParams;
    this.#pendingEntries = [];
  }

  // This will abort an ongoing fetch, and will schedule a new deferred fetch
  async send(entries) { 
    if (this.#deferredFetch?.state !== "pending") {
      // Previous entries were already sent; we don't have to resend them.
      this.#pendingEntries = [];
    } else if (this.#abortController) {
      this.#abortController.abort();
    }

    this.#pendingEntries = [...this.#pendingEntries, ...entries];

    this.#abortController = new AbortController();
    this.#deferredFetch = new DeferredFetch(this.#deferParams);
    fetch(this.#url, {signal: this.#abortController.signal, defer: this.#deferredFetch, body: JSON.stringify(this.#pendingEntries)});
  }
}
mingyc commented 1 year ago

Thanks for your updates!

One comment about the state getter, i.e. deferredFetch.state: we previously discussed whether to support a general state property for PendingBeacon, and the conclusion was that a general-purpose "state" getter is problematic in terms of extensibility - once it is shipped, it will be difficult to add a new state without breaking existing applications. Plus, the only thing we really need to support is telling whether the beacon is updatable, i.e. whether it's pending, which leads to PendingBeacon.pending.

Given that, can we conclude that the new fetch option is

interface DeferredFetch {
  readonly attribute boolean pending;
  attribute long backgroundTimeout;  // or a lousier name, `sendAfterBeingBackgroundedTimeout`? #73
}
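
A hedged usage sketch of that shape, assuming a defer fetch option that accepts the DeferredFetch object (names taken from the sketches above, not a final API):

const deferred = new DeferredFetch({ backgroundTimeout: 60_000 });
let controller = new AbortController();
let entries = [{ type: 'pageview' }];  // example payload

fetch('/collect', {
  method: 'POST',
  body: JSON.stringify(entries),
  signal: controller.signal,
  defer: deferred,
});

function append(entry) {
  if (deferred.pending) {
    // Not sent yet: abort and re-issue with the accumulated payload.
    controller.abort();
    entries.push(entry);
  } else {
    // Already sent: start a fresh batch with only the new entry.
    entries = [entry];
  }
  // ...re-issue the deferred fetch with `entries` here.
}
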
noamr commented 1 year ago

Thanks for your updates!

One comment about the state getter, i.e. deferredFetch.state: we previously discussed whether to support a general state property for PendingBeacon, and the conclusion was that a general-purpose "state" getter is problematic in terms of extensibility - once it is shipped, it will be difficult to add a new state without breaking existing applications. Plus, the only thing we really need to support is telling whether the beacon is updatable, i.e. whether it's pending, which leads to PendingBeacon.pending.

Given that, can we conclude that the new fetch option is

interface DeferredFetch {
  readonly attribute boolean pending;
  attribute long backgroundTimeout;  // or a lousier name, `sendAfterBeingBackgroundedTimeout`? #73
}

I think I prefer "sent" to "pending" but either works for me.

noamr commented 1 year ago

Thinking about this, I think we can propose two options that are similar. They both include a DeferredRequest interface (name bikesheddable):

We can do one of the following:

  1. Add an attribute to Request with a live object, similar to this. If we do that, though, we need to have some signal for quota-exceeded, e.g. have DeferredFetch be an EventTarget or AbortSignal-like semantics with a Promise.
  2. Have a separate function call like sendBeacon, but one that accepts a Request (or RequestInit) object and the deferred options (e.g. backgroundTimeout) separately, and succeeds when the request is sent and fails if quota is reached.

The reason I prefer (2) rather than (1) is that (1) changes the semantics of fetch in terms of return value - you don't expect a deferred fetch to return a response promise. This is different from the keepalive case. By using a Request object in a sendBeacon-ish function we still align with any future extensibility we need from fetch as fetch features are usually added to Request rather than as an additional fetch parameter.

But we could go with (1) and have a live DeferredRequest object if fetch extensibility means using fetch directly. See the sketch below.
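
Roughly, the two call shapes being compared (all names here are hypothetical placeholders):

const data = JSON.stringify({ events: [] });  // example payload

// Option 1: fetch() with a live DeferredFetch object attached via a new option.
const deferred = new DeferredFetch({ backgroundTimeout: 60_000 });
fetch('/collect', { method: 'POST', body: data, defer: deferred });
// `deferred` would then need some way to signal quota-exceeded (event/promise).

// Option 2: a separate, sendBeacon-like call taking a Request plus defer options.
const request = new Request('/collect', { method: 'POST', body: data });
const sent = deferFetch(request, { backgroundTimeout: 60_000 });
// Resolves when the request is actually sent; rejects if quota is exceeded.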

@annevk @mingyc wdyt?

mingyc commented 1 year ago

Just note that there are some discussions about reusing sendBeacon() here.

noamr commented 1 year ago

Just note that there are some discussions about reusing sendBeacon() here.

Yea I'm referring to it here. I wonder if Request is extensible enough or if using fetch explicitly is needed, which would require using some sort of signal-like object for checking if quota was exceeded.

mingyc commented 1 year ago

@annevk @marcoscaceres PTAL, do you have any comments on the API shapes? (see https://github.com/WICG/pending-beacon/issues/70#issuecomment-1506373481)

annevk commented 1 year ago

Why is aborting a fetch no longer a viable strategy?

I'm not sure I understand some of the discussion. Presumably you'd never resolve the fetch() promise as it would never be transmitted during the document lifetime. And aborting and scheduling a new fetch happen synchronously so you can be sure that will have worked.

noamr commented 1 year ago

Why is aborting a fetch no longer a viable strategy?

I'm not sure I understand some of the discussion. Presumably you'd never resolve the fetch() promise as it would never be transmitted during the document lifetime. And aborting and scheduling a new fetch happen synchronously so you can be sure that will have worked.

The premise was that fetch can be transmitted in the backgrounding case, and the promise would resolve when you're back from BFCache-restore. But if we switch to a per-origin quota, we might be able to get away with using the regular fetch promise in the BFCache-case: the promise would only resolve in BFCache-restore, which would be very unlikely to race.

The downside of this would be if we would have other scenarios in the future that trigger transmission.

class UserlandPendingBeacon {
  #abortController = null;

  // Let's imagine it's a list of plain objects.
  #pendingEntries = [];

  // URL, headers, defer options etc
  #requestInit;

  constructor(requestInit) {
    this.#requestInit = requestInit;
    this.#pendingEntries = [];
  }

  // This will abort an ongoing fetch, and will schedule a new deferred fetch
  async send(entries) { 
     if (this.#abortController)
        this.#abortController.abort();

    this.#pendingEntries = [...this.#pendingEntries, ...entries];

    this.#abortController = new AbortController();
    const {url, ...init} = this.#requestInit;
    fetch(url, {signal: this.#abortController.signal, body: JSON.stringify(this.#pendingEntries), ...init})
       .then(() => {
          // This would only be invoked in the backgrounding case, after bfcache restore.
          this.#abortController = null;
          this.#pendingEntries = [];
       });

  }
}
noamr commented 1 year ago

@mingyc ^^ ?

fergald commented 1 year ago

@annevk

Why is aborting a fetch no longer a viable strategy?

What do you mean by this? There is a parameter that takes an AbortController's signal.

mingyc commented 1 year ago

Why is aborting a fetch no longer a viable strategy?

See https://github.com/WICG/pending-beacon/issues/70#issuecomment-1496390109 and https://github.com/WICG/pending-beacon/issues/70#issuecomment-1498593728, we need to support more than just aborting, i.e. a pending state.

@noamr How does your latest proposal solve restoring from bfcache?

noamr commented 1 year ago

Why is aborting a fetch no longer a viable strategy?

See #70 (comment) and #70 (comment), we need to support more than just aborting, i.e. a pending state.

@noamr How does your latest proposal solve restoring from bfcache?

The idea is that you would wait for the fetch to return a response, ie the regular fetch promise. There could still be a race condition if the page is restored exactly at the moment while the fetch is in flight, but the proposal assumes that this would be quite rare. Note that we should spec & implement it in such a way that that promise is resolved before the pageshow event.

fergald commented 1 year ago

The idea is that you would wait for the fetch to return a response, ie the regular fetch promise. There could still be a race condition if the page is restored exactly at the moment while the fetch is in flight, but the proposal assumes that this would be quite rare. Note that we should spec & implement it in such a way that that promise is resolved before the pageshow event.

As long as we don't deliver any response content, we can eliminate the race (under the hood, in the same task in which the renderer acks the browser's request to send, we resolve the promise). If you want to deliver the content with the promise resolution, then there is an indefinitely long window to race. It's not just the BFCache case; it's also the case where we flush due to quota. Once sending starts, you shouldn't cancel, but if the promise needs the response then that is not available until the network is fully done.

The above implies to me that we have to choose between

  • deliver response content in the promise

noamr commented 1 year ago

The idea is that you would wait for the fetch to return a response, ie the regular fetch promise. There could still be a race condition if the page is restored exactly at the moment while the fetch is in flight, but the proposal assumes that this would be quite rare. Note that we should spec & implement it in such a way that that promise is resolved before the pageshow event.

As long as we don't deliver any response content, we can eliminate the race (under the hood, in the same task in which the renderer acks the browser's request to send, we resolve the promise). If you want to deliver the content with the promise resolution, then there is an indefinitely long window to race. It's not just the BFCache case; it's also the case where we flush due to quota.

Assuming here that quota is per-origin, and exceeding quota would fail rather than flush. This would shorten the race window considerably.

Once sending starts, you shouldn't cancel but if the promise needs the response then that is not available until the network is fully done.

The above implies to me that we have to choose between

  • deliver response content in the promise

Pretty much proposing this. It's not a long race window if we eliminate auto-flushing. Perhaps it's even a negligible one.

annevk commented 1 year ago

My thinking was that if you pass the new-keepalive option to fetch it would never resolve. The browser would take a snapshot of its request when the document is no longer fully active and do something with it. This also makes "pending" not our concern, but more of a general document lifetime concern.

If we carefully define quota-related thingies it could perhaps reject early, but there's a lot of caveats here with documents of multiple origins.

noamr commented 1 year ago

My thinking was that if you pass the new-keepalive option to fetch it would never resolve. The browser would take a snapshot of its request when the document is no longer fully active and do something with it. This also makes "pending" not our concern, but more of a general document lifetime concern.

How would this work with BFCache restore and backgroundTimeout? You'd need to know somehow whether your fetch was transmitted due to that timeout.

annevk commented 1 year ago

Can't you learn from the server?

noamr commented 1 year ago

Can't you learn from the server?

As in, add something like SSE or send a GET fetch to find out? Sounds a bit awkward and even more racy... Not sure how that's preferable to resolving the promise on pageshow?

fergald commented 1 year ago

Can't you learn from the server?

For the RUM use-case (and ads metrics etc), these things tend to be write-only. There is no end-point that you can query to tell if a specific chunk of data has been received and building one would require indexing all the incoming data for serving and keeping that around for long enough. We should avoid that if possible.

annevk commented 1 year ago

In a large number of cases isn't that what you would have to do if you want to know the answer given OOM? This part of the feature would only be relevant if you successfully manage to get in the back-forward cache, which is optional anyway.

noamr commented 1 year ago

In a large number of cases isn't that what you would have to do if you want to know the answer given OOM?

Not sure how OOM is related? What this tries to solve is knowing whether the fetch was transmitted due to the backgrounding timeout. Yes, this is only relevant to being restored from BFCache. We only care about whether these fetches were transmitted, not about the answer.

mingyc commented 1 year ago

@noamr is right. The Motivation section from the explainer already makes it clear that we want an API for sending a bundle of data to a backend server, without expecting a particular response, ideally at the ‘end’ of a user’s visit to a page.

annevk commented 1 year ago

What I'm saying is that in a large number of cases you won't know whether they are transmitted via this kind of feature as the page will not be in the back-forward cache.

@mingyc I don't see how that relates to determining success (and only in the back-forward cache case)?

noamr commented 1 year ago

Correct. But for those cases it's unnecessary as they would be transmitted when unloaded. You need this feature only if BFCache worked, to avoid sending duplicate data. BFCache-restore is the only case where your document is alive and you're not sure whether your fetch had been transmitted.

fergald commented 1 year ago

What I'm saying is that in a large number of cases you won't know whether they are transmitted via this kind of feature as the page will not be in the back-forward cache.

@mingyc I don't see how that relates to determining success (and only in the back-forward cache case)?

The goal is not to determine success or failure.

One use case is a beacon that accumulates data. You have some new data. If the beacon has not been sent, you cancel it and create a new one with the accumulated data. If the beacon has been sent you start a new one with this new data as the first piece of accumulated data.

You need a sent/not-sent signal in order to make decisions about beacons when they can spontaneously be sent.

annevk commented 1 year ago

Thanks for your patience in explaining this additional wrinkle. I discussed this with colleagues and we're supportive of addressing it. Here's an API that would make sense to us: a fetchLater() method that essentially takes the same arguments as fetch() (modulo additional parameters) and returns a promise for a dictionary containing a single boolean done member. That promise is only ever rejected with an exception or resolved with { done: true }. The latter can only happen after it's survived bfcache at least once.
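
A sketch of how that could be used, based only on the shape described above (fetchLater() here is the proposed method, not an existing API):

let pendingEntries = [{ type: 'pageview' }];  // example accumulated data
const controller = new AbortController();

fetchLater('/collect', {
  method: 'POST',
  body: JSON.stringify(pendingEntries),
  signal: controller.signal,
}).then(({ done }) => {
  // Per the proposal, this only resolves (with { done: true }) after the page
  // has survived bfcache at least once and the deferred request was sent,
  // so start a fresh batch instead of aborting/re-issuing this one.
  pendingEntries = [];
}, (err) => {
  // Rejected with an exception, e.g. if the request could not be queued.
});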