Open mingyc opened 10 months ago
Agreed - the current proposal seems like the best compromise.
I'm just thinking through the developer ergonomics. I want to have simple code that can utilize fetchLater() without having to think too much. If I have to worry about my code encountering a QuotaExceeded during runtime (after starting out OK), it has to then deal with understanding what that means, and work around it (e.g. switch to sendBeacon() later instead or something). We could provide some sample code that shows how devs may want to deal with this scenario, for example.
The other option I was thinking was if a reporting origin ever registered a fetchLater() it would be assigned a minimum 64kb that it could use up to. Then, as long as QuotaExceeded isn't encountered at the first fetchLater(), scripts wouldn't have to worry about it firing later down the line. But then that's essentially a 10-reporting-origins-limit instead.
Yoav and I were discussing whether or not we could estimate how often 10+ reporting origins might happen in the wild. I was thinking I could look at our RUM ResourceTiming data for beacon initiators, but then there's a lot of scripts that still use fetch() with keepalive, or XHR, or IMG gets. Still, I might try to see what I can estimate.
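As a rough illustration of the kind of sample code mentioned above, here is a minimal sketch of falling back to sendBeacon() when fetchLater() runs out of quota. The names createReporter, queue and flush are hypothetical, the `api` parameter exists only so fakes can be injected, and the sketch assumes the quota failure surfaces as an error named "QuotaExceededError".

```javascript
// Sketch only. `api` defaults to the global scope so that fakes can be injected in tests.
function createReporter(url, api = globalThis) {
  let fallbackBody = null; // payload held back for sendBeacon() after a quota error

  return {
    // Try to queue the payload with fetchLater(); remember it if the quota is exhausted.
    queue(body) {
      try {
        api.fetchLater(url, { method: "POST", body });
        return "fetchLater";
      } catch (err) {
        // Assumption: quota exhaustion surfaces as an error named "QuotaExceededError".
        if (!err || err.name !== "QuotaExceededError") throw err;
        fallbackBody = body;
        return "deferred-to-sendBeacon";
      }
    },
    // Call from a pagehide/visibilitychange handler to send any held-back payload.
    flush() {
      if (fallbackBody === null) return false;
      const sent = api.navigator.sendBeacon(url, fallbackBody);
      fallbackBody = null;
      return sent;
    },
  };
}
```

A page would call queue() whenever it has data and flush() from its own unload-time handler. The sketch keeps only the latest held-back payload, which may or may not suit a given script.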
If we enforce a 10-origin limit, you could register early by fetching an empty (GET?) request and replacing it with a real payload when you have it. It's not even that wasteful. But I suggest that perhaps we can proceed with what we spec'ed here, and introduce this extra limitation if we see that this becomes a problem?
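The "register early, replace later" idea could look something like the sketch below. It assumes that a pending fetchLater() request can be cancelled through an AbortSignal and then re-issued with the real payload. The names registerEarly and replaceWith are hypothetical, and whether an origin would keep its slot across the abort is not settled in this thread.

```javascript
// Sketch only. `fetchLaterImpl` is injectable so the flow can be exercised without a browser.
function registerEarly(url, fetchLaterImpl = globalThis.fetchLater) {
  let controller = new AbortController();

  // Placeholder: a body-less GET that only serves to register the reporting origin early.
  fetchLaterImpl(url, { method: "GET", signal: controller.signal });

  // Returns a function that swaps the pending request for one carrying the real payload.
  return function replaceWith(body) {
    controller.abort(); // cancel the previously registered request
    controller = new AbortController();
    return fetchLaterImpl(url, { method: "POST", body, signal: controller.signal });
  };
}
```

Calling replaceWith() repeatedly keeps at most one pending request per reporter, which is the property the comment above relies on.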
@yoavweiss We need more developer feedback before settling on this approach. Is it possible for you to bring this up in the next meeting?
We discussed on the April 25th W3C WebPerfWG call. Meeting minutes are located here.
Summary:
No big decisions were made, but the request was for everyone interested to comment in this issue.
In the meantime I updated the fetch PR, to have the quota take the URL+headers into account.
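To make the "URL+headers" accounting concrete, here is a small sketch of how a script might estimate what a request would count against the quota. The exact accounting in the PR is not shown in this thread, so the formula (UTF-8 bytes of the URL, header names and values, and a string body) and the name estimateRequestBytes are assumptions for illustration.

```javascript
// Sketch only: approximates a request's size as URL + headers + body, in UTF-8 bytes.
// Assumption: the body is a string; other body types would need their own length logic.
function estimateRequestBytes(url, headers = {}, body = "") {
  const bytes = (text) => new TextEncoder().encode(String(text)).length;
  let total = bytes(url);
  for (const [name, value] of Object.entries(headers)) {
    total += bytes(name) + bytes(value);
  }
  return total + bytes(body);
}
```

A script could compare this estimate against whatever budget it believes it has before calling fetchLater(), rather than discovering the limit through an exception.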
https://github.com/WICG/pending-beacon/issues/87#issuecomment-1985358609
The default permission policy is anyway not something in the spec, but in most/all cases it should be self.
@arturjanc @yoavweiss @noamr We are concerned that setting the default deferred-fetch permissions policy to self may significantly reduce the usage of this new API from ads/tracking lib partners, as not many users know how to properly set that up for 3p iframes.
Is it possible to set this to * by default?
Aren't those partners iframes created by a 3p script at the parent?
But it still requires script users to set correct permissions policy in their server?
No, just
Thanks Noam.
That means that every 3p iframe would take 64kb of the quota.
If we are okay with this, do you think it is possible for us to push deferred-fetch's default to '*'?
Hi all,
Chiming in from an ads perspective. Reliability is of utmost importance for us.
Regarding origin related permissions:
The usefulness of FetchLater is that it will allow us to send our data in a single beacon at the end of the session, as opposed to many HTTP requests throughout the page life cycle. The problem for us is that we do not control the domain stack we are rendered into because we are often served and rendered with the ad. In most cases, we are rendered into a 3P iframe that we do not control, meaning we cannot control the “deferred-fetch” permissions.
If FetchLater is only available in 3P iframes when explicitly permitted with “deferred-fetch” AND we can detect the non-permitted case, then we would have to fall back to the “multiple HTTP requests” solution for that traffic (meaning two event processing stacks would need to exist).
Worse, if FetchLater is only available in 3P iframes when explicitly permitted with “deferred-fetch” AND we CANNOT detect the non-permitted case, we will not be able to adopt this API at all. Knowing that the API is less reliable than stream solutions would make the API a non-starter for us.
Regarding the size budget:
The beacon size limitation is less of a concern, under the assumption that it would be a constraint in exceedingly rare situations; however, if it turns out it happens often, that would also warrant us to revert to the multi-beacon approach. A size limitation feels like a premature optimization and seems like something we could investigate at a later point in time if there are issues in production.
Regarding detection of the non-permitted case: this will always be detectable using the permissions policy API, or by trying to call fetchLater and catching the appropriate error.
Thanks for this feedback!
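The two detection routes mentioned above could be combined roughly as in the sketch below. It assumes a Chromium-style document.featurePolicy.allowsFeature() is available for the policy check (that API does not have cross-browser agreement), and it does not assume a particular error name for the blocked case, since the thread does not specify one. The name canUseFetchLater and the probe URL are hypothetical.

```javascript
// Sketch only. `api` defaults to the global scope so that fakes can be injected in tests.
function canUseFetchLater(probeUrl, api = globalThis) {
  if (typeof api.fetchLater !== "function") {
    return { usable: false, reason: "unsupported" };
  }

  // Route 1 (assumption: Chromium-style policy introspection exists on the document).
  const policy = api.document && api.document.featurePolicy;
  if (policy && typeof policy.allowsFeature === "function" &&
      !policy.allowsFeature("deferred-fetch")) {
    return { usable: false, reason: "policy" };
  }

  // Route 2: try a probe request and report whatever error is thrown.
  const controller = new AbortController();
  try {
    api.fetchLater(probeUrl, { signal: controller.signal });
    controller.abort(); // cancel the probe so nothing is actually sent
    return { usable: true, reason: null };
  } catch (err) {
    return { usable: false, reason: err && err.name ? err.name : "error" };
  }
}
```

A measurement script could run this once at startup and pick either the single-beacon path or the multiple-request path, which is the two-stack situation described earlier in the thread.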
self effectively means we are excluding ads from ever using this API. I'm not sure if there's any other common use case that finds itself embedded in a stack of 3P frames.
This is a major loss and does not seem worth it to prevent some hypothetical quota stealing. If bad-acting 3rd parties turn out to be a real problem, it is easy for the top-level site to control that with a policy.
With regard to making progress and launching with self and reviewing things later and maybe making it *, is that realistic? If we launch with self, sites that care about controlling this will do nothing and if we switch to * it will require action from those sites. The switch to * is not something we can do lightly.
@jakeherron-google can you be more specific about "we do not control the domain stack we are rendered into because we are often served and rendered with the ad"? How does this work more in detail?
Note that the Permissions Policy API does not have cross-browser agreement. (I thought the plan was to merge it into the Permissions API but to what extent that has happened is unclear.)
Yes, to be more specific, we deploy JavaScript alongside the ad that conducts measurement throughout its lifecycle. The actual ad content along with any scripts will be served into a 3P iframe that the publisher webpage has control over. The publisher does not know anything about the ad served into the 3P frame, by design. Likewise, our ad measurement script has no control over this iframe, by design. Our use of the fetchLater API would be beholden to the publisher opting into deferred-fetch for our ad’s iframe under the proposed permissions policy.
Summarizing internal conversations about this: There is a strong use case for this feature to have a small quota available for 3p origins without requiring permissions policy. This use case is important for us (Google/Chromium).
This quota can be small, e.g. 16kb per 3p-origin-in-a-top-level-document. We see the risk of this "spare change fund" quota being used to abuse bandwidth after window close as small, and such abuse is anyway possible by other means today, such as ordinary keepalive requests (which don't have a standard-defined top-level quota) or service workers (which may be terminated when they don't have active clients, but in practice can send/receive quite a bit of data).
So proposing to keep the current spec wording, but to add a per-3p-origin quota to the effect of 16kb that doesn't require permissions policy. Allowing an iframe to use the overall quota would have the effect of increasing it to 64kb.
We'd prefer to have this as a normal part of the spec, but also open to having it implementation-defined or a MAY/SHOULD that allows user-agents to choose their tradeoff between utility and risk of bandwidth abuse. FWIW I think it's a legitimate tradeoff to leave for browsers to differentiate, and is discoverable enough so that it shouldn't create a substantial interop issue.
@annevk thoughts?
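For readers following the numbers, the proposed tiers can be modelled with a few lines of code. This is only a sketch of the proposal as worded above (16kb for a 3p origin without permissions policy, 64kb when the iframe is allowed to use the overall quota). The function names are hypothetical and treating "kb" as 1024 bytes is an assumption.

```javascript
// Sketch only: models the two quota tiers from the proposal, not any shipped behaviour.
const KB = 1024; // assumption: "kb" in the proposal means 1024 bytes

// Quota available to a 3p iframe, depending on whether permissions policy grants deferred-fetch.
function thirdPartyQuotaBytes(policyGranted) {
  return policyGranted ? 64 * KB : 16 * KB;
}

// Whether one more request of `requestBytes` fits, given what the frame has already queued.
function fitsQuota(requestBytes, usedBytes, policyGranted) {
  return usedBytes + requestBytes <= thirdPartyQuotaBytes(policyGranted);
}
```

Under this model a 20kb payload would be rejected in an iframe without the policy and accepted once the policy is granted.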
We discussed this once more internally.
We continue to be concerned about letting third parties consume end user system resources without explicit buy-in from the first party.
To the extent that is possible today through service workers that will likely be something we will eliminate going forward (the main reason for keeping service workers alive in WebKit has been to make navigation quicker, not for web application shutdown purposes). And thus we would not want to use that as precedent when designing a new API.
Understood. I suggest that having this as a MAY in the spec is a better way forward than remaining blocked on this, as allowing 3ps to use this feature without 1p buy-in is an important use case for us (with a very small quota).
We'd prefer to tackle hypothetical problems that arise from this, keeping an eye on metrics, rather than constrain this API in advance in a way that would make it a lot less useful. Since this API is anyway built to not always work based on all kinds of constraints, I think having this as a constraint that's up to some implementation-specific discretion is probably the best we can come up with.
Apart from service workers, another precedent is ordinary keep-alive fetches, which can be used by a 3p without 1p buy-in. The quota proposed here is smaller than the quota of a single keep-alive fetch. Is WebKit also going to eliminate keep-alive fetches in 3p iframes?
Updated the PR, to include a 3p default quota of 16kb even if the permissions policy is not present (in a MAY clause). This is currently the only way I see forward given lack of consensus about this detail, which is really a small bit of this API. I think it's better to have one detail with an implementation-defined difference than hold this issue back indefinitely or ship it with a monkey-patch spec. Open to hear other alternatives!
Open questions from last meeting:
Some concerns:
Relevant Discussions: https://github.com/w3c/beacon/issues/38#issuecomment-1861006131
@noamr @annevk @nicjansma @yoavweiss