w3c / beacon

Beacon
https://w3c.github.io/beacon/

How important is it for sendBeacon() to follow redirects? #49

Closed cdumez closed 6 years ago

cdumez commented 6 years ago

I see that the Beacon specification indicates that the Fetch request used by sendBeacon() should follow redirects. Is this something that we know is required? Are people relying on this?

My understanding is that Beacon is supposed to be a very lightweight networking API. This is important given that beacon requests can outlive the page. In WebKit, one of the aspects of being lightweight would be to not require our WebProcess to stay alive. This means we would hand off the request to our NetworkProcess and that's it. The NetworkProcess would take things over from there.

However, support for redirects adds a lot of complexity because:

  1. We potentially need to do CORS preflighting on redirect
  2. We need to do CSP checks on redirects
  3. We need to query our content blockers (e.g. ad blockers) on redirects
  4. We need to ask our client (via our API) if they want to allow each redirect

Doing all this while not requiring our WebProcess to stay alive is not trivial. This means that for us, sendBeacon() is not that lightweight anymore.
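
To make the four checks above concrete, here is a minimal TypeScript sketch of a per-hop gate. Every name in it (`RedirectPolicy`, `mayFollowRedirect`, and the individual check functions) is hypothetical and is not a WebKit or Fetch API; the order of the checks is also an assumption. It only illustrates that all of this has to run again for each redirect, in whichever process is still alive.

```typescript
// Hypothetical sketch: none of these names are real WebKit or Fetch APIs.
type RedirectCheck = (from: string, to: string) => boolean;

interface RedirectPolicy {
  needsPreflight: RedirectCheck;               // item 1: does this hop require a CORS preflight?
  preflightSucceeds: (to: string) => boolean;  // item 1: outcome of that preflight
  cspAllows: RedirectCheck;                    // item 2: CSP check on the new URL
  contentBlockerAllows: RedirectCheck;         // item 3: content blocker (e.g. ad blocker) query
  clientAllows: RedirectCheck;                 // item 4: decision of the embedding client
}

// Returns true only if every check passes for this single redirect hop.
function mayFollowRedirect(policy: RedirectPolicy, from: string, to: string): boolean {
  if (!policy.cspAllows(from, to)) return false;
  if (!policy.contentBlockerAllows(from, to)) return false;
  if (!policy.clientAllows(from, to)) return false;
  if (policy.needsPreflight(from, to) && !policy.preflightSucceeds(to)) return false;
  return true;
}
```

A redirect chain of N hops would call `mayFollowRedirect` N times, which is why the checks cannot simply be done once up front before the page goes away.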

For this reason, I wanted to start a discussion here to get feedback from editors and other implementors on this aspect of the API.

igrigorik commented 6 years ago

In practice, we found that support for redirects is critical. A few older discussions that document use cases and findings: https://github.com/w3c/beacon/issues/22#issuecomment-171109066, https://github.com/w3c/beacon/issues/22#issuecomment-234665448.

For better or worse, I think it's a must.

beidson commented 6 years ago

In both of those comments the only concrete use case I read was "some services invoke redirect chains, because they need the same beacon to be registered against multiple backends"

We're limited to the somewhat arbitrary redirect count of 19 (20 URLs hit). What if somebody wants to support 21 backends? What if we'd limited to 29 redirects and somebody wanted to support 31 backends? What if we'd limited to 1 redirect and somebody wanted to support 3 backends?

The only non-arbitrary number of URLs that could be promised is 1 - the first. Any provider that needs to register a beacon hit against an arbitrary number of backends should manage the registration themselves.

In fact it kind of seems like bad design to rely on the client to hit 21 different URLs (or even 2 different URLs) if the registration is at all critical.

Is it truly tenable to support this use case?

igrigorik commented 6 years ago

> Is it truly tenable to support this use case?

Yes. Supporting redirects does not mean you have to support an infinite set of redirects; not supporting an infinite set of redirects does not mean you should abandon them entirely either. In Chrome we issue an error after 20 redirects.
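
As a rough illustration of "bounded, not infinite", the sketch below follows a redirect chain up to a configurable cap and errors out past it. The `RedirectTable` model and the `followRedirects` function are hypothetical, and the default of 20 is simply the number quoted in this comment; the exact off-by-one behavior at the limit is an assumption of the sketch rather than a statement about any browser.

```typescript
// Hypothetical model: maps a URL to the URL it redirects to.
// A URL with no entry is treated as returning a final (non-redirect) response.
type RedirectTable = Map<string, string>;

// Follows redirects starting at `start` and returns every URL that was hit.
// Throws once more than `maxRedirects` redirects would have to be followed.
function followRedirects(
  start: string,
  redirects: RedirectTable,
  maxRedirects: number = 20,
): string[] {
  const visited: string[] = [start];
  let current = start;
  let followed = 0;
  for (;;) {
    const next = redirects.get(current);
    if (next === undefined) {
      return visited; // final response reached
    }
    if (followed === maxRedirects) {
      throw new Error(`too many redirects (limit ${maxRedirects})`);
    }
    followed += 1;
    visited.push(next);
    current = next;
  }
}
```

With this model, a chain of three or four trackers completes normally, while a chain longer than the cap fails with an error instead of looping forever.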

beidson commented 6 years ago

> In Chrome we issue an error after 20 redirects.

Right, because that's what the Fetch spec says, and why it was my starting point.

But my point is... that's arbitrary.

The use case of "some services invoke redirect chains, because they need the same beacon to be registered against multiple backends" is arbitrary because it doesn't describe in any concrete terms how many are required and doesn't describe why this is a responsibility of the client and not the backend.

Is there any concrete data on this? I see your comments that this is important, but they are just statements with nothing backing them up. I'm definitely not grokking what about beacons makes "spreading the registration to multiple backends" a responsibility of the client instead of the backend.

Perhaps one absolute, concrete, real world use case could help illustrate?

igrigorik commented 6 years ago

> The use case of "some services invoke redirect chains, because they need the same beacon to be registered against multiple backends" is arbitrary because it doesn't describe in any concrete terms how many are required and doesn't describe why this is a responsibility of the client and not the backend.

Less than 20. Yes, arbitrary, but well defined and enforced arbitrary :-)

> I'm definitely not grokking what about beacons makes "spreading the registration to multiple backends" a responsibility of the client instead of the backend.

We're not talking about technical feasibility; we're talking about what you see on the modern web, and about enabling the existing ecosystem to move to a more efficient solution that moves this work out of the critical path. My experience, from working with large analytics and ads vendors, is that they need this functionality for a feasible migration path. I would be strongly opposed to arbitrarily removing this support, especially in light of the fact that existing non-sendBeacon APIs don't place any such limitations.

geoffreygaren commented 6 years ago

> My experience, from working with large analytics and ads vendors, is that they need this functionality for a feasible migration path.

Can you cite an example?

leosei commented 6 years ago

Hi guys, (Leo here, the Google AdWords Product Manager for our URL tracking, among other things).

We're transitioning ad tracking to sendBeacon, and not having it follow redirects would make it totally unusable for us. Let me illustrate why.

Most advertisers will have not one but multiple click tracking partners, as they offer various benefits. Some partners may focus on creative optimization, others on bidding, and others maybe on audience & CRM analysis. All of those need to register the click for their analysis, and the way this is done today is by daisy chaining the redirects between the ad and the landing page.

So let's take an example where an advertiser is using DoubleClick Search (aka DS) for its bidding, also has DoubleClick Campaign Manager (aka DCM) for its overall reporting, and uses Mediaplex (now conversantmedia, I believe) for audience analysis. In real life we've seen up to 15 partners, but the common cases are 3-4.

We, Google AdWords, only know about the first tracking URL (DS in that case) and the landing page. Our plan is to send the ad click directly to the landing page, while using sendBeacon for the tracking URL (DS in that case). The expectation is that sendBeacon would then follow the DS redirect to DCM, and then DCM would redirect to Mediaplex. Each hop only knows about the next one in the majority of cases, so it would be nearly impossible to keep that tracking working without following redirects.
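
The client side of that plan can be sketched roughly as follows. This is a minimal illustration, not how AdWords is implemented: `handleAdClick`, `BeaconSender`, and the example URLs are hypothetical, and the sender is passed in as a parameter so the sketch does not depend on a browser global. The point is that the page only ever names the first tracking URL; every later hop (DS to DCM to Mediaplex) has to come from redirects followed by the browser.

```typescript
// Hypothetical type: mirrors the (url, data) => boolean shape of
// navigator.sendBeacon, restricted to string payloads for simplicity.
type BeaconSender = (url: string, data?: string) => boolean;

interface ClickResult {
  queued: boolean;    // whether the beacon was accepted for delivery
  navigateTo: string; // where the user should be sent immediately
}

// Queues the tracking beacon for the first tracker only, then reports the
// landing page to navigate to. The click itself never waits on the trackers.
function handleAdClick(
  send: BeaconSender,
  trackingUrl: string,
  landingUrl: string,
): ClickResult {
  const queued = send(trackingUrl);
  return { queued, navigateTo: landingUrl };
}
```

In a browser, `send` could be `(url, data) => navigator.sendBeacon(url, data)`, with the caller then navigating to `navigateTo`. If the beacon request did not follow redirects, only the first tracker would ever register the click.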

We're very eager to move the entire ad ecosystem ($XXB annually) to using sendBeacon, but for us, following redirects is a P0 priority.

I'm more than happy to provide more examples if needed. Leo

igrigorik commented 6 years ago

@leosei thanks for the detailed example and context.

@cdumez @geoffreygaren @beidson I don't believe we can remove support for redirects here, nor should we special case or handicap sendBeacon in this regard as compared to existing methods in use by various vendors. I propose we close this with no spec action. Does that sound reasonable?

geoffreygaren commented 6 years ago

@igrigorik OK

igrigorik commented 6 years ago

Great, thanks! Closing, feel free to reopen if there's more to discuss here.

beidson commented 6 years ago

I'm also fine with closing this issue.

For our understanding of this spec, the Fetch spec, and directions that future specs might take, I'd like to dig into this point a little more:

> So let's take an example where an advertiser is using DoubleClick Search (aka DS) for its bidding, also has DoubleClick Campaign Manager (aka DCM) for its overall reporting, and uses Mediaplex (now conversantmedia, I believe) for audience analysis. In real life we've seen up to 15 partners, but the common cases are 3-4.

So, for the record, do we figure that the seemingly arbitrary "20 redirect limit" of the fetch algorithm was put in place as it covered the case of "large trackers/advertisers currently need to hit around 15 tracking partners, so 20 comfortably covers that"?

If so, what happens when that 15 grows to 21? Does the large tracker/advertiser start doing things server-side then?

leosei commented 6 years ago

In practice, 10+ is more of an edge case (the common case is in the 3-4 range). From what I've seen, the trend for long daisy chains is to consolidate trackers (and large ones are exploring server-side tracking as well), so I don't think the 15 number is something that would grow in the future.