WICG / attribution-reporting-api

Attribution Reporting API
https://wicg.github.io/attribution-reporting-api/

Attribution-Reporting-Support for subresource trigger registration #552

Open johnivdel opened 2 years ago

johnivdel commented 2 years ago

The event explainer allows all subresources on a page to respond with the Attribution-Reporting-Register-Trigger header, even without the Attribution-Reporting-Eligible header present on the request.

The cross app and web explainer proposes adding a new Attribution-Reporting-Support header on all requests which specify the Attribution-Reporting-Eligible header. This means it is not possible for some trigger requests to have access to the support header when deciding how to register.

Ideally there could be a mechanism that allows these generic subresource requests to gain access to the header.

One idea would be to allow a subresource request to respond with a new Accept-Attribution-Reporting header that indicates the support header should be added to subsequent redirects (somewhat similar to Accept-CH header).

sequenceDiagram
    Browser-->>adtech.example: https://adtech.example/register-conversion
    adtech.example-->>Browser: Location: register-conversion-redirect<br/>Accept-Attribution-Reporting:support
    Browser-->>adtech.example: https://adtech.example/register-conversion-redirect<br/>Attribution-Reporting-Support: os,web
    adtech.example-->>Browser: Attribution-Reporting-Register-OS-Trigger: ...

This allows the adtech to gain access to the support header, at the cost of an additional redirect. To avoid an additional redirect, the adtech may always add the eligibility header to the request registering the trigger.

For registration redirect chains, this also only requires one party in the chain to return the Accept-Attribution-Reporting header.
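The redirect flow described above can be sketched as a small simulation. This is only an illustration of the proposal, not browser code: the function name `request_headers_per_hop`, the host `other-adtech.example`, and the assumption that the opt-in stays in effect for the rest of the redirect chain are all hypothetical.

```python
from typing import Dict, List

SUPPORT_VALUE = "os,web"  # example value taken from the diagram above


def request_headers_per_hop(responses: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the request headers a hypothetical browser sends on each hop.

    `responses` holds the response headers of each hop, in order.
    Assumption: once any response in the chain carries
    `Accept-Attribution-Reporting: support`, every later request in the
    same chain gets the Attribution-Reporting-Support header.
    """
    opted_in = False
    sent: List[Dict[str, str]] = []
    for response in responses:
        request: Dict[str, str] = {}
        if opted_in:
            request["Attribution-Reporting-Support"] = SUPPORT_VALUE
        sent.append(request)
        # The opt-in only affects requests made after this response.
        if response.get("Accept-Attribution-Reporting") == "support":
            opted_in = True
        # A response without Location ends the redirect chain.
        if "Location" not in response:
            break
    return sent


chain = [
    {"Location": "register-conversion-redirect",
     "Accept-Attribution-Reporting": "support"},
    {"Location": "https://other-adtech.example/register"},  # hypothetical second party
    {},
]
for hop, headers in enumerate(request_headers_per_hop(chain)):
    print(hop, headers)
```

The first request carries no support header; it only appears after the opt-in response, which is the extra redirect cost mentioned above.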

An alternative would be to add the header on all subresource requests, but exposing this information on all requests seems wasteful and increases fingerprinting potential as discussed in the explainer.

linnan-github commented 1 year ago

Discussed with @yoavweiss: this proposal requires integration with the core algorithm in the Fetch spec. An alternative is to add the Attribution-Reporting-Support header on the subresource request if there's an attribution source registered with the top-level site as destination and subresource origin as reporting origin. However, there may be privacy implications as this is cross-site data and may be leaked from whether the header is present in the request.

Would appreciate any feedback/discussion on these possible solutions. Thanks.

csharrison commented 1 year ago

An alternative is to add the Attribution-Reporting-Support header on the subresource request if there's an attribution source registered with the top-level site as destination and subresource origin as reporting origin. However, there may be privacy implications as this is cross-site data and may be leaked from whether the header is present in the request.

I am opposed to this alternative because it violates the privacy stance of the API. @yoavweiss why exactly does this require core Fetch integration?

linnan-github commented 1 year ago

An alternative is to add the Attribution-Reporting-Support header on the subresource request if there's an attribution source registered with the top-level site as destination and subresource origin as reporting origin. However, there may be privacy implications as this is cross-site data and may be leaked from whether the header is present in the request.

I am opposed to this alternative because it violates the privacy stance of the API. @yoavweiss why exactly does this require core Fetch integration?

Thanks Charlie. Fetch integration is needed as attribution-specific logic would be added to the redirect fetch algorithm (https://fetch.spec.whatwg.org/#http-redirect-fetch) to handle the attribution headers on the redirect response and the new request.

cc @domfarolino as well.

yoavweiss commented 1 year ago

I am opposed to this alternative because it violates the privacy stance of the API. @yoavweiss why exactly does this require core Fetch integration?

Thanks Charlie! I'd love to better understand your opposition, and the privacy leak.

IIUC, the current proposal enables any response on the reporting origin to respond with an Accept-Attribution-Reporting header, and then get the information about that support through a redirect. The alternative would be to send that information on such requests initially, without requiring the reporting origin to opt-in. At the same time, we could consider that the reporting origin opted in to this when it registered itself as a reporting source.

Am I missing some subtlety here related to passive vs. active entropy? Or something else entirely?

csharrison commented 1 year ago

I was opposed to the proposal in this snippet (emphasis mine):

An alternative is to add the Attribution-Reporting-Support header on the subresource request if there's an attribution source registered with the top-level site as destination and subresource origin as reporting origin. However, there may be privacy implications as this is cross-site data and may be leaked from whether the header is present in the request.

With this information, a site can trigger subresource requests and, if there is an Attribution-Reporting-Support header, learn that the user visited a publisher that advertises to that site, which is explicitly cross-site information. A bad actor could abuse this in a number of ways, for instance by logging sources only on sensitive.com to learn which sensitive.com users visit their site.
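To make the leak concrete, here is a minimal sketch of the conditional-header alternative. Everything in it is an assumption for illustration: the `Source` record, the function `alternative_adds_header`, and the hosts `attacker.example` and `tracker.example` are hypothetical, and real source storage is more involved.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Source:
    source_site: str        # where the source was registered
    destination: str        # top-level site the ad points to
    reporting_origin: str   # origin that registered the source


def alternative_adds_header(stored: List[Source], top_level_site: str,
                            subresource_origin: str) -> bool:
    """Conditional rule from the alternative: send the support header only
    if a stored source matches this destination and reporting origin."""
    return any(
        s.destination == top_level_site and s.reporting_origin == subresource_origin
        for s in stored
    )


# The bad actor registers sources only on sensitive.com.
visited_sensitive = [Source("https://sensitive.com",
                            "https://attacker.example",
                            "https://tracker.example")]
never_visited: List[Source] = []

for label, storage in (("visited", visited_sensitive), ("not visited", never_visited)):
    leaked = alternative_adds_header(storage, "https://attacker.example",
                                     "https://tracker.example")
    print(label, "-> header present:", leaked)
```

Under this rule the mere presence of the header on a request distinguishes the two users, which is the cross-site signal being objected to.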

yoavweiss commented 1 year ago

In the example above, adtech.example does get access to the information that publisher.example had an at pointing to it, right? If that is correct, does the opt-in (in the form of a redirect) somehow change the calculus?

I'm sure I'm missing something...

csharrison commented 1 year ago

If that is correct, does the opt-in (in the form of a redirect) somehow change the calculus?

The opt-in flow would add the header unconditionally, even if a previous impression was never shown to the user.

yoavweiss commented 1 year ago

I see! So adtech.example needs to know the attribution level support before it decides which attribution trigger to send? Is the attribution level support itself high-entropy? Or is that signal something we can consider as a low-entropy client hint?

csharrison commented 1 year ago

So adtech.example needs to know the attribution level support before it decides which attribution trigger to send

Something like that. The Attribution-Reporting-Support header reveals whether the client supports OS-level delegation of attribution operations, which affects how the ad-tech configures their response headers (opting into the supported OS-level delegation, for instance)
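As a rough sketch of what "configuring response headers" could look like on the ad-tech server, the function below picks a registration header from the request's support header. The function name `choose_trigger_header`, the comma-split parsing of the `os,web` value, the preference for OS-level registration, and the fallback to web registration when the header is absent are all assumptions, not behavior defined by the explainers.

```python
from typing import Dict


def choose_trigger_header(request_headers: Dict[str, str],
                          web_trigger: str, os_trigger: str) -> Dict[str, str]:
    """Pick which registration header to return, based on client support.

    Assumes the support header is a comma-separated token list such as
    "os,web", as shown in the diagram earlier in this thread.
    """
    raw = request_headers.get("Attribution-Reporting-Support", "")
    tokens = {token.strip() for token in raw.split(",") if token.strip()}
    if "os" in tokens:
        # Client reports OS-level support: delegate to the OS.
        return {"Attribution-Reporting-Register-OS-Trigger": os_trigger}
    # Header missing or web-only: fall back to web registration.
    return {"Attribution-Reporting-Register-Trigger": web_trigger}


# Placeholder payloads; real values depend on the ad-tech's configuration.
print(choose_trigger_header({"Attribution-Reporting-Support": "os,web"},
                            web_trigger="<web payload>", os_trigger="<os payload>"))
print(choose_trigger_header({}, web_trigger="<web payload>", os_trigger="<os payload>"))
```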

Is the attribution level support itself high-entropy

No, it is low entropy, basically just a boolean.

is that signal something we can consider as a low-entropy client hint?

Can client hints work like @johnivdel 's original proposal, where accept-ch headers affect the client hints that are served on the subsequent redirect?

yoavweiss commented 1 year ago

^^ @arichiv

In client hints, the internal redirect mechanism is reserved for critical hints of navigation requests, when the ACCEPT_CH frame failed us. So it's essentially acting as a fallback mechanism.

I wonder if we can define Attribution-Reporting-Support as a low-entropy CH (maybe a new category only sent on subresource requests), and that would enable adtech.example to reply with the right attribution without requiring a redirect.

linnan-github commented 1 year ago

^^ @arichiv

In client hints, the internal redirect mechanism is reserved for critical hints of navigation requests, when the ACCEPT_CH frame failed us. So it's essentially acting as a fallback mechanism.

I wonder if we can define Attribution-Reporting-Support as a low-entropy CH (maybe a new category only sent on subresource requests), and that would enable adtech.example to reply with the right attribution without requiring a redirect.

Thanks @yoavweiss. Sorry I'm not very familiar with client hints. Do you suggest that we can set the Attribution-Reporting-Support header on all subresource requests? Do we need explicit integration with CH?

As @johnivdel mentioned in the original post, "exposing this information on all requests seems wasteful and increases fingerprinting potential". Is this a concern? Thanks!

yoavweiss commented 1 year ago

I agree that defining this as a low-entropy CH would add passive fingerprinting data in non-ideal ways.

I need to think about this some more, but it seems to me that this proposal is trying to reinvent Client Hints because Client Hints cannot be opted in from a subresource response. The reason for that restriction is that we didn't want to provide passive resources access to data that they didn't already have (as they don't have access to JS APIs).

At the same time, maybe there's room to reconsider that historical decision for low-entropy hints. Maybe we could define a new class of low-entropy hints that can be opted-in from subresources.

^^ @arichiv and @miketaylr for thoughts

miketaylr commented 1 year ago

Maybe we could define a new class of low-entropy hints that can be opted-in from subresources.

Yeah, this is an interesting idea. I filed https://github.com/WICG/client-hints-infrastructure/issues/142 for more discussion.

johnivdel commented 1 year ago

There are a few other differences between this header and Client hint opt-ins which I think it would be good to call out.

A common pattern today is to have a subresource which redirects through a number of different third parties who are all registering for the same event.

From my understanding, client hints only support opting in to receive a hint at an origin level, not a redirect chain level. So this may result in more ergonomic issues when adopting the API.

arichiv commented 1 year ago

Should this be a client hint or just a permissions policy? If it's for subresource requests only then any page could set a permissions policy that delegated the header to specific origins (or the wildcard). Is that reasonable? The benefits of client hints and their cache seem more about sending the header in an initial page load, but that doesn't sound like the issue here. Either way, a client hint requires a permissions policy so the question is do we need a client hint as well.

johnivdel commented 1 year ago

Should this be a client hint or just a permissions policy? If it's for subresource requests only then any page could set a permissions policy that delegated the header to specific origins (or the wildcard). Is that reasonable? The benefits of client hints and their cache seem more about sending the header in an initial page load, but that doesn't sound like the issue here. Either way, a client hint requires a permissions policy so the question is do we need a client hint as well.

One of the goals here is to avoid requiring major changes to the way adtechs are making these requests today. In many cases, the attribution request is embedded as an tag which makes the subresource request to the adtech server.

Permissions policy would require JavaScript and/or changes from the site embedding the adtech, which can be very difficult: see the discussion on https://github.com/WICG/attribution-reporting-api/issues/558 for some more context.

This is why we favored an HTTP-based mechanism to opt in, as it is less work for adtechs to integrate with the API.

arichiv commented 1 year ago

I see, then I'll suggest something slightly different: (1) a permissions policy delegated to * by default is added, and (2) if the attribute on the subresource indicates it (and a custom permissions policy isn't preventing it), the attribution header can be sent. This isn't a client hint per se. Though it does behave like a low-entropy client hint, there's no reason to add a way for the top frame itself to receive it (which would be the difference in a formal client hint: adding a way for the top frame to persist its own need for the header).
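One way to read this suggestion is as a two-part check before the header is attached. The sketch below is hypothetical: the function `should_send_support_header`, the tuple-based allowlist, and the boolean `element_opts_in` flag stand in for whatever the attribute and permissions policy would actually look like.

```python
from typing import Iterable

DEFAULT_ALLOWLIST = ("*",)  # assumption: the policy is delegated to * by default


def should_send_support_header(element_opts_in: bool, request_origin: str,
                               policy_allowlist: Iterable[str] = DEFAULT_ALLOWLIST) -> bool:
    """Send the header only if the element's attribute asks for it and the
    page's permissions policy does not exclude the request origin."""
    allowlist = set(policy_allowlist)
    allowed = "*" in allowlist or request_origin in allowlist
    return element_opts_in and allowed


# Default policy: the attribute alone decides.
print(should_send_support_header(True, "https://adtech.example"))
# A custom policy that excludes the origin blocks the header.
print(should_send_support_header(True, "https://adtech.example",
                                 policy_allowlist=("https://other.example",)))
```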