patcg-individual-drafts / ipa

Interoperable Private Attribution (IPA) - A Private Measurement Proposal

Match keys without JavaScript (for browser implementations) #25

Open eriktaubeneck opened 2 years ago

eriktaubeneck commented 2 years ago

The proposal currently suggests that (when in a browser) the browser implement a new function, get_encrypted_match_key(). However, many sites serve ads on publisher websites without the ability to execute JavaScript (even in an iframe).

We should explore ways in which a delegated report collector, acting on behalf of either the publisher or the advertiser, can receive an encrypted match key without requiring the ability to execute JavaScript on a website.

csharrison commented 2 years ago

This is sort of how the Attribution Reporting API works. We made a design decision that we wanted the top-level site to have some kind of opt-in to this process (even if the network request originated in the top-level frame), so we designed a scheme that required mark-up changes if you don't have JS: https://github.com/WICG/attribution-reporting-api/blob/main/EVENT.md#registering-attribution-sources

This is for registering sources, but the overall technique I think works in the match key setting too.

csharrison commented 1 year ago

Here is a strawman HTTP-based API for IPA.


For setting a match key, we can introduce a new response header that works similarly to cookies:

Set-Attribution-Match-Key: <value>

This will set a match key associated with the origin of the response. Unlike cookies, we should try not to send match keys by default on network requests, so we can introduce various opt-in methods, like on fetch:

fetch(<url>, {
  // Sends match keys for HTTP requests associated with this fetch.
  includeAttributionMatchKey: "<provider origin>"
});

This would attach the following structured header:

Sec-Attribution-Match-Key: <byte sequence>, provider=<origin>
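
To make the header shape concrete, here is a minimal sketch of how a report collector's server might read it. The strawman does not pin down the exact serialization, so this assumes an RFC 8941 Item: a byte sequence between colons, with `provider` carried as a quoted-string parameter. The function names and `provider.example` are hypothetical and not part of the proposal.

```javascript
// Sketch only: assumes the header is an RFC 8941 Item of the form
//   :<base64 bytes>:;provider="<origin>"
// The strawman itself does not fix this serialization.

// Builds the assumed header value from opaque (already encrypted) bytes.
function serializeMatchKeyHeader(encryptedMatchKey, providerOrigin) {
  const b64 = Buffer.from(encryptedMatchKey).toString("base64");
  return `:${b64}:;provider="${providerOrigin}"`;
}

// Parses the assumed header value; returns null if it does not match.
function parseMatchKeyHeader(value) {
  const pattern = /^:([A-Za-z0-9+\/]*={0,2}):;\s*provider="([^"\\]+)"$/;
  const m = pattern.exec(String(value).trim());
  if (m === null) return null;
  return { matchKey: Buffer.from(m[1], "base64"), provider: m[2] };
}

// Usage: round-trip three placeholder bytes.
const headerValue = serializeMatchKeyHeader(
  new Uint8Array([1, 2, 3]),
  "https://provider.example"
);
console.log(headerValue);
console.log(parseMatchKeyHeader(headerValue));
```

A server would treat the bytes as opaque; this sketch only separates the payload from the provider it was issued for.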

Additionally, to support cases where the precise providers are not known at request start, we could introduce an API that automatically adds the Sec-Attribution-Match-Key request header on a subsequent HTTP redirect:

HTTP/1.1 302 Found
Location: https://redirect.example/
Include-Attribution-Match-Key: <provider origin>

Note that currently with ARA, we made an adoption compromise for conversion tags which do not require an opt-in mechanism: https://github.com/WICG/attribution-reporting-api/issues/347. IPA may need to make similar compromises for adoption, e.g. allowing the redirect path without any HTML / JS modifications.
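
As a rough illustration of the redirect path, the sketch below models what a browser might do when a 302 carries `Include-Attribution-Match-Key`. Everything here is an assumption: the `MatchKeyStore` class, the `headersForRedirect` function, the caller-supplied `encrypt` stand-in, and the serialized header form (an RFC 8941 Item with a `provider` parameter) are illustrative names and choices, not part of the proposal.

```javascript
// Hypothetical browser-side model; none of these names come from the proposal.
class MatchKeyStore {
  constructor() {
    this.keys = new Map();
  }
  // Models `Set-Attribution-Match-Key: <value>` on a response from responseUrl:
  // the key is associated with the origin of that response.
  set(responseUrl, value) {
    this.keys.set(new URL(responseUrl).origin, value);
  }
  get(providerOrigin) {
    return this.keys.get(new URL(providerOrigin).origin);
  }
}

// Given the headers of a redirect response (lowercase names), compute the
// extra request headers for the follow-up request. `encrypt` is a stand-in
// for whatever encryption the browser applies before releasing a match key.
function headersForRedirect(store, redirectHeaders, encrypt) {
  const provider = redirectHeaders["include-attribution-match-key"];
  if (!provider) return {}; // No opt-in on the redirect: send nothing.
  let key;
  try {
    key = store.get(provider);
  } catch {
    return {}; // Malformed provider origin: ignore the opt-in.
  }
  if (key === undefined) return {}; // No key set for this provider.
  const b64 = Buffer.from(encrypt(key)).toString("base64");
  return { "Sec-Attribution-Match-Key": `:${b64}:;provider="${provider}"` };
}

// Usage: a provider sets a key, and a later 302 opts in to it.
const store = new MatchKeyStore();
store.set("https://provider.example/login", "user-123");
const placeholderEncrypt = (value) => Buffer.from(value); // NOT real encryption
console.log(
  headersForRedirect(
    store,
    { location: "https://redirect.example/", "include-attribution-match-key": "https://provider.example" },
    placeholderEncrypt
  )
);
```

A redirect without the opt-in header yields no extra headers, which matches the intent of not sending match keys by default.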

martinthomson commented 1 year ago

How critical is it to have the ability to set match keys in HTTP? It seems to me like the primary API here is the "include attribution match key" instructions for fetch and 302 responses.

csharrison commented 1 year ago

> How critical is it to have the ability to set match keys in HTTP? It seems to me like the primary API here is the "include attribution match key" instructions for fetch and 302 responses.

We've seen a few reasons why an HTTP-based API is preferred over a JS-based one:

  1. Some deployments don't have any JS at all (e.g. pixel tags). For some advertisers like banks this is part of their security stance (no 3P JS at all). It's worth noting that PCM's design also supports this use-case for trigger registration.
  2. Some report collectors might not be embedded on the page at all, they are only notified of events via HTTP redirects. This is a primary method for 3rd party measurement providers to be deployed. In IPA, an advertiser may want to have that third party provider be (one of) their report collector(s).
  3. It is often useful to be able to have more fine-grained attribution (for lack of a better word) about a caller than just the document origin. It isn't clear how getEncryptedMatchKey works but if the report collector is implicitly specified based on the document origin that's hard to adopt. Much easier to use the origin associated w/ a request.
  4. Most existing ads systems work off of HTTP, so it'll be easier to adopt. In practice, if encrypted match keys are delivered directly to JS, they will most likely find their way to ad-tech servers via getting embedded in the URL.

martinthomson commented 1 year ago

Those reasons all apply to the use of match keys more than the setting of them. My question was about setting. I have no concern about providing an HTTP API (and DOM or Fetch hooks to match) for getting.

csharrison commented 1 year ago

Sorry, totally misread your comment 🤦. I agree setting should mostly be OK with JavaScript as long as there is a browser fallback match key that replaces the use of "pseudonymous" 3p cookies that are just browser ids.

martinthomson commented 1 year ago

So I think that we definitely need to work through doing this. Not all placements will have the necessary script access to be able to manage this properly. A few things to work through:

  1. Fetch integration is a bit of a pain. I say this with the greatest respect for those people that maintain fetch, but it is a giant Rube Goldberg machine that requires a fair bit of care in extending.
  2. Permissions Policy still needs to apply, even to fetches. Defaults and what needs to be configured mean more questions.
  3. Doing HTTP field definitions well is non-trivial.
  4. The origin stuff you refer to is going to be tricky to sort out. Knowing which fetches to decorate without adding more round trips sounds challenging. We could decorate all of those that are permitted, I guess, but that comes with its own costs.

Aside from that, as long as we can model this as calls to the API (or some logical abstract API) that happen automatically under predefined conditions, then we are in good shape.
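
One way to picture that model is a single predicate deciding when the browser calls the API on a request's behalf. The sketch below is hypothetical throughout: the `attribution-match-key` feature name, the shape of the `policy` and `request` objects, and the deny-by-default behavior are assumptions made for illustration, since defaults are exactly what remains to be worked out.

```javascript
// Hypothetical sketch: "attribution-match-key" is a made-up Permissions Policy
// feature name, and an empty allowlist (deny by default) is an assumption.
function shouldDecorate(request, policy) {
  // Condition 1: the request (or a preceding redirect) named a provider.
  if (!request.provider) return false;
  // Condition 2: the document's policy allows the request's origin.
  const allowed = policy["attribution-match-key"] ?? [];
  const origin = new URL(request.url).origin;
  return allowed.includes("*") || allowed.includes(origin);
}

// Models the header as an automatic call to the abstract API: the API stand-in
// is invoked only when both predefined conditions hold.
function decorate(request, policy, getEncryptedMatchKey) {
  if (!shouldDecorate(request, policy)) return { ...request.headers };
  return {
    ...request.headers,
    "Sec-Attribution-Match-Key": getEncryptedMatchKey(request.provider),
  };
}

// Usage: only the permitted origin gets the header.
const policy = { "attribution-match-key": ["https://collector.example"] };
const apiStandIn = (provider) => `opaque-key-for-${provider}`;
console.log(
  decorate(
    { url: "https://collector.example/pixel", provider: "https://provider.example", headers: {} },
    policy,
    apiStandIn
  )
);
```

Because the API stand-in is only called when both conditions hold, this avoids decorating every permitted fetch, at the cost of needing the opt-in to be known before the request is sent.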

To be clear, I have no interest in doing that work for setting match keys. I might even be opposed to adding that in, but mostly just on the grounds of platform complexity.