privacycg / private-click-measurement

Private Click Measurement
https://privacycg.github.io/private-click-measurement/

Modern JS API design, additional privacy considerations #37

Open masegraye opened 4 years ago

masegraye commented 4 years ago

TL;DR:

Q1: For the JS API, does PCM expect advertisers to calculate the conversion bits dynamically (e.g., potentially delegate the specific strategy to the ad click source)? Or does PCM want to encourage this code to only be loaded from a 1st party (advertiser) script?

Idealized example, on an adDestination page:

// can facebook.com specify this result?
let [facebookConversionBits, facebookPriority] = await getBitsFromFacebook('Purchase', 10.00); 

// can google.com specify this result?
let [googleConversionBits, googlePriority] = await getBitsFromGoogle('Purchase'); 
window.tracking.fireConversion(facebookConversionBits, facebookPriority, "facebook.com");
window.tracking.fireConversion(googleConversionBits, googlePriority, "google.com");

Q2: Are there any privacy considerations not present in the PCM draft that will inform the shape and function of the JS API? Have those been enumerated somewhere?

Context:

As we explore PCM and look at supporting it via the legacy API, we want to make sure we’re not taking a hard dependency on functionality offered by the legacy API that might be intentionally absent in the JS API.

For example, in WICG/ad-click-attribution#31, @johnwilander notes:

We don't want more third-party scripts in the first-party context either which is why we intend to have an optional restriction on the calls to this JavaScript API where cross-site scripts will not be able to use the API. There are no browser engines with JavaScript source tracking yet, but there may be in the future and we should acknowledge that to disincentivize sites from importing cross-site scripts to do conversion calls.

[...]

We want there to be a JavaScript API for this and that browsers are allowed to restrict calls to that API to only true first-party scripts.

One of the things we’ve been exploring is being able to dynamically determine the conversion bit strategy on a per-eTLD+1 basis, possibly on the number and types of campaigns that are active for that eTLD+1.

The legacy API allows this because the legacy pixel calls may self-report the domain, conversion type, conversion value, etc., all of which could subsequently be packed into an appropriate 6-bit representation, informed by the aggregate campaign information that we (the ad click source) have at the time we receive the pixel fire.
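
As a rough illustration of the packing step described above, here is a minimal sketch. The bit layout (2 bits of conversion type, 4 bits of value bucket), the type list, and the function names are all assumptions made for this example; none of them come from the PCM draft.

```javascript
// Hypothetical layout (an assumption, not from the PCM draft):
// bits 4-5 hold the conversion type, bits 0-3 hold a value bucket.
const CONVERSION_TYPES = ["PageView", "AddToCart", "Subscribe", "Purchase"];

// Map a monetary value to one of `bucketCount` buckets of width `bucketSize`.
function valueBucket(value = 0, bucketSize = 10, bucketCount = 16) {
  const bucket = Math.floor(Math.max(0, value) / bucketSize);
  // Clamp so that very large values land in the last bucket.
  return Math.min(bucket, bucketCount - 1);
}

// Pack a self-reported conversion type and value into a 6-bit integer (0-63).
function packConversionBits(type, value) {
  const typeIndex = CONVERSION_TYPES.indexOf(type);
  if (typeIndex === -1) {
    throw new RangeError(`Unknown conversion type: ${type}`);
  }
  return (typeIndex << 4) | valueBucket(value);
}

// Reverse the packing, e.g. when reading an attribution report.
function unpackConversionBits(bits) {
  return { type: CONVERSION_TYPES[bits >> 4], bucket: bits & 0b1111 };
}
```

With this layout, `packConversionBits('Purchase', 10.00)` yields 49: type index 3 in the upper two bits, bucket 1 in the lower four.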

But given John’s comment, I’m curious if this is viable long-term. Would there be a distinction between “scripts that can trigger the conversion API” and “scripts that can configure the conversion bit strategy”? Are there any assumptions/goals around explicitly preventing network calls in the code path that determines the conversion bits?

masegraye commented 4 years ago

Hi @johnwilander - any guidance you could offer here is appreciated.

johnwilander commented 4 years ago

Hi and thanks for filing! Sorry for the delay. It was a little hard for me to follow this so I had to make some time to read it slowly.

TL;DR:

Q1: For the JS API, does PCM expect advertisers to calculate the conversion bits dynamically (e.g., potentially delegate the specific strategy to the ad click source)? Or does PCM want to encourage this code to only be loaded from a 1st party (advertiser) script?

I don't follow how "calculate dynamically" is linked to the party serving the script. Both a first and a third party could dynamically or statically set the bits.

Idealized example, on an adDestination page:


// can facebook.com specify this result?
let [facebookConversionBits, facebookPriority] = await getBitsFromFacebook('Purchase', 10.00); 

You'd have to assume Facebook doesn't get access to any of its first-party state which would potentially identify the user here so that has to be a static function. I don't really see why a third-party would decide what the conversion bits should be. A conversion is solely an adDestination first party matter. Or do you see it otherwise?

// can google.com specify this result?
let [googleConversionBits, googlePriority] = await getBitsFromGoogle('Purchase');
window.tracking.fireConversion(facebookConversionBits, facebookPriority, "facebook.com");
window.tracking.fireConversion(googleConversionBits, googlePriority, "google.com");



Q2: Are there any privacy considerations not present in the PCM draft (https://wicg.github.io/ad-click-attribution/index.html) that will inform the shape and function of the JS API? Have those been enumerated somewhere?

Context:

As we explore PCM and look at supporting it via the legacy API, we want to make sure we’re not taking a hard dependency on functionality offered by the legacy API that might be intentionally absent in the JS API.

For example, in #31, @johnwilander notes:

We don't want more third-party scripts in the first-party context either which is why we intend to have an optional restriction on the calls to this JavaScript API where cross-site scripts will not be able to use the API. There are no browser engines with JavaScript source tracking yet, but there may be in the future and we should acknowledge that to disincentivize sites from importing cross-site scripts to do conversion calls.

[...]

We want there to be a JavaScript API for this and that browsers are allowed to restrict calls to that API to only true first-party scripts.

When we're discussing this, I try to take a fairly long perspective to make sure the modern API can serve the web well for years to come.

A general movement to ensure good privacy on the web is and has been to reduce third-party powers over websites. For the purposes of who can call an API, such power reduction can be achieved by removing third-party scripts in the first party context or by limiting the powers of third-party scripts in the first party context.

The latter is already being discussed in the W3C Privacy CG through Brave's proposal called JS Membranes. We also had a lengthy discussion on the topic at the Dagstuhl Seminar on web application security in 2018.

Signaling a conversion and thus spending any pending ad clicks should be under the true first party's control. Thus, I'd like to specify the modern triggering API so that browsers are allowed to restrict calls to it to the true first party.

One of the things we’ve been exploring is being able to dynamically determine the conversion bit strategy on a per-eTLD+1 basis, possibly on the number and types of campaigns that are active for that eTLD+1.

The legacy API allows this because the legacy pixel calls may self-report the domain, conversion type, conversion value, etc., all of which could subsequently be packed into an appropriate 6-bit representation, informed by the aggregate campaign information that we (the ad click source) have at the time we receive the pixel fire.

But given John’s comment, I’m curious if this is viable long-term. Would there be a distinction between “scripts that can trigger the conversion API” and “scripts that can configure the conversion bit strategy”? Are there any assumptions/goals around explicitly preventing network calls in the code path that determines the conversion bits?

I don't think we're exploring restrictions on data flow so that a third-party would not be able to supply the bits for the conversion ID. That's an interesting idea but it's not being considered right now. However, allowing a browser to say that only scripts from shop.example are allowed to trigger conversions on shop.example is on the table.
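
To make the restriction under discussion concrete, here is a minimal sketch of the kind of check a browser might apply. It is purely illustrative: the function name `mayTriggerConversion` is hypothetical, the site's registrable domain is passed in rather than derived from the Public Suffix List, and, as the quoted comment says, no engine tracks script sources this way today.

```javascript
// Returns true when the script was served from the first-party site itself
// or from one of its subdomains. `firstPartySite` is assumed to already be
// the registrable domain (eTLD+1), e.g. "shop.example".
function mayTriggerConversion(scriptUrl, firstPartySite) {
  const scriptHost = new URL(scriptUrl).hostname;
  return (
    scriptHost === firstPartySite ||
    scriptHost.endsWith(`.${firstPartySite}`)
  );
}

// A script hosted on a subdomain of the shop counts as first-party here.
console.log(mayTriggerConversion("https://cdn.shop.example/checkout.js", "shop.example"));
// A script from an unrelated site would be refused under this check.
console.log(mayTriggerConversion("https://ads.tracker.example/pixel.js", "shop.example"));
```

The leading dot in the suffix comparison matters: without it, a host such as `notshop.example` would be treated as part of `shop.example`.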

masegraye commented 4 years ago

Thanks for your reply. Sorry my last post was hard to follow. I’ll try to be more direct.

You'd have to assume Facebook doesn't get access to any of its first-party state which would potentially identify the user here so that has to be a static function. I don't really see why a third-party would decide what the conversion bits should be. A conversion is solely an adDestination first party matter. Or do you see it otherwise?

Sort of. There are three reasons to delegate here: best representation, ease of updates, and ease of migration.

Best Representation

An adDestination is the best party to decide a semantic conversion—what button represents a purchase, what field submission represents a subscription, etc. But that’s different from deciding the PCM bit-representation, which could vary per adSource.

An example: If a marketer runs 1 campaign on Google for just 4 pairs of shoes, it may be better to use those 12 bits to represent very granular price buckets. If they run 1 campaign on Facebook for men’s clothing, it may be better to use the bits to represent purchases by category with less granular value buckets. Both are ‘Purchases’, but the marketer is trying to glean different information from each. An advertiser may run one campaign or many, on one ad network or many at the same time.

Given this, it’s useful to be able to specify the conversion bit encoding strategy on a per-adSource basis, rather than per-adDestination basis, since the encoding can be tweaked to align with the specific marketing activity. For a given adDestination, a) Facebook’s best representation of a purchase in that 12-bit form may not be the same as Google’s, and b) either may independently vary over time. This is a given with the legacy API. I want to make sure it will carry through to the modern JS API. (Or understand why it won’t.)
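
A minimal sketch of what per-adSource encoding could look like follows. Everything in it is an assumption made for illustration: the bit width, the bucket sizes, the category list, and the names `strategiesByAdSource` and `encodeConversion`. The point is only that the same semantic 'Purchase' event can map to different bits depending on the ad click source.

```javascript
const CONVERSION_BITS = 6; // assumed width; adjust to whatever the spec allows
const MAX_CONVERSION_VALUE = (1 << CONVERSION_BITS) - 1; // 63

// Hypothetical category list; index 0 is the fallback for unknown categories.
const CATEGORIES = ["other", "shirts", "pants", "shoes"];

const strategiesByAdSource = new Map([
  // Few products: spend every bit on fine-grained price buckets of 5 units.
  ["google.com", (event) =>
    Math.min(Math.floor(event.value / 5), MAX_CONVERSION_VALUE)],
  // Broad catalog: 2 bits of category plus 4 bits of coarse value (50 units).
  ["facebook.com", (event) => {
    const category = Math.max(0, CATEGORIES.indexOf(event.category));
    const coarseValue = Math.min(Math.floor(event.value / 50), 15);
    return (category << 4) | coarseValue;
  }],
]);

// The adDestination reports one semantic event; the adSource picks the encoding.
function encodeConversion(adSource, event) {
  const strategy = strategiesByAdSource.get(adSource);
  if (!strategy) {
    throw new Error(`No encoding strategy for ${adSource}`);
  }
  return strategy(event);
}

const purchase = { name: "Purchase", value: 120, category: "shoes" };
console.log(encodeConversion("google.com", purchase));
console.log(encodeConversion("facebook.com", purchase));
```

Because the strategies live in one table keyed by adSource, either one can change over time without touching the code that reports the purchase.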

Ease of Updates

While useful, this per-adSource strategy also gets complicated quickly. Marketers are not developers. Deciding a bit encoding strategy is not their core competency, and neither is making code changes to support it. And as soon as a developer needs to be looped into a marketing activity, you have a lot of losers and no real winners. It’s pure process overhead. Marketers can’t get their campaigns out quickly. Developers are randomized by experimental marketing tweaks. Users receive no increase in privacy.

By delegating this to a 3rd party, Facebook can remove complexity for advertisers while meeting PCM's stated goals. Again, this is a given with the legacy API.

Ease of Migration

Signaling a conversion and thus spending any pending ad clicks should be under the true first party's control. Thus, I'd like to specify the modern triggering API so that browsers are allowed to restrict calls to it to the true first party. … I don't think we're exploring restrictions on data flow so that a third-party would not be able to supply the bits for the conversion ID. That's an interesting idea but it's not being considered right now. However, allowing a browser to say that only scripts from shop.example are allowed to trigger conversions on shop.example is on the table.

Does this sentiment apply to the call stack initiator, the immediate caller, or the entire call stack? For example, can the call stack be 1P -> 3P -> API? Or does it need to be 1P -> 1P -> … -> API?

You can imagine Facebook’s existing fbevents.js being repurposed as a protocol adaptor for PCM. Millions of websites currently have 1P annotations of fbq('track', 'Purchase'), which calls the 3P fbevents script to make today’s legacy pixel call. This script could instead call the JS API directly, without the legacy pixel call, i.e., 1P -> 3P -> Modern API.
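
The adaptor idea can be sketched as follows. This is not how fbevents.js works. `createPixelAdapter` and the `encode` callback are hypothetical names, and `fireConversion(bits, priority, adSource)` simply mirrors the idealized call from the first post rather than any shipped browser API. A stub stands in for `window.tracking` so the sketch runs outside a browser.

```javascript
// Build an fbq-style function that forwards 'track' calls to a PCM-like API.
// `tracking` is any object exposing the hypothetical fireConversion method.
function createPixelAdapter(tracking, adSource, encode) {
  return function track(command, eventName, params = {}) {
    if (command !== "track") {
      return false; // commands other than 'track' are not modeled here
    }
    // The third-party script, not the page, decides the bit encoding.
    const { bits, priority } = encode(eventName, params);
    tracking.fireConversion(bits, priority, adSource);
    return true;
  };
}

// Stub standing in for the browser-provided API.
const calls = [];
const stubTracking = { fireConversion: (...args) => calls.push(args) };

// Toy encoder: only purchases get a non-zero value.
const encodePurchaseOnly = (eventName) =>
  eventName === "Purchase" ? { bits: 1, priority: 10 } : { bits: 0, priority: 0 };

const fbq = createPixelAdapter(stubTracking, "facebook.com", encodePurchaseOnly);

// The existing first-party annotation stays unchanged: 1P -> 3P -> API.
fbq("track", "Purchase");
console.log(calls);
```

The first-party call site is untouched. Only the third-party script changes what happens behind it, which is why a strict first-party-only rule on the immediate caller would block this path.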

This helps reduce the number of events lost due to network connections closing at the point of navigation, brings us into closer alignment with PCM’s long-term goals, and helps us move a whole bunch of users over to PCM’s ‘modern’ API in one cutover without requiring many thousands of advertisers to update their websites.

But this only works if 1P won’t be a hard requirement. Is the thinking that this restriction is something a site would signal to a browser via a CSP-like mechanism (like JS Membranes)? Or is this a browser-only decision?

masegraye commented 4 years ago

@johnwilander Any early thoughts/feedback on my Ease of Migration section? Is the intent for the modern API spec to say "browsers may let site owners require 1st party access only" or "browsers may require 1st party access only"?