WICG / turtledove

TURTLEDOVE
https://wicg.github.io/turtledove/

Securely Propagating Auction Signals #119

Open jeffkaufman opened 3 years ago

jeffkaufman commented 3 years ago

We're considering integrating FLEDGE with a flow that looks like:

  1. The ad tag on the client sends a traditional contextual ad request.

  2. It receives a contextual ad response, and also auction signals generated on the server, such as the estimated likelihood of the slot being viewable.

  3. The tag invokes the FLEDGE APIs to run an interest group auction, providing those signals.
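
The three steps above can be sketched roughly as follows. This is only an illustration: the response shape, the `ssp.example` and `dsp.example` origins, and the `buildAuctionConfig` helper are hypothetical, and the config field names follow the FLEDGE explainer as it stood at the time, so they may differ in current versions. Note that the signals pass through ordinary page JavaScript here, which is the exposure this issue is about.

```javascript
// Hypothetical shape of the contextual ad response received in step 2.
const contextualResponse = {
  contextualAd: { creativeUrl: "https://ssp.example/creative/123" },
  auctionSignals: { estimatedViewability: 0.72 },
};

// Step 3: fold the server-generated signals into the auction config.
// buildAuctionConfig is an illustrative helper, not part of any API.
function buildAuctionConfig(response) {
  return {
    seller: "https://ssp.example",
    decisionLogicUrl: "https://ssp.example/decision-logic.js",
    interestGroupBuyers: ["https://dsp.example"],
    // Readable by any script sharing this JS context.
    auctionSignals: response.auctionSignals,
  };
}

const auctionConfig = buildAuctionConfig(contextualResponse);
// In a browser, the tag would now call navigator.runAdAuction(auctionConfig).
```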

We are concerned that other scripts on the page could extract auction signals from the contextual ad responses. These signals are the outputs of complex proprietary models, and access to high-quality bidding signals is one reason buyers choose a particular sell-side platform.

Could FLEDGE provide a path for the server to provide auction signals to the browser auction, without making them accessible to other scripts on the publisher page? (We understand that this would not protect signals from headless browsers or manual inspection, and we think these vectors should be followed up separately.)

One way this could work would be if the worklet processing the decision logic began executing before the auction. The worklet could send a request to the ad server to fetch additional signals to append to auction_signals, which the server would restrict to the worklet's origin with an ACAO response header. If other scripts on the page attempted to read the response, they would be blocked by the same-origin policy.

Unfortunately this requires an additional round trip to fetch the signals. We could avoid that by having the contextual ad response be a web bundle, where the signals are provided in an opaque-origin resource, with an ACAO header limiting response access to the seller's origin. The ad tag could pass the opaque resource's URL to the decision logic worklet through auction_signals, and the signals fetch could be fulfilled from the bundle.

michaelkleber commented 3 years ago

Just to be sure I understand: your goal is for the seller to receive some opaque blob with the contextual response, which the seller's domain (e.g. its worklet) can open up and then pass along to the buyers in its auction? So you are trying to protect the contents of this blob from use outside the auction you're running, but are willing to allow free use within that auction?

This seems entirely reasonable as a goal. I'll need to work with some other Chrome engineers to figure out the technical details of how to support it, though.

jeffkaufman commented 3 years ago

Yes, that's right!

jeffkaufman commented 2 years ago

@michaelkleber This is still something we're interested in -- is this something you've been able to give more thought to?

I think this is probably something that could be built on top of Subresource Bundles.

michaelkleber commented 2 years ago

Hi @jeffkaufman : We have not been pursuing this approach. I presume, from your bumping this issue, that the use cases you're considering cannot be addressed instead by use of the seller's Key-Value server?

@jensenpaul if you have thoughts on how to consider this for future planning.

jeffkaufman commented 2 years ago

Yes, we have signals that we don't want to expose to other scripts running in the publisher JS context, and these signals depend on the contextual ad request and so can't come from the K-V server.

sbelov commented 2 years ago

Being able to privately propagate auction signals to the FLEDGE auction also seems relevant in the context of discussions in #59 and #202 on how multiple sellers might be supported: different sellers who work with a given publisher and may have code running on that publisher’s pages might wish to keep their own auction signals private and not readable by other sellers. While a seller could invoke runAdAuction within an iframe, isolating their signals from anyone outside the iframe, iframe-based isolation does not seem possible with some of the multi-seller support proposals.

JensenPaul commented 2 years ago

When the contextual signals are returned (the ones that you’re asking to securely propagate), can they be exposed to JS momentarily, so they can be passed from, for example, the XHR result to a new API to convert them to an opaque blob?

Is the goal that one blob of signals is passed to all of a seller’s bidders?

jeffkaufman commented 2 years ago

@JensenPaul if the signals are exposed momentarily, then other JS running on the page can read them, so I don't think that works? Let me write something up describing a few ideas for how to implement this and get back to you?

Is the goal that one blob of signals is passed to all of a seller’s bidders?

Some signals are for a seller's bidders (ex: "how likely is this slot to meet the ActiveView criteria") and those could go to all. Other signals are for the seller themself in scoring bids (ex: "how valuable is this slot on this particular page right now").

jeffkaufman commented 2 years ago

@JensenPaul Ok, here are four potential approaches to protecting server-generated contextual signals from scripts running on the publisher page:

  1. Run the auction inside a cross-domain iframe. The seller can either request the signals from within the iframe or with <iframe src="https://signals-url">. Unfortunately, this only works for the single-seller case. If you have multiple sellers (component auctions, #251) you have the problem that all of the signals, at some point, need to end up in the same JS context so they can be passed to runAdAuction.

  2. Use cryptography. The browser could provide a public key (per-site or per-pageview), and sellers could include that key on their requests for contextual signals. The key would be supplied in a header, to prevent an MITM attack where attacking JS substitutes a different public key. In their responses sellers could include signals encrypted against that key. Each seller would include these encrypted signals in their auction configuration, and the browser would decrypt them before making them available to the worklets. On the other hand, cryptography should not be necessary for this use case, since sellers just need some way to provide signals to the browser with a request that they only be exposed to their Turtledove worklets.

  3. Use additional network requests. The API to initiate an auction could be extended to allow each seller to provide some signals by URL. The browser would fetch these URLs and make the results available to the seller's worklets, with a request header like Sec-Fetch-Dest: turtledove so the seller would know that their responses would not be accessible to non-Turtledove readers. This does add latency, however, with the signals fetch requiring a round trip to the server before the auction can begin.

  4. Use subresource bundles, as in https://github.com/WICG/webpackage/issues/624. This is an extension of (3) that fixes the latency issue. Each seller would format their contextual response as a web bundle, which would include both their contextual ads and opaque turtledove signals. Each component would be identified by a distinct uuid-in-package: URL. They would pass the URL to the turtledove signals into the auction, which would proceed as in (3).

Since we want a solution that handles component auctions and performs well, I think the strongest options are (2) and (4). Of these, since (2) requires the browser to generate public keys and implement a new cryptographic protocol, that pushes strongly in favor of (4).
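
To make approach (3) concrete, here is a minimal sketch of the seller-side check, assuming the browser really did send `Sec-Fetch-Dest: turtledove` on the signals fetch. That destination value is only the one floated in this thread, not a registered one, and `signalsResponseFor` is a hypothetical helper. The check is meaningful because `Sec-`-prefixed headers are forbidden header names, so page scripts cannot set them on their own requests.

```javascript
// Hypothetical server-side gate for approach (3).
// requestHeaders: plain object with lower-cased header names.
// "turtledove" is the destination value proposed in the thread (an assumption).
function signalsResponseFor(requestHeaders, signals) {
  const dest = (requestHeaders["sec-fetch-dest"] || "").toLowerCase();
  if (dest !== "turtledove") {
    // Any other reader (fetch(), XHR, <script>, ...) gets nothing.
    return { status: 403, body: null };
  }
  return { status: 200, body: JSON.stringify(signals) };
}
```

A request made with `fetch()` from page JS would carry `Sec-Fetch-Dest: empty` and be refused under this sketch.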

Aside, on proxy-based attacks: every approach here is somewhat vulnerable to a hostile seller running JS on the page. If the attacker is willing to run a proxy and impersonate the browser they can override the API to create the iframe, the API used to call the ad server, or the runAdAuction API, substituting their own URLs that route through the proxy. Running a proxy attack at any appreciable scale, however, would be highly visible, and the sellers' existing anti-fraud systems would be able to detect and respond. Since the issue here is routine signal leakage, let's set aside these attacks as out of scope.

JensenPaul commented 2 years ago

if the signals are exposed momentarily, then other JS running on the page can read them, so I don't think that works?

If the fetch of the signals happened in an iframe, would that secure them from other scripts on the page? Perhaps the browser could offer something akin to postMessage() to securely feed the signals from the iframe into a FLEDGE auction bidder or seller?

jeffkaufman commented 2 years ago

I think something like that could work, though it does add additional latency relative to (4). I think this would require adding something to the auction config saying that additional signals are expected, so the browser knows to delay starting the auction until the signals arrive? Spitballing an API, the publisher page could call:

navigator.runAdAuction({
  ...
  asyncSignalsToken: "random token",
});

Then whatever iframe the seller configures can run:

navigator.provideAuctionSignals("random token", {
  extraSellerSignals: {...},
  extraPerBuyerSignals: {
     "dsp1 origin": {...},
     "dsp2 origin": {...},
  },
});

These can be called in either order: if runAdAuction goes first it waits for provideAuctionSignals before starting, and vice versa.
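
The "either order" behaviour can be modelled with a small in-memory mock. This is not the browser API, only a sketch of the rendezvous semantics under the assumption that a token identifies exactly one auction; the `pending` map and `slotFor` helper are made up for illustration.

```javascript
// In-memory mock of the proposed token rendezvous (not a browser API).
const pending = new Map();

// Return the shared slot for a token, creating it on first use by either side.
function slotFor(token) {
  if (!pending.has(token)) {
    let resolve;
    const promise = new Promise((r) => { resolve = r; });
    pending.set(token, { promise, resolve });
  }
  return pending.get(token);
}

// Called from the seller's iframe.
function provideAuctionSignals(token, signals) {
  slotFor(token).resolve(signals);
}

// Called from the publisher page; waits until the signals have arrived.
async function runAdAuction(config) {
  const extra = await slotFor(config.asyncSignalsToken).promise;
  pending.delete(config.asyncSignalsToken);
  // A real auction would run here; the mock just returns the merged inputs.
  return { ...config, ...extra };
}
```

Whichever function runs first creates the slot, so neither caller needs to know the order.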

One thing I like about this API is that it supports a (4)-like flow where the signals are returned as an html resource within a webbundled contextual ad response, minimizing the latency impact.

JensenPaul commented 2 years ago

Can you describe where the additional latency concern comes from? Is this due to the iframe requirement?

Could we simplify your API by having the "random token" get returned from provideAuctionSignals() rather than passed in? This precludes runAdAuction() being called first, but I think it makes it simpler and more straightforward to use and implement.

jeffkaufman commented 2 years ago

Can you describe where the additional latency concern comes from? Is this due to the iframe requirement?

Yes: creating an iframe and waiting for it to run JS is going to add latency.

Could we simplify your API by having the "random token" get returned from provideAuctionSignals() rather than passed in?

That would work, but it would add even more latency because of the postMessage requirement. It would require a flow like:

  1. Page creates iframe
  2. iframe calls provideAuctionSignals and then postMessage with the token
  3. Page receives token, calls runAdAuction
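
For comparison, the token-returning variant might look like the mock below. Again this is a sketch rather than a real API: the counter-based token is a stand-in (a real token would need to be unguessable), and the `postMessage` hop between steps 2 and 3 is only indicated by a comment.

```javascript
// Mock of the variant where provideAuctionSignals() returns the token.
const stored = new Map();
let nextId = 0;

function provideAuctionSignals(signals) {
  const token = `token-${nextId++}`; // placeholder; must be unguessable in practice
  stored.set(token, signals);
  return token;
}

function runAdAuction(config) {
  if (!stored.has(config.asyncSignalsToken)) {
    throw new Error("unknown signals token");
  }
  const extra = stored.get(config.asyncSignalsToken);
  stored.delete(config.asyncSignalsToken); // single use
  return { ...config, ...extra };
}

// Step 2 (inside the iframe): register signals and obtain the token.
const token = provideAuctionSignals({ extraSellerSignals: { slotValue: 1.5 } });
// ...the iframe would postMessage the token to the page here...
// Step 3 (on the page): start the auction with the received token.
const auction = runAdAuction({ seller: "https://ssp.example", asyncSignalsToken: token });
```

Because the page cannot call `runAdAuction` until the token arrives, the iframe load and the message hop sit on the critical path, which is the extra latency described above.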

caraitto commented 2 years ago

Subresource bundles are now in origin trial (M90-M101) in Chrome.

Perhaps a hybrid postMessage / subresource bundles approach might make sense? Basically, if subresource bundles are available (this would be a runtime check during subresource bundles OT), we allow subresource bundle UUIDs in provideAuctionSignals() parameters:

navigator.provideAuctionSignals("random token", {
  extraSellerSignals: {...},  // Or "[Bundle UUID]"
  extraPerBuyerSignals: {
     "dsp1 origin": {...},
     "dsp2 origin": "Bundle UUID",
  },
});

The bundle UUID would resolve to a JSON resource -- the worklet doesn't know or care if the signals came from a bundle or from a JS object. The same behavior of delaying the auction until all signals have been received would apply.
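
A rough sketch of how such a hybrid value could be resolved before the worklet sees it, assuming string values are bundle references and everything else is an inline JS object. The `Map` standing in for the bundle, the example UUID, and both helper names are hypothetical.

```javascript
// Resolve one signals value: a string is treated as a bundle resource URL,
// anything else is assumed to be an inline JSON-serializable object.
function resolveSignals(value, bundle) {
  if (typeof value === "string") {
    if (!bundle.has(value)) {
      throw new Error(`no bundle resource for ${value}`);
    }
    return JSON.parse(bundle.get(value));
  }
  return value;
}

// Resolve a whole provideAuctionSignals()-style argument.
function resolveAll(extra, bundle) {
  const perBuyer = {};
  for (const [origin, value] of Object.entries(extra.extraPerBuyerSignals || {})) {
    perBuyer[origin] = resolveSignals(value, bundle);
  }
  return {
    extraSellerSignals: resolveSignals(extra.extraSellerSignals, bundle),
    extraPerBuyerSignals: perBuyer,
  };
}

// Stand-in for a web bundle: resource URL -> JSON text (made-up UUID).
const exampleBundle = new Map([
  ["uuid-in-package:11111111-2222-4333-8444-555555555555", '{"floor": 0.5}'],
]);
```

After resolution the worklet receives plain objects either way, matching the "doesn't know or care" property described above.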

If / when subresource bundles become standardized, I think we wouldn't need provideAuctionSignals() -- we could just provide the UUIDs to runAdAuction(), since the contents of the web bundle shouldn't be loaded into the renderer process, so the cross-origin iframe approach wouldn't be necessary for isolation. But, I think the hybrid approach allows taking advantage of the benefits of subresource bundles when available without requiring everyone to migrate. (Although, IIUC adopting subresource bundles for this purpose doesn't seem too difficult, assuming it's available).

Of course, this approach has more complexity (needing to deal with both subresource bundles and delaying the auction), which would be good to avoid if the benefits aren't necessary.

caraitto commented 2 years ago

Alternatively, another hybrid approach would be to allow extraSellerSignals / extraPerBuyerSignals to be passed to runAdAuction(), but with UUID values. provideAuctionSignals() would still exist, but it'd accept JSON-serializable objects. Then, we could make it an error to specify both asyncSignalsToken and one or more of extraSellerSignals / extraPerBuyerSignals.
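
The mutual-exclusion rule could be enforced with a check along these lines. It is a sketch only: `validateAuctionConfig` is a hypothetical helper, and it assumes the rule is that `asyncSignalsToken` may not be combined with `extraSellerSignals` or `extraPerBuyerSignals`.

```javascript
// Hypothetical validation for the second hybrid approach: signals arrive
// either via a token or inline as UUID values, never both.
function validateAuctionConfig(config) {
  const hasToken = config.asyncSignalsToken !== undefined;
  const hasInline =
    config.extraSellerSignals !== undefined ||
    config.extraPerBuyerSignals !== undefined;
  if (hasToken && hasInline) {
    throw new TypeError(
      "asyncSignalsToken cannot be combined with extraSellerSignals / extraPerBuyerSignals"
    );
  }
  return config;
}
```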

caraitto commented 2 years ago

I'm going to try prototyping the subresource bundle portion of the hybrid approach above (extraSellerSignals / extraPerBuyerSignals passed to runAdAuction()). I can follow up with implementing the provideAuctionSignals() side if there's interest.

caraitto commented 2 years ago

@jeffkaufman A minor semantic clarification: I think the ACAO response header from CORS doesn't have the ability to restrict access -- on the contrary, it allows access to a resource like application/json files / subresources that have already been restricted by the same-origin policy (and also CORB; CORB will prevent the cross-site JSON from being sent to the renderer process, even if a request is made via fetch(),