privacysandbox / protected-auction-key-value-service

Protected Auction Key/Value Service
Apache License 2.0

Seeking feedback on propagating contextual signals to TEE KV server #72

Open peiwenhu opened 3 weeks ago

peiwenhu commented 3 weeks ago

Note: This issue describes possible new functionality for Protected Audience. It is not necessarily a feature that will ship in the near future, but we are considering it as an optional, backwards-compatible API improvement. We are interested in hearing feedback on this post.

Context

This post seeks feedback on the concept of propagating contextual signals as part of the TKV (Trusted Key Value server) request. It mostly discusses on-device auctions, but we are also considering the same capabilities for the Bidding & Auction services.

Introduction

Today the SSP can provide contextual information to PA auctions by specifying perBuyerSignals and sellerSignals in the Auction Config. The content of the signals can come from various sources including the contextual auction responses and information on the page. But these signal fields are only available to generateBid() or scoreAd(), not the key value service. There is only very limited contextual information sent in the TKV request, such as hostname and ad slot size.

This poses some challenges to the design for certain use cases:

The list is not exhaustive.

The flow

Diagram: example flow related to the DSP KV server

In this post we envision 2 new fields in AuctionConfig:

This allows the SSP to specify additional contextual information to the TKV server.

This information is only available in the request to the TEE-based implementation of the KV server. BYOS KV servers do not have access to this information.
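
To make the shape concrete, here is a minimal sketch, written in Python purely for illustration (a real auction config is a JavaScript object). Only perBuyerSignals, sellerSignals and perBuyerTKVSignals are names taken from this post. The origins, the values, the per-buyer keying of perBuyerTKVSignals, and the helper build_kv_request with its kv_mode parameter are all assumptions, not a proposed API.

```python
# Hypothetical auction config modeled as a dict. All values are made up,
# and keying perBuyerTKVSignals by buyer origin is an assumption.
auction_config = {
    "seller": "https://ssp.example",
    "interestGroupBuyers": ["https://dsp-a.example", "https://dsp-b.example"],
    "perBuyerSignals": {"https://dsp-a.example": {"floor": 0.5}},
    "sellerSignals": {"slot": "top-banner"},
    "perBuyerTKVSignals": {
        "https://dsp-a.example": {"pageUrl": "https://publisher.example/news/1"},
    },
}


def build_kv_request(config, buyer, kv_mode):
    """Build an illustrative trusted bidding signals request for one buyer.

    kv_mode is a hypothetical switch: "TEE" for the TEE-based KV server,
    "BYOS" for a bring-your-own-server deployment.
    """
    # hostname stands in for the limited context that is sent today.
    request = {"buyer": buyer, "hostname": "publisher.example"}
    tkv_signals = config.get("perBuyerTKVSignals", {}).get(buyer)
    # Only the TEE-based server receives the new signals; BYOS never does.
    if kv_mode == "TEE" and tkv_signals is not None:
        request["perBuyerTKVSignals"] = tkv_signals
    return request
```

In this sketch a buyer with no entry in perBuyerTKVSignals gets the same request as today, which is one way to read the "optional, backwards-compatible" goal.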

Parallelization compatibility

Parallelization is an important performance optimization today where the contextual ad auction happens in parallel to other PA auction actions, particularly the TKV trusted bidding signal request.

Including perBuyerTKVSignals in the TKV request can affect parallelization. If the signals depend on the contextual auction response, the TKV request has to wait for the contextual auction to finish, forgoing parallelization. On the other hand, if perBuyerTKVSignals contains only information available on the page, such as the full page URL, parallelization remains effective.

The SSPs and DSPs need to make a tradeoff between having more signals and keeping parallelization effective.
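
The ordering difference between the two cases can be sketched with asyncio. This is only an illustration under stated assumptions: contextual_auction and tkv_request are hypothetical stand-ins for the real network calls, the sleeps stand in for latency, and the signal contents are invented.

```python
import asyncio


async def contextual_auction(log):
    """Stand-in for the contextual ad auction round trip."""
    await asyncio.sleep(0.01)
    log.append("contextual_done")
    return {"dealId": "hypothetical-deal"}


async def tkv_request(log, signals):
    """Stand-in for the trusted bidding signals request to the TKV."""
    log.append("tkv_sent")
    await asyncio.sleep(0.01)
    return {"signals_seen": signals}


async def run_with_page_signals(log):
    # Signals come from the page only, so both calls start together.
    page_signals = {"pageUrl": "https://publisher.example/a"}
    _, tkv = await asyncio.gather(
        contextual_auction(log), tkv_request(log, page_signals)
    )
    return tkv


async def run_with_contextual_signals(log):
    # Signals depend on the contextual response, so the TKV request waits.
    contextual = await contextual_auction(log)
    return await tkv_request(log, contextual)
```

Running each flow with an empty list for log shows the TKV request being sent before the contextual auction finishes in the first case, and only after it finishes in the second.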

There are a few potential mechanisms for putting this tradeoff under each DSP’s own control without impacting other buyers or the auction overall. One idea is for the SSP to create two component auctions: one for DSPs that prioritize parallelization and rely solely on on-device contextual signals, and another for DSPs willing to wait for the contextual auction results before sending their TKV requests. We would especially like feedback on this approach. Further discussion will help refine it and explore other configurations, such as consolidating the two auctions into a single one.
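
As a rough sketch of the two-component-auction idea, the SSP could partition buyers by their stated preference. The function name, the buyers_waiting_for_contextual parameter and the "parallel" / "after_contextual" labels are hypothetical, and the sketch does not check any constraints the browser places on component auctions.

```python
def split_into_component_auctions(seller, buyers, buyers_waiting_for_contextual):
    """Partition buyers into two illustrative component auction configs.

    buyers_waiting_for_contextual is a hypothetical set of buyer origins
    that opted to delay their TKV request until the contextual auction ends.
    """
    waiting = [b for b in buyers if b in buyers_waiting_for_contextual]
    parallel = [b for b in buyers if b not in buyers_waiting_for_contextual]
    return {
        # TKV requests for these buyers can be sent immediately.
        "parallel": {"seller": seller, "interestGroupBuyers": parallel},
        # TKV requests for these buyers wait for the contextual response.
        "after_contextual": {"seller": seller, "interestGroupBuyers": waiting},
    }
```

With this split, a slow contextual response only delays the buyers in the second group.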

Server side processing

The signals can be made available as metadata to the UDF running inside the TKV, similar to hostname and ad slot size. The rest of the flow is the same as today.

The new signals could enable more capabilities to happen within the TKV. A notable example is brand safety where the DSP TKV can perform some verification on the full publisher site URL to provide extra decision making input. We will describe more about this example in another feedback-seeking post.
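
A brand safety check of this kind might look roughly like the sketch below. It is not the TKV UDF API: handle_request, the metadata layout, the pageUrl key, the brandSafe output field and the blocklist are all assumptions made for illustration, and it is written in Python only to keep the example short.

```python
from urllib.parse import urlsplit

# Hypothetical DSP-maintained blocklist of publisher hostnames.
BLOCKED_HOSTS = {"unsafe.example"}


def handle_request(metadata, keys, store):
    """Illustrative UDF: look up keys and add a brand safety verdict.

    metadata is assumed to carry the new signals next to existing fields
    such as hostname; store is a plain dict standing in for the KV data.
    """
    page_url = metadata.get("perBuyerTKVSignals", {}).get("pageUrl")
    if page_url is None:
        brand_safe = None  # no signal supplied, so no verdict
    else:
        brand_safe = urlsplit(page_url).hostname not in BLOCKED_HOSTS
    return {
        "keyValues": {k: store[k] for k in keys if k in store},
        "brandSafe": brand_safe,
    }
```

The verdict is returned alongside the usual key/value lookup so that it can serve as an extra input to bidding, as described above.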

Since this information is only available to the TEE KV server, an adtech cannot make use of it today unless they migrate from BYOS to the TEE KV server. We are also interested in feedback on an intermediate setup before TEE KV enforcement. We call this setup a “Hybrid mode”: an adtech deploys both a BYOS system and a TEE KV system, and the two collaborate to complete a request. Some logic can stay in the BYOS stack while other logic is migrated to the TEE KV system first. More details will be provided in a separate issue.

Bidding & Auction service flow

The device can also pass these fields to the Bidding & Auction services. In addition, for DSPs, the Bidding & Auction service can pass the perBuyerSignals to the TKV. There are no parallelization concerns as the contextual ad auction completes before the Bidding & Auction flow starts.

davidae commented 3 weeks ago

I'm trying to understand the core problem/challenge this proposed feature addresses, and I'm not sure I can capture it.

> The new signals could enable more capabilities to happen within the TKV. A notable example is brand safety where the DSP TKV can perform some verification on the full publisher site URL to provide extra decision making input. We will describe more about this example in another feedback-seeking post.

I would guess this part is the most relevant for this?

> But these signal fields are only available to generateBid() or scoreAd(), not the key value service.

Is this a problem?

peiwenhu commented 2 weeks ago

Good point @davidae. I updated the doc. Please see the introduction section.

droundy commented 2 weeks ago

I'd be interested in knowing whether it would be possible to create within the TEE KV system a separate "contextual" processor that generates perBuyerSignals, so a DSP could skip the contextual auction and generate within the TKV a perBuyerSignals that is known not to use first-party input, which could therefore be included in the win report.

This seems like it could be a huge cost saver for smaller DSPs focusing on retargeting, which have interest groups on only a small fraction of browsers, so that they don't need to handle contextual requests for vastly more auctions than they actually participate in. In essence this would enable "pre-targeting" since they'd only have to handle auctions for browsers where they have an interest group.

MattMenke2 commented 2 weeks ago

> I'd be interested in knowing whether it would be possible to create within the TEE KV system a separate "contextual" processor that generates perBuyerSignals, so a DSP could skip the contextual auction and generate within the TKV a perBuyerSignals that is known not to use first-party input, which could therefore be included in the win report.

It's actually the other way around at the moment: the signals for the win report (that is, a report sent from reportWin()) can't carry too much information from the site where the interest group was joined (contextual signals from the joining page, or signals from within the IG itself). They are allowed to carry full information from the page that ran the auction, and there are any number of ways to pass that information to reportWin(). reportWin() gets auctionSignals, perBuyerSignals, directFromSellerSignals, and sellerSignals, all of which can contain full context about the page running the auction.

Hopefully this constraint can be relaxed once we remove event-level reporting, in favor of aggregated reporting, but that won't be happening any time soon, I believe.

peiwenhu commented 2 weeks ago

Matt IIUC I don't think the question breaks the constraint you mentioned though as it does suggest to "not to use first-party input".

MattMenke2 commented 2 weeks ago

> Matt IIUC I don't think the question breaks the constraint you mentioned though as it does suggest to "not to use first-party input".

The suggestion was for the field to be included in the win report. I assume this was talking about an event-level win report, which is also able to include contextual data.

droundy commented 2 weeks ago

> > Matt IIUC I don't think the question breaks the constraint you mentioned though as it does suggest to "not to use first-party input".
>
> The suggestion was for the field to be included in the win report. I assume this was talking about an event-level win report, which is also able to include contextual data.

I am talking about the event-level win report, but am not talking about first-party data.

The proposal that this issue relates to involves sending contextual data to the KV store. My ask was that there be a way to provide precisely that contextual data (without any first-party data, but with DSP-specific manipulation of that data, e.g. to apply some machine learning model to that contextual data and log the output) to the event report. We can (and do) already do this, but it requires that we participate in a contextual auction, which is an expensive waste of resources when most contextual auctions are for browsers where we have no interest group. If we could stop participating in contextual auctions our costs would be proportional to the number of auctions that we participate in, which would be a huge savings.

To summarize the summary: the idea is that if the KV store were to allow us to separate the "digest the contextual data and turn it into input to our bidder" process from the "look at all available data and bid" then the output of the former (non-first-party) code would be safe to log in the win report, which would be very beneficial for smaller DSPs focusing on retargeting.

MattMenke2 commented 2 weeks ago

Ah, so by "is known not to use first-party input" you mean first-party data relative to the origin the IG was joined on (or data stored in the IG about the user derived from that context), rather than information from the origin running the auction. That sounds reasonable.

fhoering commented 2 days ago

Thanks for envisaging this feature, which would unlock some interesting use cases, such as filtering on more information from the full publisher URL or doing supply-side optimization based on the sellers that are active in the current auction.

We think this feature would be mostly useful for on-device auctions, where it injects more signals into the server-side computation inside the KV server. For server-side auctions (B&A), the generateBid layer already has access to the IG and to perBuyerSignals, so those signals are already available server side. We are not sure it is necessary in that case to send more signals to the trusted KV server.

For on-device auctions it seems important not to break the parallelization feature, because that would slow down the auction flow too much. So perBuyerSignals should probably be one promise per buyer instead of one promise for all buyers, so that buyers that don’t want to wait for the contextual call to resolve perBuyerSignals are not slowed down. In our case the best option would be to inject values that are already available at auction time, such as the seller domain and the publisher URL.
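
The per-buyer promise idea can be sketched with asyncio as follows. The buyer origins, delays and helper names are hypothetical, and awaitables stand in for JavaScript promises; the point is only that each buyer's KV request goes out as soon as that buyer's own signals resolve.

```python
import asyncio


async def resolve_after(delay, value):
    """Stand-in for a per-buyer promise that resolves after some delay."""
    await asyncio.sleep(delay)
    return value


async def send_kv_request(buyer, signals, log):
    """Stand-in for the KV call; records the order requests go out in."""
    log.append(buyer)
    return {"buyer": buyer, "signals": signals}


async def request_when_ready(buyer, signals_awaitable, log):
    # Each buyer waits only for its own signals, not for every buyer's.
    signals = await signals_awaitable
    return await send_kv_request(buyer, signals, log)


async def run_auction(log):
    pending = {
        # Waits for the contextual call before its signals resolve.
        "https://dsp-slow.example": resolve_after(0.02, {"fromContextual": True}),
        # Uses only values already available at auction time.
        "https://dsp-fast.example": resolve_after(0, {"pageUrl": "https://publisher.example/a"}),
    }
    return await asyncio.gather(
        *(request_when_ready(b, s, log) for b, s in pending.items())
    )
```

In this sketch the fast buyer's request is sent first even though the slow buyer is listed first, because nothing forces it to wait on the other buyer's signals.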

Today the KV service call already looks expensive. One downside is that injecting contextual signals might mean the KV service call can no longer be shared across supply-side channels, because each seller might set up different signals. It could potentially behave like the maxTrustedBiddingSignalsURLLength feature for splitting up KV calls, for example by grouping all seller information into a single KV call instead of sending multiple calls. As it is not straightforward to know whether one call or several are needed, it is probably better to provide some configurability here, and maybe also to let the DSP control what it wants to receive.

With those new signals it also looks very important to be able to execute the inference layer in the KV service (see inference_overview.md in privacysandbox/protected-auction-services-docs), because one key feature that the KV service could run is ML inference.

Did you consider sending more user data to the KVS, such as the information from prevWins? Today the KVS handles advertiser user information, so it looks like one could also send other forms of user information, such as previous wins and clicks. This kind of information doesn't have the downside of being SSP-specific (unlike perBuyerSignals), which means one KVS call could be reused across several SSPs.