WICG / turtledove

TURTLEDOVE
https://wicg.github.io/turtledove/
Other
529 stars 232 forks source link

Multi-Tag Auction: R & D #846

Open thegreatfatzby opened 1 year ago

thegreatfatzby commented 1 year ago

Multi Slot

So I've been chewing and wondering about accommodating multi-tag auctions, meaning in which a "single auction", I suppose more accurately a single call, results in multiple ads appearing on the page, in a coordinated way. Many, most even, auctions occur this way. It has numerous benefits:

In the current Fledge design we have to run N independent auctions for N slots, and can't really accomplish the features mentioned above.

Let's Chew

I'd like to tease out the privacy and utility implications of allowing a multi-slot auction to occur in a Fledge'y world.

Simplifying Assumptions

So to simplify I thought I'd start with some wrong assumptions:

Privacy Tools

Questions to Ask

I'm trying to figure out how you assess these things, some questions I've thought about include:

  1. Does this have a current analogue?
  2. How many Domains are going into a box?
  3. How many Domains can be included in what comes out of a box?
  4. Are "safe" (known k-anon) and "unsafe" (unknown k-ness) values ever in the same box?

Ideas to Assess

Let...

N be the number of renderUrls you want to show. B be the set of all bids. U = Utility P = Privacy Current Fledge Utility = u Current Fledge Privacy = p

So not all of these are useful, but I want to tease out privacy thinking.

1. Uncoordinated TopN: Multiple K-Anon renderUrls Returned, scoreAd is still single domain

So in the first scenario, rather than returning just the top 1 by score, you'd return the top N, perhaps with reasonable limits similar to what is done with ad components, say 20 and that you must declare and get the same N back. Each of the top N must still pass K, bid values still subject to entropy reduction, etc. If N winning-and-K renderUrls can't be found, return default creative indicated somehow (auctionConfig something something).

  1. This seems like it would be similar to just running N separate auctions.
  2. Single Domain
  3. Single Domain
  4. Only safe.

U = 1.01u. P = 0.99p?

2. Coordinated Mixed-Safe/Unsafe TopN: ^ but scoreAds gets list of all bids

So let's go a bit further, let's say we replaced scoreAd with a naive tweak. scoreAds(B, N) --> L: replace scoreAd that takes each bid with scoreAds which gets the list of all bids, and then returns the top N renderUrls exactly as passed, which must each pass K-Anon, and be returned opaquely (FFConfig or opaque URN).

  1. This does change the model and so I don't think I can compare to anything now.
  2. Multiple domains.
  3. Single domain.
  4. Mixed safe/unsafe

The renderUrls are still passing the same k tests, but because you are mixing safe and unsafe inputs mayyybe you could do something clever with the calculations, so that the "safe" outputs are actually revealing something about the unsafe set?

U = 4u, P = 0.7p?

3. Coordinated Safe TopN: ^ but scoreAds gets list of only K-Anon Bids

So same as above but inputs to the opaque function would be "safe", so even if you figure out a way to have the output say something clever about the rest of the input, it can only be saying it about things that are safe. This would be tough because you'd have to have N already hit k to even assess.

U = 1.2u. P = 0.9p?

4. "Selection Sort" Safe TopN: scoreAd called iteratively to return top 1...N, and then that feeds next round

Given the full list of bids run the following process:

  1. Start with a list L = [] that will store the top N winners and B = [...] all the bids.
  2. Run scoreAd(b, L) for each b in B.
  3. The highest scoring b that passes k-anon is added to L and removed from B.
  4. If sizeof(L) < N and sizeof(B) > 0, back to 2.
  5. Fill L as needed.

This is inefficient but would let you only ever assess using safe inputs (i.e. no mix of safe and unsafe) but with less k-impact upfront.

  1. New
  2. Multiple domains.
  3. Single domain.
  4. Safe only

U = 4u. P = 0.9P? (Cost = 2c).

5. Randomize to 2 Groups, Coordinated Mixed Top(N/2) From Each

Split B into 2 groups B1 and B2, run scoreAds(B1, n/2) and scoreAds(B2, n/2), merge, fill as needed.

  1. New
  2. Multiple domains.
  3. Single domain.
  4. Mixed but with non-determinism so hard to reliably be clever?

U = 3u. P = 0.8p?

6. Add Noise to B, Coordinated Mixed TopN

Add (0.1 * sizeof(B)) (so 10%) noise fake bids to B, system keeps track. Run coordinated against everything, any noised (fake) bids returned are converted to default creative.

  1. New
  2. Multiple domains.
  3. Single domain.
  4. Mixed but with non-determinism so hard to reliably be clever?

"Conclusion"

So I'm interested in your thoughts on any/all of these, but I guess it would come down to a few things:

thegreatfatzby commented 1 year ago

Still would like to talk with you about some of the options listed as my brain is still not wrapped around this, but will just reference #211 and #98 here, in particular curious if this direction was ever explored to at least limit to the one bit rather numAuctions bits.

thegreatfatzby commented 1 year ago

Heh, @michaelkleber based on the conversation on #825 I wonder if you'd agree with this statement: if the 1-bit leak was solved, and rendering can't release any bits, multi tag auctions would be fine, because at that point you're leaking Nx0 bits, which is 0, rather than Nx1 bits, which is N, and N could be large.