WICG / turtledove

TURTLEDOVE
https://wicg.github.io/turtledove/

Some detailed questions on FLEDGE design #207

Open LichDLC opened 3 years ago

LichDLC commented 3 years ago

During our internal discussion, a few open questions came up regarding the FLEDGE design.

  1. Will the generate-bid request be sent per interest group, or can we merge them into one request to the bid server? We are trying to understand the potential QPS of the bid server.

    Once the trusted bidding signals are fetched, each interest group's bidding function will run, inside a bidding worklet associated with the interest group owner's domain.

  2. Can the bid server add extra ads in its response? Say some ads are disqualified, or we don't have enough ads based on user interest: can the bid server inject more ads into the response?

  3. FLEDGE fragments the ad request. Is there any place describing which user/publisher signals are available in each request? For example, is device information or user location information available in:

    • Join interest group request
    • Daily update request
    • Bid server request

appascoe commented 3 years ago

1) In theory this may be possible, but not useful. At least as far as the bring-your-own-server model goes, any request to the server would have to pass a k-anonymity threshold; sending the full list of interest groups would likely be deanonymizing.

2) No. This was briefly discussed in https://github.com/WICG/turtledove/issues/198

3) Joining interest groups should happen on the advertiser's site, so that information would be available there. The daily update request probably shouldn't carry it: it could be used for fingerprinting, and it would allow privacy-invasive practices such as tracking a user's geographical movements. As for the server request, there's an issue for that: https://github.com/WICG/turtledove/issues/187

LichDLC commented 3 years ago

Thank you Andrew for your reply. I still have some questions.

Does the trusted server have any functionality beyond that of a key-value server, such as running logic inside it? I'm asking because of business rules. An advertiser may want their ads shown only to users/publishers that satisfy some condition: for example, location/age/gender/time targeting, publisher exclusion, etc. I'm concerned about the performance of running and transmitting all this data and logic in the browser for a large number of ads. Also, we need to return more ads, since some information (user or publisher) is missing at ad selection time.

I have seen similar questions, such as #14 and #96. Those threads are pretty old, so I'm wondering what the latest thinking on this is, or whether it is still open.

appascoe commented 3 years ago

1) In the BYOS stage, you own the server and can perform any computation you like. However, your response must only include JSON blobs for each of the keys used to query. Once it moves to a trusted third-party server, this may change. The explainer implies it's a simple key-value store. As far as I can recall, this has not been a major topic of conversation.
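As a rough illustration of the BYOS constraint described above, the sketch below runs arbitrary server-side logic but returns only one JSON value per queried key. The function name `buildBiddingSignalsResponse`, the in-memory `store`, and the response shape are assumptions made for this example, not the API's actual wire format.

```javascript
// Hypothetical in-memory data owned by the ad tech during the BYOS stage.
const store = new Map([
  ["campaign-42", { budgetRemaining: 120.5, paused: false }],
  ["campaign-77", { budgetRemaining: 0, paused: true }],
]);

// Build a response holding one JSON value per requested key and nothing else.
// Any computation may happen here, but the output stays keyed by the query.
function buildBiddingSignalsResponse(requestedKeys, kvStore) {
  const response = {};
  for (const key of requestedKeys) {
    // Unknown keys map to null instead of being replaced by unrelated data.
    response[key] = kvStore.has(key) ? kvStore.get(key) : null;
  }
  return response;
}

const response = buildBiddingSignalsResponse(["campaign-42", "unknown"], store);
console.log(JSON.stringify(response));
```

Keeping the response keyed strictly by the query is what prevents the server from injecting extra ads, which matches the "No" given earlier in the thread.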

2) The conditions you list can all be handled by the bidding.js logic in the interest group. Publisher information will be available in the contextual request and can be passed into the generate_bid() function. You can add more ads as part of the contextual response, but these will not employ any interest group data.
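A minimal sketch of how such business rules might live in the bidding logic follows. The argument list loosely follows the bidding function described in the explainer (written as generate_bid() in this thread); every field under `userBiddingSignals` and `perBuyerSignals`, and the shape of the `ads` entries, is hypothetical.

```javascript
// Sketch of on-device bidding logic that enforces advertiser business rules.
// All field names under userBiddingSignals and perBuyerSignals are hypothetical.
function generateBid(interestGroup, auctionSignals, perBuyerSignals, trustedBiddingSignals, browserSignals) {
  const rules = interestGroup.userBiddingSignals.rules;
  const publisher = browserSignals.topWindowHostname;

  // Publisher exclusion: a zero bid stands in for "do not participate" here.
  if (rules.excludedPublishers.includes(publisher)) {
    return { bid: 0 };
  }
  // Time targeting: the hour is assumed to arrive via the contextual response.
  const hour = perBuyerSignals.localHour;
  if (hour < rules.startHour || hour >= rules.endHour) {
    return { bid: 0 };
  }
  const ad = interestGroup.ads[0];
  return { ad: ad.metadata, bid: rules.baseBid, render: ad.renderUrl };
}

const sampleGroup = {
  ads: [{ renderUrl: "https://cdn.example/ad1.html", metadata: { campaign: "spring" } }],
  userBiddingSignals: {
    rules: { excludedPublishers: ["blocked.example"], startHour: 8, endHour: 22, baseBid: 1.5 },
  },
};
const result = generateBid(sampleGroup, {}, { localHour: 10 }, {}, { topWindowHostname: "news.example" });
console.log(result);
```

The rules travel with the interest group and the publisher context arrives at auction time, so no server round trip is needed to evaluate them in this sketch.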

thegreatfatzby commented 1 year ago

Hello fwends, sorry to resurrect something from so long ago but I'm trying to catch up and understand possible implementations.

@appascoe or others, I'm interested if you have any thoughts on @LichDLC 's first question, about being able to run a single generateBid servlet on all of the IGs available for a given domain. In particular, since Privacy Sandbox as a whole has evolved quite a lot since this question, and there are now both On Device and Bidding and Auction Services executions that would run in a TEE, I'd like to understand thoughts on running a single getBid against all IGs for the advertiser in those trusted contexts, and if there is some creative problem solving to be done there to allow for less auction resources and capabilities but to still preserve privacy thresholds.

Thanks!

michaelkleber commented 1 year ago

Hi Isaac: The Chrome engineers working on latency and resource usage have done a great job of reusing the same worklet for multiple Interest Groups with the same owner, based on a mechanism for initializing the worklet context once and then reusing it from that initialized state multiple times. See https://github.com/WICG/turtledove/issues/304 for more on this topic.

But that's about resource usage, not about sharing data. We're still remaining as conservative as feasible when it comes to Protected Audience data sharing: the bidding for a particular ad is based only on (1) information about the user's activity on the site where they were added to the Interest Group, and (2) information from the site the ad is going to appear on. Combining IGs joined across different sites would change that model.

thegreatfatzby commented 1 year ago

Understood on the resource usage part.

W/r/t the capability part: (1) and (2) make sense and do seem right to me, but it would be quite helpful from a capability perspective for the bidding function to be able to access multiple IGs from the same site (I had used the word domain), or to be more specific the set of IGs registered on that site. I'm not including any cross-brand, 3rd party data provider type cases here, or even interesting cross-device-same-"site" stuff (although I am interested in that piece eventually).

I'll put some examples below, but a) based on the docs and my very new forays into Chromium code that isn't doable now but b) I'd think that would still be within the boundaries privacy wise for (1) and (2). I guess I'd have to think if there's some interesting consequence on reporting or other functionality, but the Marginal Reduction to Ad Tech Disruption would be quite a lot, if I'm understanding correctly.

Example: I only ever use my one Chrome browser and only ever have 3 tabs open: nytimes, Nike, and Levi's. I am now on nytimes reading an article about Meta being fined 1.3 billion dollars for GDPR violations, a gripping tear-jerker, and nytimes uses some ad tech to run an auction to show me an ad. Given it's me, Nike and Levi's would each like to show me an ad, and have previously registered the following interest groups:

    • Nike: Tennis, FedererForever, LikesTieHeadBands
    • Levi's: JeansJacket, NotTooBrightYellow

When the PA phase of the auction kicks off, it seems like the bidding would go something like:

  1. nikeBidFunction.bid(Tennis)
  2. nikeBidFunction.bid(FedererForever)
  3. nikeBidFunction.bid(LikesTieHeadBands)
  4. LevisBidFunction.bid(JeansJacket)
  5. LevisBidFunction.bid(NotTooBrightYellow)

What I'd like to see is:

  1. nikeBidFunction.bid(Tennis, FedererForever, LikesTieHeadBands)
  2. LevisBidFunction.bid(JeansJacket, NotTooBrightYellow)

I am not asking for:

  1. nikeBidFunction.bid(Tennis, FedererForever, LikesTieHeadBands, JeansJacket, NotTooBrightYellow)
  2. LevisBidFunction.bid(Tennis, FedererForever, LikesTieHeadBands, JeansJacket, NotTooBrightYellow)

michaelkleber commented 1 year ago

Got it! Sorry that I initially interpreted your question more broadly than you intended.

We've had a bunch of discussion about this use case, and several ad techs have indicated that their preferred way to use the system would indeed be to create a single IG per user per IG-join-site. So in your example, I might be placed into one IG while visiting the Nike domain: IG "MichaelKleber-Nike-CustomerNumber123456789", with [Tennis, FedererForever, LikesTieHeadBands] all stored in the userBiddingSignals of the IG in whatever way is most useful. When I look at a new product on Nike, instead of adding me to a new IG, the ad tech would call joinAdInterestGroup() for "MichaelKleber-Nike-CustomerNumber123456789" again, overwriting the old signals with a new expanded set.

This means the ad tech keeps track of the total per-site state for a user on its own. Partitioned cookies are a natural part of making that work; see CHIPS for infrastructure dedicated to that use case.
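The one-IG-per-site flow described above might be sketched as follows. The helper names, the owner and URL placeholders, the group name format, and the 30-day lifetime are all assumptions for illustration; the interest group field names follow the general shape used in the explainer and may differ from the current spec.

```javascript
// Union of previously stored and newly observed interests, de-duplicated.
// The ad tech is assumed to keep `previous` itself, for example via a partitioned cookie.
function mergeInterests(previous, added) {
  return [...new Set([...previous, ...added])];
}

// Build the single per-site interest group; owner, URL, and name format are placeholders.
function buildSiteInterestGroup(customerId, interests) {
  return {
    owner: "https://dsp.example",
    name: `nike-customer-${customerId}`,
    biddingLogicUrl: "https://dsp.example/bidding.js",
    userBiddingSignals: { interests },
  };
}

const previous = ["Tennis", "FedererForever"];
const group = buildSiteInterestGroup(
  "123456789",
  mergeInterests(previous, ["LikesTieHeadBands", "Tennis"])
);

// Re-joining under the same owner and name overwrites the old signals with the expanded set.
// Guarded so the sketch also runs where the API is unavailable; real code would handle errors.
if (typeof navigator !== "undefined" && typeof navigator.joinAdInterestGroup === "function") {
  navigator.joinAdInterestGroup(group, 30 * 24 * 60 * 60); // lifetime in seconds (assumed value)
}
console.log(group.userBiddingSignals.interests);
```

Because the group name stays fixed per customer and site, each new product view expands `userBiddingSignals` rather than creating another interest group.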

We've modified the Protected Audience API in several ways to support that one-IG-per-site goal. If you find other ways in which the API design is preventing you from doing this, please point them out to us. As you said, it should be entirely possible to make this work without compromising the underlying privacy model of the API.

[edit:] The canonical Issue for talking about this is #361

thegreatfatzby commented 1 year ago

Got it. I read through #361 and am still wrapping my head around the totality of stuff here. The creative (ad) registration and rendering is one piece I'm still working on, and I'll probably ask questions about that later as well :)

So it sounds like the recommendation is that in my example the individual items (Tennis, FedererForever, etc.) would be thought of as atomic "Interests" from the DSP's perspective, and then the bidding function receives one "Interest Group" that is the union of their "Interests" registered previously on that site, and the bidding function at that point can do as it needs with taking the Interest Group and matching interests to bidding logic?

thegreatfatzby commented 1 year ago

I'll ask separately as it's much different, but I'm curious what discussion there has been about the cross-device-same-"site" case I referenced.

If I'm logged into the same type of user-agent on different devices, perhaps Chrome for instance, and I visit Nike on both but for whatever reason am not logged in on Nike.com, will the "Nike bidding function" be able to access my Interest Groups from both Nike.com visits in either location?

I ask b/c while this would not be doable/tolerable/etc. in "ad tech classic", I'm not sure it violates (1) or (2). (For the purposes of this question I'm ignoring how this would happen and only asking how it relates to the privacy thinking overall.)

michaelkleber commented 1 year ago

    So it sounds like the recommendation is that in my example the individual items (Tennis, FedererForever, etc.) would be thought of as atomic "Interests" from the DSP's perspective, and then the bidding function receives one "Interest Group" that is the union of their "Interests" registered previously on that site, and the bidding function at that point can do as it needs with taking the Interest Group and matching interests to bidding logic?

This isn't necessarily "the recommendation" — there are lots of ways to use the API, and we hope it is flexible enough to support many flows. But what you describe is indeed an approach that seems to work well in practice, and which we want to support.
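One way the approach just described could look inside the bidding logic is sketched below: the function receives a single group whose signals hold the union of interests and picks the best-paying match. The `adsByInterest` catalog, the `pickBestBid` helper, and all values are hypothetical.

```javascript
// Hypothetical mapping from an atomic interest to a creative and a bid value.
const adsByInterest = {
  Tennis: { renderUrl: "https://cdn.example/tennis.html", bid: 1.2 },
  FedererForever: { renderUrl: "https://cdn.example/federer.html", bid: 2.5 },
  LikesTieHeadBands: { renderUrl: "https://cdn.example/headband.html", bid: 0.8 },
};

// Pick the highest-value ad among the interests stored in the single per-site group.
function pickBestBid(interestGroup, catalog) {
  let best = null;
  for (const interest of interestGroup.userBiddingSignals.interests) {
    const candidate = catalog[interest];
    if (candidate && (best === null || candidate.bid > best.bid)) {
      best = { interest, ...candidate };
    }
  }
  // A zero bid stands in for "no matching interest" in this sketch.
  if (best === null) {
    return { bid: 0 };
  }
  return { bid: best.bid, render: best.renderUrl, ad: { interest: best.interest } };
}

const nikeGroup = {
  userBiddingSignals: { interests: ["Tennis", "FedererForever", "LikesTieHeadBands"] },
};
console.log(pickBestBid(nikeGroup, adsByInterest));
```

All of the data involved was registered on one site, so this stays within points (1) and (2) from earlier in the thread.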

    If I'm logged into the same type of user-agent on different devices, perhaps Chrome for instance, and I visit Nike on both but for whatever reason am not logged in on Nike.com, will the "Nike bidding function" be able to access my Interest Groups from both Nike.com visits in either location?

We have not worked on the cross-device question for this API. I can think of a bunch of different points of view here, so it seems like an interesting topic for future discussion.

thegreatfatzby commented 1 year ago

"The" vs "a", good point, thanks.

Yes, I can also think of many points of view :) .

What is the right way to proceed here? Should I open an issue for specific discussion, or try to get it on the docket for the group call one week?

michaelkleber commented 1 year ago

Yup! Open a new issue if you want a clean place to describe the gap you're hoping to fill, and then add your topic to the agenda suggestion list for the every-other-week phone call (call logistics and doc link are in #88).