FLEDGE : Answering contextual requests with a generate_bid() function

MarieScibids commented 3 years ago

Hello,

Following FLEDGE call discussions on frequency capping and A/B testing as part of contextual requests, we wanted to come back with a quick proposal on that subject.

We suggest that ad networks respond to contextual requests with a generate_bid() function, along with a list of pre-selected eligible ads. As in FLEDGE, the function would take as input a browser_signals object containing information the browser knows, such as prev_wins to allow on-device frequency capping, or information needed to perform A/B testing. The output of generate_bid() contains the final bid along with the associated ad. We know that something along these lines was already proposed, for instance, in TERN.
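
As a rough illustration of the idea above, the sketch below shows how such a function could apply an on-device frequency cap. It is written in Python for readability, whereas FLEDGE bidding functions are JavaScript. The `Ad` fields, the `max_impressions` cap, and the shape of `prev_wins` (a flat list of campaign ids) are assumptions made for this example, not part of any specification.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    campaign_id: str
    render_url: str
    base_bid: float
    max_impressions: int  # hypothetical per-campaign frequency cap


def generate_bid(eligible_ads, browser_signals):
    """Return (bid, ad) for the best eligible ad still under its cap."""
    # Assumed shape: prev_wins lists the campaign ids already shown to the user.
    prev_wins = browser_signals.get("prev_wins", [])
    best_bid, best_ad = 0.0, None
    for ad in eligible_ads:
        if prev_wins.count(ad.campaign_id) >= ad.max_impressions:
            continue  # cap reached: this ad does not bid
        if ad.base_bid > best_bid:
            best_bid, best_ad = ad.base_bid, ad
    return best_bid, best_ad


# Example inputs (hypothetical campaigns and URLs).
ads = [
    Ad("shoes", "https://dsp.example/shoes", 2.0, 1),
    Ad("hats", "https://dsp.example/hats", 1.0, 3),
]
bid, ad = generate_bid(ads, {"prev_wins": ["shoes"]})
```

Because the "shoes" campaign has already reached its cap of one impression, the function falls back to the lower-bidding "hats" ad, and the decision never leaves the device.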

Theoretically, a scaling issue could arise: under this procedure the DSP cannot select the “highest bidder” a priori and has to send multiple ads (among the thousands potentially eligible for this request). However, we could easily imagine heuristics on the DSP side that pre-select ads in order to comply with a maximum number of ads to send. A simplistic one would be to select the ads of the top x campaigns, assuming generate_bid() returns its maximum, plus a “fallback” (ad, bid) pair from a campaign whose bid does not depend on generate_bid().
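
One possible reading of that simplistic heuristic is sketched below in Python. The candidate fields (`ad`, `max_bid`, `uses_browser_signals`) and the choice to reserve one slot for the fallback are assumptions made for illustration only.

```python
def preselect_ads(candidates, max_ads):
    """Pick the top campaigns by maximum possible bid, plus one fallback.

    candidates: list of dicts with keys 'ad', 'max_bid' and
    'uses_browser_signals' (all hypothetical field names).
    Returns (selected, fallback); fallback may be None.
    """
    ranked = sorted(candidates, key=lambda c: c["max_bid"], reverse=True)
    dynamic = [c for c in ranked if c["uses_browser_signals"]]
    static = [c for c in ranked if not c["uses_browser_signals"]]
    # The fallback is the best campaign whose bid ignores generate_bid().
    fallback = static[0] if static else None
    # Reserve one of the max_ads slots for the fallback when there is one.
    slots = max_ads - 1 if fallback else max_ads
    selected = dynamic[:max(slots, 0)]
    return selected, fallback


candidates = [
    {"ad": "a", "max_bid": 3.0, "uses_browser_signals": True},
    {"ad": "b", "max_bid": 5.0, "uses_browser_signals": True},
    {"ad": "c", "max_bid": 1.0, "uses_browser_signals": False},
    {"ad": "d", "max_bid": 4.0, "uses_browser_signals": True},
]
selected, fallback = preselect_ads(candidates, max_ads=3)
```

With a limit of three ads, this keeps the two campaigns with the highest possible bids and the single campaign whose bid is fixed in advance.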

By doing that, advertisers would retain some control over frequency/recency capping and optimization, and could still perform A/B testing without any risk to the user's privacy, as everything is computed on-device.

Do you see any limitation, or any point we may have missed, in this proposal?

michaelkleber commented 3 years ago

As we discussed during the 2/17 meeting, the key problem here is that if we allow such a generate_bid() function to get access to the on-browser cross-site signals you're asking for, then we cannot allow the publisher page to see the output of the function.

In particular, in FLEDGE we leak the one bit of information of whether or not any ad won the auction. Even that is too much information to leak if the surrounding page gets to create an arbitrary function that determines that bit!

So your proposal seems entirely reasonable as long as the result always renders inside a Fenced Frame. That means that even if the frequency-capped ad decides not to bid because of the cap, and the contextually-targeted fallback ad without any cap ends up the winner, that contextual ad would also be forced to render with all the restrictions imposed by FLEDGE.

Rendering in this special environment — which is going to ultimately require only aggregate reporting — is a new challenge that not all ads will want to undertake. In FLEDGE, only the ads that want to use this particular targeting capability need to do the extra work. Under your proposal, we would need every ad in the auction to be willing to live in this aggregate-reporting world.

lcevans commented 3 years ago

(Speaking as a DSP) This approach appeals to us too (ability to use browser_signals for contextual bids)

But it raises another point:

Currently generate_bid is defined via the Interest Group. Since contextual targeting bids wouldn't be tied to any interest group, there would need to be some other way for buyers to pass the generate_bid_contextual function to the browser.

michaelkleber commented 3 years ago

Yes, absolutely — this would require some new way for a creative-rendering-url in a contextual ad response to be bundled with an on-device bidding function.

What's there now is a runAdAuction() argument with an additional_bids parameter, which could be a way to hook up contextual responses with the rest of the on-device bidding. But there are a bunch of design details that would need to be worked out, as well as the privacy ones from my previous comment.

RLemonnierScibids commented 3 years ago

If I understand correctly, the constraints induced by the Fenced Frame and aggregate reporting are not going to prevent the core ad tech use cases (like getting granular-enough reporting to perform ML optimization), since all interest-based advertising is going to run under these constraints anyway.

Thus I think several (a majority of?) DSPs/advertisers would find it much more of a problem to deprive prospecting campaigns of key use cases like frequency capping or A/B testing. As an example, large consumer-packaged goods advertisers routinely use frequency as a KPI of their branding campaigns, and their contribution to the overall programmatic spend is massive.

This additional_bids parameter is very interesting. Could we for instance imagine that:

A DSP could choose to answer a contextual ad request with an object containing a generate_bid() function depending on the user-based variables, and possibly a report_win() function as well?
The additional_bids vector would become an additional_bidding_functions vector. These functions would be evaluated at the time of the auction and compete against the interest-group JS bidding functions.

Thus, no data would be sent to DSPs.

We would therefore have:

the contextual auction, where DSPs can answer either with a single bid or with a generate_bid() function, in which case their creative would render inside the Fenced Frame if it wins.
the on-device auction, where we would have all the bids from the allowed interest groups' JS bidding functions and all the bids from the contextual JS bidding functions fetched from the “additional_bidding_functions”.
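
To make the two-stage flow above concrete, here is a minimal Python sketch of the on-device step. The function name `run_on_device_auction`, the `(bid, ad)` return shape of the bidding functions, and the rule that an on-device bid must beat the best plain contextual bid are all assumptions made for this illustration; in FLEDGE the seller's scoring logic would make that decision.

```python
def run_on_device_auction(interest_group_fns, additional_bidding_fns,
                          browser_signals, top_contextual_bid):
    """Return (bid, ad, source) for the on-device winner, or None."""
    bids = []
    # Bids from the interest groups' bidding functions.
    for fn in interest_group_fns:
        bid, ad = fn(browser_signals)
        bids.append((bid, ad, "interest_group"))
    # Bids from the contextual functions sent by DSPs.
    for fn in additional_bidding_fns:
        bid, ad = fn(browser_signals)
        bids.append((bid, ad, "contextual_function"))
    # Assumption: only bids above the best plain contextual bid compete.
    floor = max(top_contextual_bid, 0.0)
    contenders = [b for b in bids if b[0] > floor]
    if not contenders:
        return None  # the plain contextual bid wins instead
    return max(contenders, key=lambda b: b[0])


# Hypothetical bidding functions: one interest group, one capped contextual ad.
ig_fns = [lambda signals: (1.5, "ig_ad")]
ctx_fns = [lambda signals: (2.0, "ctx_ad") if not signals["prev_wins"]
           else (0.0, None)]
winner = run_on_device_auction(ig_fns, ctx_fns, {"prev_wins": []}, 1.0)
```

Whatever wins here would have to render inside the Fenced Frame, as discussed earlier in the thread.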

Do you see any privacy issue with this proposal?

michaelkleber commented 3 years ago

@RLemonnierScibids This does seem like a direction that FLEDGE could evolve in. But as I said before, the trade-off in ease of use for buyers who don't want the on-device bid adjustment features seems quite real to me.

RLemonnierScibids commented 3 years ago

Thanks for the feedback!

Just to be clear: our proposal is that for each bid request, the buyer will choose:

either to participate in the contextual ad auction as in the current state of the proposal, by submitting only a bid,
or to participate by submitting a generate_bid() function and the information needed for the on-device auction and Fenced Frame rendering.

So I am not sure I see the trade-off here, since this would basically just give buyers a second way to participate in the contextual auction. The buyers you mention would simply never use this new possibility.

Am I missing something?

michaelkleber commented 3 years ago

As I mentioned back in https://github.com/WICG/turtledove/issues/116#issuecomment-798991125, the key trade-off here is that once it's possible for a contextually-targeted ad to access on-device information (e.g. in a generate_bid() call), we need the winner of the on-device auction to always render inside a Fenced Frame — even if the winning ad is a different contextually-targeted ad that didn't use any on-device bidding information.

RLemonnierScibids commented 3 years ago

Ok thanks for clarifying again.

We were considering this from the buyer’s seat; the buyer would not have access to this additional bit of information, since a DSP would have to choose between the two ways of participating in the contextual auction.

Regarding the publisher, if I understand correctly, your concern is that it would have access:

to the contextual information of the auction
to the highest contextual bid
to the generate_bid_contextual() functions, which could depend on user-based variables.

Taking an extreme example of a single generate_bid_contextual() function

def generate_bid_contextual([...]):
    if user in interest_group_123:
        return 2
    else:
        return 0

and a highest contextual bid of 1, the publisher could learn that the user is in interest_group_123 or not through its observation of whether the on-device auction produces a winner or not. Is that correct?
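
The scenario just described can be simulated in a few lines of Python. This is only a sketch: modelling the user as a set of interest group names, and deciding the winner by comparing the bid with the highest contextual bid, are assumptions made for the example.

```python
HIGHEST_CONTEXTUAL_BID = 1


def generate_bid_contextual(user_interest_groups):
    # Bid 2 for members of interest_group_123, 0 for everyone else.
    return 2 if "interest_group_123" in user_interest_groups else 0


def on_device_auction_has_winner(user_interest_groups):
    # Assumption: the on-device auction has a winner only when the bid
    # beats the highest contextual bid.
    bid = generate_bid_contextual(user_interest_groups)
    return bid > HIGHEST_CONTEXTUAL_BID


def publisher_inference(has_winner):
    # The publisher sees one bit (winner or not) and reads it as membership.
    return has_winner


member = publisher_inference(
    on_device_auction_has_winner({"interest_group_123"}))
non_member = publisher_inference(
    on_device_auction_has_winner({"some_other_group"}))
```

In both cases the publisher's guess matches the user's actual membership, even though it never sees the interest groups themselves.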

If that’s the case, we think this issue could be alleviated, since the publisher doesn't need to see the generate_bid_contextual() functions in the clear at all: it is basically just passing them to the browser, which should be the entity that interprets them. For instance, could we imagine the following encryption architecture:

at the time of the auction, a (private_key, public_key) pair is generated
the public key is transmitted to each DSP so that it can encrypt its generate_bid_contextual() function
the browser decrypts the generate_bid_contextual() functions during the on-device auction

RLemonnierScibids commented 3 years ago

@michaelkleber Since we ran out of time before reaching this agenda item in the call list, do you think you would have time to answer our last comment before the next call in 2 weeks? This feature still seems a very important factor in whether DSPs will be able to implement acquisition campaigns or not. Otherwise, I would love the opportunity to put this item at the top of the list in two weeks :)

michaelkleber commented 3 years ago

Hello @RLemonnierScibids, sorry that we didn't get to this during the call.

I don't think your idea of an encrypted contextual bidding function helps here. It would indeed provide a way for the DSP to modify its bid without the publisher site knowing that it did so, if the DSP wanted to hide that information. But if the DSP wanted to share that information, then offering an encrypted channel wouldn't particularly help; surely the DSP could communicate its intended logic to the publisher in some other way.

In essence, one FLEDGE privacy goal is that the publisher not learn interest group memberships even if the IG owner would be willing to share.

RLemonnierScibids commented 3 years ago

Hi @michaelkleber, thanks for clarifying!

We are currently studying potential solutions, but first we would like to get a precise understanding of exactly which case you are trying to avoid.

Would you be able to give details about your assessment “in FLEDGE we leak the one bit of information of whether or not any ad won the auction. Even that is too much information to leak if the surrounding page gets to create an arbitrary function that determines that bit”, and explain how this setting would leak more information than the current setup?

At the moment we consider the following:

A DSP, DSP_1, whose bidding function would be:

def generate_bid_contextual(website, frequency):
    # Bid only on cnn.com, and only for users who have not seen the ad yet.
    if website == "cnn.com" and frequency == 0:
        return 1
    else:
        return 0

An SSP whose score_ad function would be:

def score_ad(dsp, bid):
    if dsp == "DSP_1" and bid > 0:
        return "DSP_1"  # DSP_1 wins
    else:
        return None  # no winner for the on-device auction (contextual wins)

Assuming the DSP, SSP and publisher collude with each other, each time the on-device auction produces a winner the publisher will know:

the context c
the user has a frequency of 0

thus getting a bit of info on past behaviors in addition to the current context of the webpage.
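
A small Python sketch of this colluding setup is given below, to show which cases the publisher can tell apart. The helper `publisher_sees_winner` and the sample inputs are hypothetical; the two other functions restate the bidding and scoring logic above in runnable form.

```python
def generate_bid_contextual(website, frequency):
    # DSP_1 bids only on cnn.com for users with no previous impression.
    return 1 if website == "cnn.com" and frequency == 0 else 0


def score_ad(dsp, bid):
    # The colluding SSP lets DSP_1 win whenever it bids at all.
    return dsp if dsp == "DSP_1" and bid > 0 else None


def publisher_sees_winner(website, frequency):
    # Hypothetical helper: the only thing the publisher observes.
    bid = generate_bid_contextual(website, frequency)
    return score_ad("DSP_1", bid) is not None


# Hypothetical visits: (website, number of previous impressions).
visits = [("cnn.com", 0), ("cnn.com", 3), ("example.org", 0)]
outcomes = {visit: publisher_sees_winner(*visit) for visit in visits}
```

On cnn.com the observed outcome differs between a user with frequency 0 and one with frequency 3, which is exactly the extra bit about past behavior mentioned above.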

Now I am not sure I understand how this is different from what could happen in the current state of the proposal if the DSP defines all its interest group generate_bid() functions exactly as the generate_bid_contextual() function above, and the SSP defines its score_ad() function as above.

In both cases, it seems the attack would be possible only in the unrealistic case of colluding actors ready to waste vast amounts of money to get one additional bit of information on some users.

From our point of view, this raises the question of what the privacy metric should be: should we reject an important use case for the industry if there is a theoretical possibility that colluding actors might acquire a very expensive bit of information on a given user?

Or could we rather compute a “cost of adversary success” metric (in the philosophy of “Time until adversary’s success” described here) to see whether these attacks would make economic sense or not?

RLemonnierScibids commented 3 years ago

Thanks a lot @michaelkleber for our discussion on the topic on Wednesday!

If I summarize:

we agree that the privacy attack outlined above is theoretically possible even in the current FLEDGE proposal, but it would require collusion between multiple actors plus suspicious behavior from the SSP (not to mention the very high cost in lost revenue opportunities), which the browser could probably detect with the right procedures
you mentioned this as less problematic than the fact that “the surrounding page could learn an arbitrary bit about the user” in the general case of a generate_bid_contextual() function.

The last point is where I still need clarification, since I am not able to see which other privacy attack you are referring to. Could you explain for instance how a publisher would learn “the 7th bit of a given user_id” other than with the procedure explained above?

RLemonnierScibids commented 3 years ago

Hi @michaelkleber! I couldn't attend the last FLEDGE call, but I am still very interested in discussing this topic.

If you could formulate the privacy attack that you think would be possible with generate_bid_contextual() but not with the current generate_bid() setup for interest groups, we would be keen to work out which limitations could keep the level of risk the same between the two options.

Thanks a lot!
