WICG / turtledove

TURTLEDOVE
https://wicg.github.io/turtledove/
Other
538 stars 237 forks source link

Support more than one ad candidate and bid from generateBid #595

Open gianni-99 opened 1 year ago

gianni-99 commented 1 year ago

Introduction

GenerateBid is set to return a single candidate ad and bid for each IG. With the introduction B&A serving, k-anonymity checks, this limitation could become problematic. We propose to return more than one candidate ad per IG from GenerateBid and have those candidates scored in scoreAd.

Limitations of current approach

In the current api spec, generateBid returns a single candidate per IG while the scoring of the other ads in the IG is discarded (link). The information discarded could be instead used to optimize some of the follow up computation in FLEDGE.

1: K-anonymity - on device

The k-anonymity enforcement prescribes that if a non k-anonymous ad wins the auction, we need to run a second auction with only k anonymous ads. By returning more than one ad per generate bid, the k-anonymity second auction could skip the generate bid part and only keep those ads that are k-anonymous. Related concerns have been raised in github issue 592, and this proposal addresses the issue of “auction rewind” for k-anon.

2: K-anonymity - B&A

The same applies to the B&A implementation. By allowing more than one render url to be returned by the generateBid, it is possible to avoid repeating computation to get a k anonymous ad.

3: Coverage loss

Limiting 1 ad per IG to participate in the auction, we are penalizing IGs with a large number of ads vs smaller IGs. The scenario could be a user with 2 IGs: one large (call it IG1) and 1 small IG (IG2). After generateBid, the AD1 and AD2 respectively go to SSP and AD1 is rejected. AD2 serves. It is likely that in this case IG1 could have a few ads in “reserve” that are more valuable than AD2 but never get considerate for auction. This mechanism could steer DSPs towards IG proliferation to get more ads in the auction.

Proposed changes

1: Return 2+ (or all) ad candidates per IG

We propose for generateBid to return a priority-ordered list of ads to be considered for the auction, potentially returning all ads it validated and scored. On device, each of the ads will have a known k-anon status while on B&A the k-anon status needs to be queried.

A subset of these ads then run in SSP scoreAd. The first ad and the first k-anonymous ad in the list need to be passed to scoreAd per IG. If the first ad in the list is k-anonymous then only one ad needs to be passed to scoreAd to reproduce the current k-anonymity auctions. However, we propose passing additional ads to scoreAd to address the coverage loss from large IGs.

2: Establish a cap for ads

It is possible that the total number of ads generated per owner is too large to be handled by SSP/k-anon servers so the runAdAuction could impose some restrictions on the number of ads that can participate in the server side auction.

One possible solution is to have a cap for each IG owner on the sorted list of ads (by bid). As a result the most valuable ads will be considered in the SSP score ad.

This limit could be specified in auctionConfig.

3: Auction rewind

If a non k-anon ad wins the auction, it is a matter of choosing the first ad in the sell side auction that is k-anon as a winner.

Summary of asks:

  1. Make generate bid return more than one ad per IG
  2. Run more than one ad per IG in SSP score ad
  3. Reuse the already computed ads to enforce k-anonymity
thegreatfatzby commented 1 year ago

I would +1 the overall ask here, I'd need to re-read a few times to comment deeply on the specifics.

With some of the initial thoughts we're discussing w/r/t how to handle bidding, ranking, and syncing of buy/sell/deal objects all together, a single advertiser having the bid function for their site and IG be able to return multiple bids with creatives would be very valuable. In our current SSP/DSP interactions I don't think it's uncommon for a buyer to return multiple bids for ranking/evaluation by the SSP. Being able to encapsulate the bidding logic for an advertiser in one invocation of a generic bidding function, return 0-or-more-bids, and then have some equivalent of our SSP code do the ranking, is where our first thoughts have gone.

EDIT: It would also allow for more API usage flexibility, and (from the meeting just to put on paper) some limit, either browser set or set by the SSP per buyer/advertiser/etc could prevent sprawl.

thegreatfatzby commented 1 year ago

A note here for discussion: with the general trend towards putting everything into a single IG per (owner, domain), or at least minimizing, one interesting side effect is that if we continue to enforce one bid per call to generateBid, it can hide cases where the second highest bid might have come from the same bidding function, or more generally from the same buyer. I do not know how common this is but we do see multiple bids from the same advertiser in our landscape logs.

morlovich commented 9 months ago

I am unclear as to how this would interact with component ads --- right now the re-run also reduces those... @michaelkleber Maybe you have some suggestions?

michaelkleber commented 9 months ago

Ow, that is hairy. I do not have any good ideas for a new way to handle component k-anon (without an exponentially large number of bids).

If anyone can think of a better answer than the current one (rerunning generateBid() with only the k-anon components), please suggest something.

(Just to be clear, we cannot let generateBid() know which components are over the k-anon threshold and which are not: If we did that, then an advertiser could just bid infinity dollars with an ad using a bunch of not-yet-k-anon components, safe in the knowledge that the ad would never show.)

morlovich commented 9 months ago

My initial thought was to say you have to pick one: multiple bids at once or components.

Re-running... Well, the question is, when. When no bids are k-anon seems like a bad choice since maybe the tenth bid is k-anon but pretty low value, and the 9 before it are only non-k-anon because of a single component ad? Always seems too wasteful?

Paul had some vague ideas about telling about k-anonymity with possibility of lies, but it's pretty hard to reason about.

michaelkleber commented 9 months ago

Everything I've thought of based on noised/lying k-anon values ends up allowing tracking or allowing k-anon circumvention (or both).

I could live with "either multiple bids or components, pick one". I can't prove it's best, but it's certainly the two options that are easiest to think about.

Hey @jonasz and @leobierent12, you have both spend a bunch of time thinking about the use of multiple components in an ad. Are you content with doing your whole multi-component optimization thing twice, once based on all your components (which then can't win if any components are under k-anon) and a second time based on just the k-anon components? That is, does the status quo work for you? We can't think of a way to land "make many bids all at once" in the component context, and if you don't feel like you need both features at the same time, then we will go with a one-or-the-other approach.

jonasz commented 9 months ago

Thanks for the ping, Michael, this issue is very interesting. "Coverage loss" sounds like something we'd like to optimize. Additionally, avoiding the rerunning of generateBid may a worthwhile goal in itself.

I have some ideas, but please let me start with two naive questions, to make sure I'm on the same page.

I was wondering, are we working under the assumption that rerunning generateBid significantly increases the latency of showing the ad? (Do we have to setup a fresh worklet on device? Do we need an additional roundtrip to the server in B&A?)

I do not have any good ideas for a new way to handle component k-anon (without an exponentially large number of bids).

Why would the number of bids grow exponentially? As a thought experiment, let's say we always, unconditionally, call generateBid twice: once with all adComponents, and once with just the k-anononymous adComponents. Each call could then return a list of bids (that is, tuples: (bidValue, ad, adComponents)) and only those bids for whom both the ad and all the adComponents pass k-anonimity can be shown. Would that work? (Plus the usual logic of bumping k-anon counters if the winning bid doesn't pass k-anonimity yet.)

morlovich commented 9 months ago

Thanks for the ping, Michael, this issue is very interesting. "Coverage loss" sounds like something we'd like to optimize. Additionally, avoiding the rerunning of generateBid may a worthwhile goal in itself.

I have some ideas, but please let me start with two naive questions, to make sure I'm on the same page.

I was wondering, are we working under the assumption that rerunning generateBid significantly increases the latency of showing the ad? (Do we have to setup a fresh worklet on device? Do we need an additional roundtrip to the server in B&A?)

We do not need to setup a fresh worklet on device, and we actually reuse the context, so it's cheaper than a separate generateBid call, but if your generateBid takes a while it may still add up.

I don't know what the story with k-anon is on B&A

I do not have any good ideas for a new way to handle component k-anon (without an exponentially large number of bids).

Why would the number of bids grow exponentially? As a thought experiment, let's say we always, unconditionally, call generateBid twice: once with all adComponents, and once with just the k-anononymous adComponents. Each call could then return a list of bids (that is, tuples: (bidValue, ad, adComponents)) and only those bids for whom both the ad and all the adComponents pass k-anonimity can be shown. Would that work? (Plus the usual logic of bumping k-anon counters if the winning bid doesn't pass k-anonimity yet.)

So the concern is that if you return something with, say, 40 component ads, it may be very hard for all of them to be k-anonymous at once, and so you if you may end up having something that's really marginal and is way down your list picked instead of something that was great except it failed the check for 1 out of 40 ads, and where you would have just decided to leave one box empty otherwise if the fallback re-run happened.

Of course, the right answer may be that we don't have to deal with this scenarios in the implementation, the industry can just not use multi-bid if it's not suitable for such a scenario.

(The current prototype re-runs and lets it offer one more bid if no returned bid is k-anon).

I also started on the explainer changes in https://github.com/WICG/turtledove/pull/1048

jonasz commented 9 months ago

So the concern is that if you return something with, say, 40 component ads, it may be very hard for all of them to be k-anonymous at once, and so you if you may end up having something that's really marginal and is way down your list picked instead of something that was great except it failed the check for 1 out of 40 ads, and where you would have just decided to leave one box empty otherwise if the fallback re-run happened.

In that case, though, wouldn't it be enough to rerun generateBid just once? (And this way get a new set of bids, each with all adComponents passing k-anon.)

michaelkleber commented 8 months ago

In that case, though, wouldn't it be enough to rerun generateBid just once? (And this way get a new set of bids, each with all adComponents passing k-anon.)

Right! That's exactly how we do it today. But for people whose generateBid already calculates a bid for every ad, this means we might make them do nearly 2x the work.

With this proposed change, if the first run of generateBid already produces one bid for each possible ad in the IG, then there is no need to rerun — we would be able to know both the top bid and the top k-anon bid after invoking generateBid a single time.

With lots of component ads, it's infeasible for a single call to generateBid to produce bids for every possible thing that the IG might emit. So in that case it seems inevitable that we may need to use the re-running approach.

jonasz commented 8 months ago

With lots of component ads, it's infeasible for a single call to generateBid to produce bids for every possible thing that the IG might emit. So in that case it seems inevitable that we may need to use the re-running approach.

Ah, now I see what you meant by "exponentially large number of bids". Thanks for the details, I think we're on the same page now.

With this proposed change, if the first run of generateBid already produces one bid for each possible ad in the IG, then there is no need to rerun — we would be able to know both the top bid and the top k-anon bid after invoking generateBid a single time.

So - it seems avoiding a single rerun is our goal. I'm assuming we're optimizing CPU load and/or latency.

I'd like to propose an approach that allows for multiBid with adComponents, and mitigates the risk of dropping a bid because a single adComponent is not k-anonymous: let the buyer specify, for each bid, which adComponents are required, and how many adComponents are needed to display the ad. In other words: "here's a list of 40 adComponents; please give me top 15 that pass k-anon, but only if adComponents 1, 2, and 5 are present on that list, and I'm happy to render the ad". Out of the list of the bids returned by generateBid, the first one that passes such requirements would be rendered.

We think this would work well for us: it would give us flexibility to optimize our adComponent selection, optimize the coverage as described in the original post, and also allow us to avoid reruns.

Of course this needs some more work, for example specifying which k-anon counters to bump. Please let me know if the high level idea makes sense, and if it sounds good to you.

(As a side note, as @maciejkowalczyk recently pointed out to me, rerunning generateBid in B&A seems tricky in the presence of payload optimization option 2 - in this approach the list of adComponents is not available to generateBid. I was wondering, did you have a plan to support both rerunning and payload optimization?)

michaelkleber commented 8 months ago

I'd like to propose an approach that allows for multiBid with adComponents, and mitigates the risk of dropping a bid because a single adComponent is not k-anonymous: let the buyer specify, for each bid, which adComponents are required, and how many adComponents are needed to display the ad. In other words: "here's a list of 40 adComponents; please give me top 15 that pass k-anon, but only if adComponents 1, 2, and 5 are present on that list, and I'm happy to render the ad". Out of the list of the bids returned by generateBid, the first one that passes such requirements would be rendered.

Ahh, that is an interesting alternative approach. I guess this makes sense if the bid value itself is not extremely sensitive to the precise choice of components that are going to show; I hadn't considered that possibility.

@morlovich What do you think of this? We would need some modification of the returned object, of course. Just as an example, we could have adComponents with the same meaning as now (the "required" ad components) and padded out with some number of copies of the empty string, to indicate the intended number of components; and then a new field additionalAdComponentsIfKanon with a list to be k-anon-filtered and then used to fill in those empty strings. (Or a whole separate field to indicate the intended total number of components, instead of packing it into the existing array, whichever...)

morlovich commented 8 months ago

I don't think we need to anything too fancy; the bid object can get a acceptIfHasAtLeastNAnonComponentAds or something field; and the component can have another field for required (requiring using full object syntax rather than just-URL shorthand).

I am less clear on what to do with the components that get "thrown out" because they're not k-anon and how to give them a chance to bump the number; the immediate option is to basically split the bid into two, k-anon one and non-k-anon one (will be need to run scoring twice)?

michaelkleber commented 8 months ago

Recall that the bid needs to specify the exact number of components we pass along (which the bidder needs to make sure matches the expectation of the top-level container ad), not just an "at least" value.

For k-anon purposes, I think this would just be treated the same as main bid = "the first N components" and fallback bid = "the first N k-anon components".

jonasz commented 8 months ago

Thank you for the quick response!

Some more thoughts:

morlovich commented 8 months ago

The way I am leaning is more:

  1. If multibid is off, you can't return arrays; the new fields are absolutely ignored (woops, almost implemented this wrong!); presence of browserSignals.multiBidLimit is feature detection for both.
  2. If multibid is on:
    1. Return of a single bid is equivalent to a return of an array with one bid.
    2. If any bid has the new field, perform the component list shrinkage as an alternative to a pure k-anon test, generating its evil twin to participate in non-k-anon auction as well. (Should that be the other components? All the components? The first preferred expected number of components? The last one actually feels most right to me?)
    3. If no bids produced anything usable for k-anon, either by straight up bidding or via messing with the components list, do a filtered fallback run.

This is a bit less compatible but a bit more consistent?

Hmm, should we reduce the number of component ads to the limit if k-anon is off as well? This feels more predictable/easier to work with.

(There is also some semantic space around what happens e.g. when you set your 40th component mandatory and require exactly 15 components, though this seems a bit towards "this is a silly thing to do when you're supposedly ordering by importance, so it doesn't really matter, pick something and document it").

morlovich commented 8 months ago

Hmm, would changing the "this is mandatory" bool to "the first N components ads are mandatory" work?

(Also I think if the "pick M component ads for me" feature is set, we should probably permit returning more than the usual limit on component ads, so long as M fits within that limit)

morlovich commented 8 months ago

Got one more semantics question: do you want it more like: 1) "I give you 100 component ads, please try to give me exactly 14 k-anonymous ones" 2) "I give you 100 component ads, please try to give me at least 14 k-anonymous ones (but more is fine; capped by the global limit of 20/40 depending on state of that experiment's roll out). 3) Something else?

jonasz commented 8 months ago

This is a bit less compatible but a bit more consistent?

What do you mean by less compatible? What I think is important: let's say the changes are implemented in Chrome, but the buyer has not adapted the generateBid in any way yet - will generateBid continue to work, with a potential rerun?

Hmm, would changing the "this is mandatory" bool to "the first N components ads are mandatory" work?

Yes, if this is simpler to implement, this sounds good to me.

(Also I think if the "pick M component ads for me" feature is set, we should probably permit returning more than the usual limit on component ads, so long as M fits within that limit)

Sounds great!

As to semantics: 1 & 2 sound good. We don't mind getting more adComponents. (I'm assuming the ordering of adComponents will be retained, as returned from generateBid.)

morlovich commented 8 months ago

What do you mean by less compatible? What I think is important: let's say the changes are implemented in Chrome, but the buyer has not adapted the generateBid in any way yet - will generateBid continue to work, with a potential rerun?

Generally yes: unless it coincidentally happened to have been setting the new properties on its bids. It's less compatible in this sort of pedantic silliness way: it looks at the new property even if the bid is not inside an array (or another sequence).

morlovich commented 8 months ago

Updated the explainer pull request in https://github.com/WICG/turtledove/pull/1048/ to current line of thinking. (spec update will be... later... since it's considerably more complex in parts)

jonasz commented 8 months ago

Thank you @morlovich , the spec LGTM!

morlovich commented 7 months ago

This is available in an eventually-going-to-be-50% (it takes a while for people to pick up configs, etc) trial in canary/dev...

rdgordon-index commented 6 months ago

So, there's support for multi-bid for component ads, but nothing more, correct?

morlovich commented 6 months ago

You can return as many regular bids at once as the auction configuration permits. (Which is 1 by default, but easily changeable).

Edit: that's if your version is new enough and you're in the experiment group, of course. Right now it's only canary/dev.

morlovich commented 6 months ago

... Beta (targeting 50%, M125) rollout started (actually late in the day yesterday, so it may actually be non-zero now)

rdgordon-index commented 6 months ago

You can return as many regular bids at once as the auction configuration permits. (Which is 1 by default, but easily changeable).

https://github.com/WICG/turtledove/pull/1048/files#diff-d65ba9778fe3af46de3edfce2266b5b035192f8869280ec07179963b81f4e624R34 is primarily focused on adComponents -- and makes no mention of 'regular bids' -- or how this would impact scoreAd (if at all) -- can you provide additional details and/or examples in the explainer so that it's clear if this is a buyer-only change -- albeit one that sellers would welcome to avoid needlessly scoring non k-anon bids?

dmdabbs commented 6 months ago

ScoreAd's spec reflects that the input will now be one-or-more (bidsBatch) generated bids. https://github.com/WICG/turtledove/pull/1055/files#diff-6f5a1d8263b0b0c42e2716ba5750e3652e359532647ac934c1c70086ae3ceddaL1700

morlovich commented 6 months ago

scoreAd is completely unaffected.

For regular bids: before you would do something like:

return { 'bid': 2, 'render': "https://example.org/boo!", };

Now you can, if config allows you too:

return [{ 'bid': 2, 'render': "https://example.org/boo!", }, { 'bid': 3, 'render': "https://example.org/whoa", }];

ScoreAd's spec reflects that the input will now be one-or-more (bidsBatch) generated bids. https://github.com/WICG/turtledove/pull/1055/files#diff-6f5a1d8263b0b0c42e2716ba5750e3652e359532647ac934c1c70086ae3ceddaL1700

No, it's not. If you look further, bidsBatch is transformed into bidsToScore, and then you do:

  1. [=list/For each=] |bidToScore| of |bidsToScore|:
dmdabbs commented 6 months ago

Yes, I was just reading that and was about to post...

Ah, reading past the diff into scoreAd, for each of the bid(s) generateBid may produce, scoreAd is invoked, so is insulated... https://github.com/WICG/turtledove/pull/1055/files#diff-6f5a1d8263b0b0c42e2716ba5750e3652e359532647ac934c1c70086ae3ceddaL1770

rdgordon-index commented 6 months ago

Now you can, if config allows you too:

Thanks for clarifying -- in other words, this is a k-anon buyer "optimization", to save both buyer and seller script runners needless effort -- but once one of the bids meets k-anon, everything downstream of that point remains unchanged. This would be a useful addition and clarification to the explainer.

MattMenke2 commented 6 months ago

It's not strictly a k-anon optimization. It does that, but the seller may reject a higher bid, or score a lower bid higher than a higher bid (e.g., maybe the bid is based on click-through rate, and it thinks the lower bid is more likely to be clicked, or the higher bid may violate a publisher's strict no-ads-for-puppies rule, being strict cat people themselves, or whatnot)

dmdabbs commented 6 months ago

violate a publisher's strict no-ads-for-puppies rule, being strict cat people themselves

😂

rdgordon-index commented 6 months ago

but the seller may reject a higher bid, or score a lower bid higher than a higher bid

Of course -- but that would be the case regardless of whether or not there's more than one candidate bid returned from generateBid, since scoreAd() still only sees them one at a time...

MattMenke2 commented 6 months ago

Yes, but a single interest group could previously only provide one bid. Want multiple bids for an ad slot? Must use multiple, independent[ish] interest groups. With this API, one group can now provide more bids. This capability may be useful to bidders who want to use fewer groups with more diversity of ads.

How many the seller sees at a time doesn't really affect that.

morlovich commented 5 months ago

I've submitted the config to make this go to 1% in stable, though of course it takes some times for it to actually reach that.

morlovich commented 5 months ago

And launch config submitted; as usual it will take some time for everyone-ish to be affected.

dmdabbs commented 2 months ago

I've submitted the config to make this go to 1% in stable, though of course it takes some times for it to actually reach that. And launch config submitted; as usual it will take some time for everyone-ish to be affected.

@morlovich what is the status for FledgeMultiBid in pre-stable?

(edited MattMenke --> morlovich)

morlovich commented 2 months ago

It should be 100% on its own... but I think it's inhibited by the mode a/mode b stuff (and I don't know the status of that)

dmdabbs commented 2 months ago

Thanks @morlovich I verified that my (Canary) instance is unlabeled: navigator.cookieDeprecationLabel is undefined.

Since browserSignals.multiBidLimit is not present, it is not on by default.

And when I do enable it with --enable-features=FledgeMultiBid, browserSignals.multiBidLimit=1

morlovich commented 2 months ago

Very strange. Code-wise it should be on by default since canary 128.0.6551.0. (https://chromiumdash.appspot.com/commit/2bd89946b1b096cf7dcfb67b648d33b210d4ed77)

Could you send me your list of Active Variations: from chrome://version (probably directly via e-mail?)

dmdabbs commented 2 months ago

My apologies, @morlovich. Too many shortcuts and overrides, versions open, &c.

  1. Verified it is on by default, modulo label assignment.
  2. Also reset my expectations regarding a "default multBidLimit" value - there is none. The seller must configure auctionConfig.perBuyerMultiBidLimits, else it's one bid only - either an object or single-entry array.
morlovich commented 2 months ago

No problem, it happens; I am just glad that nothing seems to actually be broken.

dmdabbs commented 2 months ago

I am just glad that nothing seems to actually be broken.

Nothing but my attention to detail. 🙄