Incrementality testing and optimization?

ablanchard1138 commented 3 years ago

Hi,

In my opinion FLoC focuses too much on attribution-based advertising and does not offer any solution for more "truthful" measurement such as incrementality. Do you plan on supporting it? If not, I feel that FLoC constitutes a step back in our goal to improve Web Advertising.

It is reasonable to assume that clustering people by interests, as inferred from their browsing history, will produce cohorts whose defining characteristics correlate strongly with the attributed outcome of an advertising event. In simpler words, a flock of "smartphone enthusiasts" (people who have read smartphone reviews, browsed smartphone product pages on ecommerce websites, etc.) could end up buying a smartphone anyway, regardless of whether they see an ad. A post-view or post-click attribution would give credit to ads shown to this "organic buyers" group, with no way to distinguish it from the group on which the ads had an actual causal effect.

That is a long-known flaw of attribution-based models, and many marketers have moved or are moving away from them to focus on incremental lift. Cookie-based targeting and measurement allow for incrementality testing and piloting of marketing campaigns, and theoretically TURTLEDOVE/FLEDGE could too (one could imagine marketers building their cohorts with an incremental goal in mind and measuring it by splitting cohorts into test and control groups; not the most convenient, but it could happen).

However, I don't see how we could measure and optimize for Incrementality in FLoC:

the rules by which cohorts are constituted will be largely unknown (and we can assume that people are not clustered together because of similar potential incremental ad lift),
there is no means to exclude users from a flock to constitute a control group,
outside of the flock_id there is no variable that would allow marketers to remove "organic buyers" from these flocks (for example: people who bought a fridge online will still be in the fridge flock, but we can be pretty certain that they won't buy another one anytime soon, so showing them more ads serves neither them, the advertiser, nor the publisher).

I understand that in the Privacy Sandbox the measurement proposals are separated from the targeting ones, but in the incrementality context the two pieces need to be connected for A/B testing to be possible. Do you envision support for such A/B testing capabilities in FLoC?

michaelkleber commented 3 years ago

In a world where you had flock and nothing else, the best approach to A/B testing would presumably be to target your ads at some flocks and not others. A targeting model that predicted an advertiser's p(conversion | flock) would probably indicate a bunch of flocks with about the same value. Holding back a random subset of them seems reasonable, even if the clusters mean it is noisier than randomly holding back targeting on a person-by-person level.
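To make the hold-back idea concrete, here is a minimal Python sketch. It assumes you already have some model's predicted p(conversion | flock) per flock; the function name, the bucket width, and the 20% holdout fraction are all hypothetical choices for illustration, not part of any FLoC API.

```python
import random
from collections import defaultdict

def assign_flock_holdout(predicted_rate, bucket_width=0.005,
                         holdout_fraction=0.2, seed=0):
    """Randomly hold back a fraction of flocks within each bucket of
    similar predicted conversion rate (hypothetical helper)."""
    rng = random.Random(seed)

    # Group flocks whose predicted rates fall in the same bucket.
    buckets = defaultdict(list)
    for flock_id, rate in predicted_rate.items():
        buckets[int(rate / bucket_width)].append(flock_id)

    assignment = {}
    for _, flock_ids in sorted(buckets.items()):
        flock_ids = sorted(flock_ids)  # stable order before shuffling
        rng.shuffle(flock_ids)
        n_holdout = int(len(flock_ids) * holdout_fraction)
        for i, flock_id in enumerate(flock_ids):
            assignment[flock_id] = "holdout" if i < n_holdout else "target"
    return assignment

# Made-up predictions: 50 flocks with roughly the same predicted value.
predicted = {f"flock_{i}": 0.011 + 0.0001 * (i % 10) for i in range(50)}
assignment = assign_flock_holdout(predicted)
```

Randomizing within buckets keeps the targeted and held-back flocks comparable on predicted value, though the unit of randomization is still the flock, so the estimate stays noisier than a per-person holdout.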

It seems to me that any A/B diversion other than flock would require rendering the chosen ad inside a Fenced Frame and using Aggregate Reporting, for the usual tracking reasons. The PTEROSAUR proposal from @gjlondon explores that idea (not sure why that's in a non-merged pull request) and seems in line with the infrastructure we already plan to build.

gjlondon commented 3 years ago

The PTEROSAUR proposal remains in PR because I expect it'll need revisions when more details become available about how Fenced Frames will work and about what reporting mechanisms will be made available in TURTLEDOVE/FLEDGE. (Perhaps I should go ahead and merge it and make revisions in a new PR if/when it's necessary).

On this specific point, it may help to clarify a bit what's meant by incrementality. "Traditionally", incrementality estimates the counterfactual impact of making an "intervention" in a population, e.g. showing an ad. So you could estimate p(conversion | served ad) - p(conversion | not served ad).

Since FLOCs (AFAIK) are determined by the browser and cannot be controlled or influenced by advertisers, it doesn't really make sense to think about p(conversion | in flock 1) - p(conversion | in flock 2) since "moving a user from flock 1 to flock 2" isn't a possible intervention.

We could however ask about interesting joint probabilities, e.g. p(conversion | in flock 1 AND saw ad) - p(conversion | in flock 1 AND didn't see ad), which would tell you the counterfactual impact of advertising to people in flock 1. You could then compare that to the counterfactual impact of advertising to flock 2 and optimize spending towards flocks on whom your particular ad campaign is most effective.
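As a rough illustration of that comparison, the sketch below computes the per-flock difference from aggregate counts. The counts are made up, and the data layout and function names are assumptions for the example; nothing here reflects what a real reporting mechanism would return.

```python
def conversion_rate(conversions, users):
    """Observed conversion rate; 0.0 if the group is empty."""
    return conversions / users if users else 0.0

def lift_by_flock(counts):
    """counts maps flock_id -> {"exposed": (conversions, users),
                                "control": (conversions, users)}.
    Returns p(conversion | flock AND saw ad)
          - p(conversion | flock AND didn't see ad) per flock."""
    lifts = {}
    for flock_id, arms in counts.items():
        exposed = conversion_rate(*arms["exposed"])
        control = conversion_rate(*arms["control"])
        lifts[flock_id] = exposed - control
    return lifts

# Made-up numbers: flock_1 converts more overall, but mostly organically.
counts = {
    "flock_1": {"exposed": (60, 1000), "control": (50, 1000)},
    "flock_2": {"exposed": (30, 1000), "control": (10, 1000)},
}
lifts = lift_by_flock(counts)
best = max(lifts, key=lifts.get)  # flock with the largest estimated lift
```

In this toy data flock_1 has the higher raw conversion rate but flock_2 has the larger lift, which is the distinction between attribution and incrementality raised at the top of the thread.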

In order to calculate those joint probabilities, we'd need a way to make users' flock IDs available to an incrementality measurement mechanism like PTEROSAUR. That certainly seems possible in principle, but whether it'll be possible in practice will depend on what reporting mechanisms are made available to PTEROSAUR (which, as mentioned above, still seems to be an open question but hopefully will be worked out as part of the ongoing WICG discussions of FLEDGE).

wanderingrover commented 3 years ago

Any idea how Advertisers can target FLoCs? I'm assuming that a Publisher will make a FLoC available in the bid stream and an ad network/DSP can read the FLoC for targeting depending on the FLoCs that align with the Advertiser's targeting use case? And how does the system decide if an ad should be served based on contextual data, a FLoC, or an interest based audience via Turtledove? @michaelkleber

michaelkleber commented 3 years ago

I'm assuming that a Publisher will make a FLoC available in the bid stream and an ad network/DSP can read the FLoC for targeting depending on the FLoCs that align with the Advertiser's targeting use case?

Yes, that's it!

And how does the system decide if an ad should be served based on contextual data, a FLoC, or an interest based audience via Turtledove?

There's no need to choose among contextual data, first-party publisher's data, and FLoC — we expect a bid request to contain all of them, and they can all be used together for targeting.

TURTLEDOVE is indeed a separate path. It involves an on-device auction, where the winning bid price from the contextual+1p+FLoC auction can act as a floor.
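A toy Python sketch of that floor relationship, purely for illustration: the function name, the tuple return shape, and the rule that ties go to the contextual winner are assumptions of this example, not anything specified by TURTLEDOVE.

```python
def run_on_device_auction(interest_group_bids, contextual_winning_bid):
    """Pick a winner where the contextual+1p+FLoC winning bid acts as a
    floor for the on-device interest-group bids (hypothetical helper)."""
    best_interest_bid = max(interest_group_bids, default=None)
    # Assumption: an interest-group bid must strictly beat the floor to win.
    if best_interest_bid is not None and best_interest_bid > contextual_winning_bid:
        return ("interest_group", best_interest_bid)
    return ("contextual", contextual_winning_bid)

# Made-up bid values.
outcome_a = run_on_device_auction([1.20, 2.50], contextual_winning_bid=2.00)
outcome_b = run_on_device_auction([1.20, 1.80], contextual_winning_bid=2.00)
outcome_c = run_on_device_auction([], contextual_winning_bid=2.00)
```

Here `outcome_a` goes to the interest-group ad because 2.50 clears the floor, while `outcome_b` and `outcome_c` fall back to the contextual winner.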

WICG / floc

Incrementality testing and optimization? #39