WICG / turtledove

TURTLEDOVE
https://wicg.github.io/turtledove/

FLEDGE: joinAdInterestGroup() based on cumulative activity #124

Closed jnsprngs closed 3 years ago

jnsprngs commented 3 years ago

joinAdInterestGroup() allows for adding browsers to an interest group based on a single event (pageview). It also has a built-in recency adjustment mechanism: users are automatically expired from the interest group after 30 days. However, audience segmentation in practice today is based on more than the recency of a single relevant event. It is normally also based on the frequency of an event.

Take, for instance, a browser that views 1 article about the Seattle Sonics. Viewing a single article may not be a strong enough signal to add the browser to a Seattle-nostalgia interest segment. However, if that browser views 5 articles on the topic across 3 websites over the course of a weekend, the cumulative activity is a much stronger signal, and it would then make sense to add the browser to the interest segment.

Can joinAdInterestGroup() functionality be extended to collect events over a period of time and conditionally add browsers to interest groups based on a ruleset passed during the function call? While this logic can be managed via first-party cookies if all activity occurs on a single domain, there would be no way to do this cross-domain, and cross-domain activity is exactly where the whole is greater than the sum of its parts.
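
To make the request concrete, here is a minimal sketch of what such a ruleset check might look like. Everything in it is hypothetical: the `TopicEvent` and `JoinRule` shapes and the `meetsJoinRule` function are illustrative names, not part of FLEDGE or any browser API, and the example sites are placeholders. It only shows the frequency-plus-recency logic from the Sonics example (5 articles, 3 sites, one weekend).

```typescript
// Hypothetical shapes for illustration only; not part of FLEDGE or any browser API.
interface TopicEvent {
  site: string;        // site where the relevant pageview happened
  timestampMs: number; // time of the pageview, in ms
}

interface JoinRule {
  minEvents: number; // e.g. 5 articles
  minSites: number;  // e.g. 3 distinct websites
  windowMs: number;  // look-back window, e.g. roughly a weekend
}

// Returns true when the events inside the look-back window satisfy the rule.
function meetsJoinRule(events: TopicEvent[], rule: JoinRule, nowMs: number): boolean {
  const recent = events.filter(
    (e) => e.timestampMs <= nowMs && nowMs - e.timestampMs <= rule.windowMs
  );
  // Count distinct sites with a plain object used as a set.
  const seenSites: Record<string, boolean> = {};
  for (const e of recent) {
    seenSites[e.site] = true;
  }
  const distinctSiteCount = Object.keys(seenSites).length;
  return recent.length >= rule.minEvents && distinctSiteCount >= rule.minSites;
}

const HOUR_MS = 60 * 60 * 1000;
const weekendRule: JoinRule = { minEvents: 5, minSites: 3, windowMs: 48 * HOUR_MS };

const now = 100 * HOUR_MS;
const fiveArticles: TopicEvent[] = [
  { site: "a.example", timestampMs: now - 40 * HOUR_MS },
  { site: "a.example", timestampMs: now - 30 * HOUR_MS },
  { site: "b.example", timestampMs: now - 20 * HOUR_MS },
  { site: "c.example", timestampMs: now - 10 * HOUR_MS },
  { site: "c.example", timestampMs: now - 1 * HOUR_MS },
];
const oneArticle: TopicEvent[] = [fiveArticles[0]];

const joinAfterFive = meetsJoinRule(fiveArticles, weekendRule, now); // rule satisfied
const joinAfterOne = meetsJoinRule(oneArticle, weekendRule, now);    // rule not satisfied
```

On a single domain, the event list could be kept in first-party storage and the join call made once the check passes. The open question in this issue is where such a list could live when the events come from several domains.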

appascoe commented 3 years ago

While the doc mostly focused on reporting, at NextRoll, we've been thinking that a mechanism like https://github.com/AdRoll/privacy/blob/main/SPURFOWL.md could be used to unlock functionality like this.

jdelhommeau commented 3 years ago

I believe that this is the use case that https://github.com/1plusX/swan/watchers is also trying to solve: allowing several sites to build a profile based on single events. An audience-building script, which can only be executed by the browser and has no network access, can then read that profile information to build interest groups.

anielo commented 3 years ago

Yes, the SWAN proposal would make it possible to collect signals originating from several domains. The domains would be grouped using the notion of first-party-sets that is presented in this W3C CG proposal.

With the consent of the user, the audience definition would also have access to 3rd-party signals using a 3rd-party-set declaration that is very similar to the server-side first-party-set declaration.

The design of SWAN is such that the profile (the set of collected signals) cannot leave the browser so as to preserve privacy.

michaelkleber commented 3 years ago

FLEDGE did indeed only propose the simplest possible way for joining interest groups to work, and there are a variety of other proposals in this space, still under discussion.

I will note that interest group memberships based only on a single site's data are safer, from a privacy point of view, while group memberships informed by cross-site data suddenly open up the possibility of the group membership itself being a piece of information about a person that nobody previously knew. I'm not sure that's compatible with the temporary event-level reporting plan.

jdelhommeau commented 3 years ago

Can't this be said of FLoC, @michaelkleber? FLoC will be doing cross-site tracking and establishing a user's membership in a cluster, providing new information to anyone accessing the user's FLoC.

SWAN seems like FLoC meets Turtledove. Could you elaborate on why you think this poses a greater threat than what FLoC / Turtledove do?

michaelkleber commented 3 years ago

@jdelhommeau FLoC does indeed do something that is riskier than TURTLEDOVE-style interest group memberships! That's why we've been putting so much work into adding protections to avoid misuse. I'm sure FLEDGE users don't want the browser instituting FLoC-like protections, e.g. requiring each interest group to have thousands of people in it or having Chrome block their interest groups based on some correlation with sensitive browsing behavior. This is exactly why we need to be careful of how interest groups can be formed.

jdelhommeau commented 3 years ago

Thank you @michaelkleber for the clarification, that makes sense.

anielo commented 3 years ago

Thanks for your clarifications @michaelkleber. From my perspective your argument regarding the size and sensitivity of interest groups also holds for TURTLEDOVE (without SWAN or similar). A small or sensitive interest group is equally problematic whether it is computed based on data from one or from several domains. Could you please explain to us why this would be different?

michaelkleber commented 3 years ago

In TURTLEDOVE without any add-ons, the targeting criteria that are available for showing individual ads are the same ones available for creating interest groups.

Certainly it is possible for a site to use whatever data it has about people to target ads in a discriminatory way. Separating that into two steps (put into interest group, target interest group) doesn't particularly change the capability — anyone targeting ads of course needs to be aware of how they are doing so, but the same considerations apply in both cases.

jnsprngs commented 3 years ago

Hi @michaelkleber,

Can you elaborate with an example of what you meant here?

> group memberships informed by cross-site data suddenly open up the possibility of the group membership itself being a piece of information about a person that nobody previously knew.

It seems to me that, as discussed here, allowing some-ssp.com to add a user to an interest group owned by some-buyer.com has a similar outcome: it reveals cross-site information that some-buyer.com did not previously know, if the user later visits some-buyer.com's site.

michaelkleber commented 3 years ago

First let me point out that when you go to some-buyer.com, there is no browser-provided way for them to know that you happen to also be in one of their interest groups.

But during the early FLEDGE stage of this effort, while we still allow event-level reporting, it is possible to learn what interest group's ad appeared on a page. As long as information about group membership has a way to flow out, we need to be more careful about how groups are constructed.

jnsprngs commented 3 years ago

Yes, okay clear.

As folks have noted in this thread, there have been a few attempts to allow for interest groups based on cumulative activity. Given the understanding that more observations translate into better-constructed interest groups, I'd like to see us include this type of feature in testing so we can better understand to what extent group membership divulges new information about browsers and how it can be (ab)used.

Thanks.

eroncastro commented 3 years ago

> First let me point out that when you go to some-buyer.com, there is no browser-provided way for them to know that you happen to also be in one of their interest groups.

@michaelkleber So, let's suppose we have groups A and B that are allowed to bid in an on-device auction at publisher.com. Since we have no idea which interest groups the browser is in, and we want to ensure that A and B will participate in the auction, must we add the browser to those groups before starting the auction?

michaelkleber commented 3 years ago

@eroncastro I don't think it makes much sense to put a person into an interest group and then use that interest group immediately, on the same page. The key reason that interest groups are useful is because they contain information about what ad campaign someone might want to see, based on some action they took before getting to the current page.

So you would add a person to interest groups A and B when you saw that person doing something that made you believe you want to show them particular ad campaigns, and then when the person gets to publisher.com they might see ads targeting groups A or B.
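
The ordering described here (join on an earlier visit, bid on a later page) can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: `ToyInterestGroupStore` and its methods are hypothetical names, the model is not how the browser stores interest groups, and the 30-day duration is taken from the expiry mentioned at the top of this issue.

```typescript
// Toy in-memory model for illustration; not the browser's implementation.
const DAY_MS = 24 * 60 * 60 * 1000;

class ToyInterestGroupStore {
  // Maps group name to the time (ms) at which membership expires.
  private expiryByGroup: Record<string, number> = {};

  // Record membership starting at joinTimeMs and lasting durationMs.
  join(name: string, joinTimeMs: number, durationMs: number): void {
    this.expiryByGroup[name] = joinTimeMs + durationMs;
  }

  // Names of groups still active at nowMs, sorted alphabetically.
  eligibleGroups(nowMs: number): string[] {
    return Object.keys(this.expiryByGroup)
      .filter((name) => nowMs < this.expiryByGroup[name])
      .sort();
  }
}

const store = new ToyInterestGroupStore();

// Earlier visit: the person does something that suggests campaigns A and B.
const visitTimeMs = 0;
store.join("A", visitTimeMs, 30 * DAY_MS);
store.join("B", visitTimeMs, 30 * DAY_MS);

// A week later, on the publisher's page, both groups can take part.
const atPublisherLater = store.eligibleGroups(7 * DAY_MS);

// After the 30-day duration has passed, neither group remains.
const afterExpiry = store.eligibleGroups(31 * DAY_MS);
```

In this model nothing stops a caller from joining and reading the store on the same page. The point of the sketch is only that membership is meant to carry information from an earlier action forward to a later page.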

jnsprngs commented 3 years ago

@michaelkleber I would say your reasoning would extend from before to before, during and after. The trouble with after is the whole "browser cookies are not fortune cookies" thing. But the current context is incrementally more valuable than only knowing information about what happened before.

But maybe there's a privacy reason not to allow adding users to an IG immediately.
