WICG / turtledove

TURTLEDOVE
https://wicg.github.io/turtledove/

The need for a K-anonymity enforcement test phase #867

Open fhoering opened 9 months ago

fhoering commented 9 months ago

Currently, k-anonymity is not enforced, and AFAIK there has been no communication on when it will be.

As a follow-up to this ticket, it seems important to be able to experiment with different k-anonymity strategies in order to choose the one that performs best. It also seems important to be able to test the impact of different strategies during the CMA test and before Chrome actually enforces k-anonymity.

In the discussions it is generally mentioned that buyers can already assess k-anonymity on their own from their own logs, but this seems true only as long as 3rd party cookies stay available. Once 3rd party cookies start being deprecated, the buyer no longer has a full view of which ad has been displayed to which user and cannot run any simulations on their own. Even with 3rd party cookies available, this would rely entirely on server-side measurements, which don't use the real Chrome on-device enforcement code and can therefore wrongly assess certain strategies.

This PR takes a first step by adding the k-anonymity flag to the reportWin API. However, that does not seem to be enough, as it doesn't take into account the generateBid pass in which renderUrls that are already k-anonymous get a 2nd chance to bid and therefore also a chance to be displayed and extend their k-anonymity (which is a rolling average over 7 days).

Ideally, we need a simulation mode in which each buyer can simulate the k-anonymity status with some kind of A/B test. For this to work, the buyer should have access to the global k-anonymity status of each of its ads from inside generateBid and be able to increase the counters for the submitted renderUrls. The buyer could then simulate the second generateBid pass itself by removing all ads that are not k-anonymous and choosing the best ad again.
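
To make the buyer-side simulation concrete, here is a minimal sketch of the two-pass selection. Everything in it is hypothetical: `pickBestAd`, `simulateTwoPassBid` and the `isKAnon` lookup are illustrative names, not part of the Protected Audience API, and "best ad" is assumed to mean "highest bid".

```javascript
// Hypothetical sketch: none of these names exist in the Protected Audience API.
// `ads` is a list of { renderUrl, bid } objects; `isKAnon` is an assumed
// lookup returning the simulated k-anonymity status of a render URL.
function pickBestAd(ads) {
  // Highest bid wins; returns null for an empty list.
  return ads.reduce(
    (best, ad) => (best === null || ad.bid > best.bid ? ad : best),
    null
  );
}

function simulateTwoPassBid(ads, isKAnon) {
  const firstPass = pickBestAd(ads);
  if (firstPass === null || isKAnon(firstPass.renderUrl)) {
    return { winner: firstPass, usedSecondPass: false };
  }
  // Second pass: drop every ad that is not k-anonymous and choose again.
  const eligible = ads.filter((ad) => isKAnon(ad.renderUrl));
  return { winner: pickBestAd(eligible), usedSecondPass: true };
}
```

Comparing the first-pass and second-pass winners over a sample of auctions would give a buyer a rough estimate of what enforcement costs under a given strategy.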

Having access to a counter would make it possible to experiment with different thresholds. For example, during the CMA testing phase PA API will be available on at least 1% and up to 9.5% of traffic only. That would have a big impact on the k-anonymity threshold, which was announced as 50 over 7 days but would apply to the full Chrome & Chrome mobile web scope. If a full counter is not possible for various reasons, having access to a few flags such as >=50, >=25, >=10 could be an option.
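
As an illustration of the flag-based alternative, the sketch below maps a raw count to the three coarse flags suggested above. The function and field names are made up for this example; only the thresholds 50, 25 and 10 come from the proposal.

```javascript
// Hypothetical: the thresholds come from the proposal above, the names do not.
const THRESHOLDS = [50, 25, 10];

function kAnonFlags(count) {
  // Expose only whether each threshold is met, never the count itself.
  const flags = {};
  for (const t of THRESHOLDS) {
    flags[`atLeast${t}`] = count >= t;
  }
  return flags;
}
```

For a count of 30 this yields `{ atLeast50: false, atLeast25: true, atLeast10: true }`, so a buyer could compare strategies at K=25 or K=10 without ever seeing the exact counter.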

Once the simulation phase is finished the ecosystem could move to full k-anonymity enforcement and the temporary k-anonymity status API could be removed.

itaysharfi commented 6 months ago

The details of the K-Anon enforcement plan can be found here

We plan to release K-anonymity initially with a threshold of K=10. Our internal evaluation has indicated that going lower than that is challenging due to the inability to add differential noise.

We've refrained from allowing the configuration of K for evaluation due to implementation issues, a lack of demand for higher K, the ability to analyze K-anon impact from logs, along with the difficulty in predicting utility with lower traffic.

Unfortunately, we won't be able to provide the exact counter, as it poses the risk of disclosing user membership in an Interest Group to anyone.

Let's discuss kanonStatus in #714.

Please let me know if this answers your question.

jaimonino commented 5 months ago

Julien Aimonino, Product Manager, Criteo

We would like to share feedback on Chrome's k-anonymity enforcement plan and re-start the discussion about our proposal above.

(1) If one thinks about testing in Q1/Q2:

Given the current enforcement plan, DSPs concretely cannot test k-anonymity in H1. We suggest that Chrome reconsider the framework we proposed in this issue. Under Chrome's current plan, k-anonymity enforcement will actually apply when the 3PCD process starts, without prior testing, which is exactly the situation we would like to avoid.

(2)

> Unfortunately, we won't be able to provide the exact counter, as it poses the risk of disclosing user membership in an Interest Group to anyone.

We don't need access to the counters forever, only for a testing phase once bidding traffic reaches a reasonable scale, so that we can tune our k-anon strategy.

rdgordon-index commented 5 months ago

> In order for this to work, the buyer should have access to the global k-anonymity status for each of its ads from inside generateBid

In a similar vein, when generateBid is called a second time, there's nothing to differentiate this from the initial call -- and, by extension, it's not at all clear that scoreAd's choice to submit this bid for consideration at the top level was discarded, or why -- it simply looks like the auction aborted.

rdgordon-index commented 5 months ago

> https://github.com/WICG/turtledove/pull/714 does one first step to add the k-anonymity flag to the reportWin API.

What about a rejectReason in scoreAd(), which would help make it visible that a bid that was otherwise acceptable (i.e. marked with a non-zero score) was rejected because it didn't meet the k-anon threshold?

cf. https://github.com/WICG/turtledove/blob/main/Proposed_First_FLEDGE_OT_Details.md#reporting

rdgordon-index commented 5 months ago

> The details of the K-Anon enforcement plan can be found here

From within that plan (emphasis added):

> In Q1 2024, for up to 20% of Chrome Stable traffic, excluding Mode A and Mode B experimental traffic, we will begin to check k-anonymity with the same parameters.

Without any way to easily detect k-anon enforcement for this <=20% of traffic, from a seller's perspective it's going to be very challenging to measure any behaviours. As noted in my earlier comments, this appears to be entirely invisible to the seller in scoreAd(), as the bids are submitted independently, so there's no indication that "another" bid is being submitted due to k-anon threshold considerations. The cohort where k-anon is being enforced is also not feature-detectable on the page, or by the seller, AFAIK (cf. https://github.com/WICG/turtledove/blob/main/PA_Feature_Detecting.md).

This means that any measurement of IG bid activity or volumes will be inflated by k-anon enforcement, and potential eligibility/win rates will be skewed as a result.

brusshamilton commented 2 months ago

The cohort where k-anonymity is enforced is feature-detectable on the page. The navigator.deprecatedRunAdAuctionEnforcesKAnonymity field will be true when navigator.runAdAuction enforces k-anonymity. Unfortunately there is a known issue where navigator.deprecatedRunAdAuctionEnforcesKAnonymity is true for a subset of the Mode A and Mode B traffic where k-anonymity is not actually enforced.

So the correct logic for detecting k-anonymity enforcement is:

kAnonymityEnforcementActive = !isModeA() && !isModeB() && navigator.deprecatedRunAdAuctionEnforcesKAnonymity

Where isModeA() and isModeB() are assumed to be functions that return true if this is part of the Mode A or Mode B experimental traffic.
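
A runnable version of that check might look like the sketch below. It takes the navigator object and the two mode predicates as parameters so it can be exercised outside a browser; `isModeA` and `isModeB` remain assumed helpers that the caller has to supply, since nothing here says how to implement them.

```javascript
// isModeA and isModeB are assumed, caller-supplied predicates; this thread
// does not define them. `nav` is expected to be window.navigator.
function isKAnonymityEnforcementActive(nav, isModeA, isModeB) {
  // Mode A/B traffic is excluded first because of the known issue where the
  // navigator field can be true even though enforcement is not active there.
  return (
    !isModeA() &&
    !isModeB() &&
    nav.deprecatedRunAdAuctionEnforcesKAnonymity === true
  );
}
```

On a page this would be called as `isKAnonymityEnforcementActive(navigator, isModeA, isModeB)`. The strict comparison means browsers that do not expose the field at all are treated as not enforcing.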

Note that in Extended Private Aggregation Reporting, a seller rejection reason of 8 indicates that the creative would have won the auction except it failed the k-anonymity check. See https://github.com/WICG/turtledove/blob/main/FLEDGE_extended_PA_reporting.md#reporting-api-informal-specification.