WICG / turtledove

TURTLEDOVE
https://wicg.github.io/turtledove/

3pc deprecation: ramp-up duration #717

Open jonasz opened 1 year ago

jonasz commented 1 year ago

Hi,

I'd like to raise our concern that the cookie deprecation ramp-up is currently planned to take only two months, according to the timeline at https://privacysandbox.com/open-web/#the-privacy-sandbox-timeline.

The 1% test next year is an important step, but it will not allow us to predict what will happen during the broader deprecation: at larger scale, the ecosystem dynamics will differ significantly. To ensure a smooth transition, we think the ramp-up should be more prolonged, more gradual, and known in advance.

If interested, please see some more thoughts on this at https://blog.rtbhouse.com/the-importance-of-gradual-third-party-cookie-deprecation-for-the-successful-transition-of-the-advertising-ecosystem/

I'd be eager to learn what others think.

Best regards, Jonasz

thegreatfatzby commented 1 year ago

I agree overall. After much discussion, including with the folks monitoring these threads, I think I'd frame it as:

  1. B & A and Infra Operations: I understand that the current set of APIs has been developed for quite some time, supports many use cases, will raise the bar on privacy, and is thought of as a 1.0. Whether or not I agree that the APIs are at MVP/1.0 feature-wise, I think the less well developed end of this is B & A. The infrastructure of this new ad tech system will be just as critical to the successful release and improvement of privacy as the APIs themselves, if not more so, especially given how much of the value add of ad tech is in low latency at scale. In other words, if your logical APIs are at 1.0 but your infrastructure is not, and you're supporting millions of requests per second with sub-second latency needs, then you're not at 1.0.
  2. Sequencing: these APIs are coming around to being fully released in production (all the Chromes), and we'll start seeing some real traffic in January '24, followed by deprecation from July to September '24. In general, when releasing a production system, the sequencing would be to 1) release the new thing, then 2) shift traffic gradually from old to new, then 3) make sure things are working for a while and go through dev cycles to fix logical and scaling bugs, and then 4) begin to slowly deprecate the old system. In this case it seems we are assuming a pretty happy release path and skipping, or giving short shrift to, (3). Especially given that the dev cycles here won't be 2-week sprints where a small team of engineers do as their Product Manager commands (engineers never push back, right?), I think (3) needs more time and structure.
  3. Evolution of APIs vs 3PC Cliff: I am completely onboard with the idea that there will be iterations over time to these APIs to better support issues and use cases as they come up. I want to emphasize that I see the amount of resources Google has put in and is planning to put in, and the level of care the individuals working on this have. The disconnect I have is that iterating on the APIs and adding new functionality in years 2, 4, etc., doesn't help a competitor who is out of business because the 3PC went away in 2024. While I don't think browsers should be designed to support any particular business model or business, the unique challenges that Google (and frankly my company, Microsoft) faces with its combination of businesses warrant discussion.

I think those three together present challenges. I'm up for creative problem-solving on these things, because I do think this is an excellent and ambitious effort that I want to be a part of.

thegreatfatzby commented 1 year ago

@jonasz I am curious if this is something you'd want to bring up in a WICG meeting or some such?

dmdabbs commented 1 year ago

We raised testing, not the cookie deprecation ramp, in a FLEDGE call this summer. Since it is a cross-feature topic, it was deferred to some future, more appropriate venue. When the Sandbox team schedules 'office hours' calls, I expect this will be a leading topic. The PS team intends to provide "an update on the testing modes in mid-August," so perhaps that is when they will schedule those calls.

thegreatfatzby commented 1 year ago

@alextcone-google and others, had a thought on sequencing criteria. Google (and any other similarly vertically integrated digital company) faces a difficult balancing act here of herding the industry without the criteria being "everyone is happy". You are trying to shift the design and operations of an industry you have a lot of sway over, in a way that will have an impact on engineering as well as revenue.

Similar to withdrawals from unpopular wars, withdrawals from unpopular APIs with hard date-based deadlines may lead to regrets, but with no deadline nothing happens. How to generate that pressure without inviting infinite delay tactics is tricky... well, thought experiment:

When I hear someone say "I have concerns that Private Advertising as a whole, or a specific implementation of it, would hurt my business, and you should wait till I can adjust", I hear the concern but think it can only carry so far, as browsers shouldn't owe anything to specific business models. Whether 3 years is enough is an interesting question, but I don't think that concern on its own justifies ongoing delay.

When I hear someone say "I have concerns that this implementation of Private Advertising doesn't sufficiently cover existing use cases, and you should add them before releasing", I hear that concern but have a similar reaction to the above.

However, when I hear someone say "I have concerns that the sequencing of this release is ignoring the substantial operational challenges we are likely to hit in such a large shift", I think that makes sense in a different way that should matter to Google or any other device/browser maker shifting to PPA, because I think the browser should owe stable operations to the ecosystem it is in symbiosis with.

In other words, I don't think a business that is inherently harmed by the idea of private advertising has a good case; but a business harmed incidentally, due to a lack of operational readiness on the part of the browser, has a pretty good case. Soooo...

I wonder if Google Chrome (and other similarly situated vertically integrated digital companies) could use a combination of dates and functional requirements, but with functional requirements that would demonstrate engineering viability at scale rather than any revenue-based goal. We can spitball on what this means exactly, but the criteria could be that at least one company is getting X latency at Y volume with Z stability for T time in prod.
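To make the shape of such a requirement concrete, here is one way it could be written down. This is a minimal sketch; every name and threshold is invented for illustration, standing in for the X/Y/Z/T above, and is not from any spec, commitment, or announced plan:

```js
// Hypothetical engineering-viability gate for a deprecation ramp step.
// All field names and thresholds are placeholders for X, Y, Z, and T.
const viabilityCriteria = {
  p99AuctionLatencyMs: 200,   // X: tail latency per auction in production
  sustainedQps: 1_000_000,    // Y: volume the stack must absorb
  auctionSuccessRate: 0.9995, // Z: share of auctions completing without error
  soakPeriodDays: 90,         // T: continuous time in prod at these levels
};

// The next ramp step would proceed only once at least one company has
// held every threshold for the full soak period.
function readyToRamp(observed) {
  return (
    observed.p99AuctionLatencyMs <= viabilityCriteria.p99AuctionLatencyMs &&
    observed.sustainedQps >= viabilityCriteria.sustainedQps &&
    observed.auctionSuccessRate >= viabilityCriteria.auctionSuccessRate &&
    observed.soakPeriodDays >= viabilityCriteria.soakPeriodDays
  );
}
```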

Of course, the challenge is getting that "at least one company", but the nature of the vertically integrated case provides, I think, an interesting answer: the browser-and-cloud provider also has an ad tech business that can help with that.

Google/MSFT are in an interesting position to demonstrate engineering reliability at scale, and while there will always be items we can't tick off until everyone uses it for 100% of traffic, I think some level of engineering reliability at scale is a reasonable ask of such companies and would provide valuable testing time, incentive, and buy-in through dogfooding:

"would you rather that frustration be passed on to your company’s customers? Of course not!" - jah!

alextcone-google commented 1 year ago

Thanks for sharing your thinking. I'm curious how you see your suggested approach differing from the extension we announced last summer and the CMA's testing guidance.

jonasz commented 1 year ago

> @jonasz I am curious if this is something you'd want to bring up in a WICG meeting or some such?

Yes, I think this is a good idea; I'll add it to the agenda for the PA API WICG call.

thegreatfatzby commented 1 year ago

@alextcone-google thanks for those links. I have done 1.5 passes through them and some of the original documents from the CMA, and I think I understand the goals of the Mode B testing, the CMA's listed concerns and Google's responses, and their request for specific metrics better than before. It has been quite helpful, and fun, to add this to my thinking.

I need a bit more time to meditate with the documents, which I hope to have later this week or next, at the very least before TPAC. But, to practice "perfect is the enemy of the good", let me put down some thoughts that are not entirely sorted.

Specific to My Above Comments

The main gap between what I'm asking for and what I've seen so far in the docs is specificity around "metrics to prove this is sufficiently working" and "criteria to decide". So far my sense is that the CMA is asking the industry for general business-relevant metrics, both those included in their list (revenue, performance, clicks, convs, etc.) and those not, to help them in their thinking as to whether to close the investigation, and I haven't yet seen any particular decision criteria from them.

My above thinking would push for something a bit more specific, with pieces on the operational side that focus on long-term viability at scale, of the kind we'd typically want when building a business-critical system (which this is).

This may reflect the state and nature of this process: the folks at the CMA may very well be thinking the exact same thing but not yet know what to put down, and we just need to push the general understanding they have towards specifics that would sustain competition in the long term, working forthrightly through some discomfort in the uncertainty while we all help to establish that kind of specificity. If that's where we're heading, then hooray.

CMA, Competitive Concerns, Commitments, Clarity, and Mode B Testing

Now that you have exposed me to this new level of thinking, let me try to factor ^ into something broader.

Mode B Testing

It seems the thinking is that the impact on the market can be assessed with a stable testing period which:

  1. Includes a) only the on-device auction as currently available in Chromium and b) only Chrome, not Android.
  2. Has commitments from Google to not favor its open internet ads business with data from Chrome usage, specific features, etc.
  3. Compares metrics from the non-Google Open Internet Ad Techs and Google's Open Internet components, with decisions then made based on some criteria.

The idea is that (1) gives sufficient API and System clarity for Mode B testing between Jan 1st and July 1st to allow market impact to be assessed in real-world conditions. In particular this will give the CMA enough to decide whether it feels it needs to continue its investigation**. (2) will prevent longer-term market centralization due to dominant market positions held by different components of a vertically integrated company.

This does not seem sound to me, and I think a lot of my thinking here is related to on-device vs server-based auctions.

Market Clarity

(1a) is insufficient, as I believe the move from an On-Device-Only design to an On-Device-And/Or-B&A-Server-Side design is disruptive to the stable testing period, its targeted analysis, and the industry, in a way I don't see discussed in the CMA's documents. Uncertainty in the infrastructural component of implementing any system with ad tech's complexity, scale, and latency needs is just as, or even more, damaging to planning than changes to the API itself.

(1b) I haven't thought as much about so I'll leave that for now.

Market Competition Concerns

(2) I think is misleading: preventing centralization in the Open Internet Advertising Market is good, but that is only a part of the Entire Digital Advertising Market, and the balance between Open Internet Advertising and Walled Garden Advertising within that larger market is more impactful to consumers in the long run. On-Device for 3PD auctions is likely to favor Walled Gardens and shift the larger market in that direction, favoring Google and MSFT.

Google Open Internet and Other Ad Tech Open Internet may (may) compete on a level-ish playing field with On-Device only for Open Internet Publishers, but On-Device-Only significantly favors Walled Garden Advertising, because walled garden auctions don't need to run the FLEDGE auction at all.

So while On-Device may (may) be a level testing field for Google and other Ad Techs specifically w/r/t Open Internet Advertising, I think it is misleading because it is not level between Google/MSFT/any vertical with a walled garden and Open Ad Tech. Once 3PC are gone, sandbox iterations years in the future won't undo walled garden centralization in the larger market of All Digital Advertising.

A move of publishers from Ad Tech Open Internet Advertisers to Walled Garden Advertising would be far more disruptive to content competition in the long run, and assessing one without the other is likely to miss key long-term content-and-advertising market changes.

(3) is discussed in the first section.

Questions...

Can you point to performance metrics on the on-device auctions, in particular w/r/t sites with many slots to fill, each of which needs to run its own auction? How will these perform in-browser, with lots of calls going over the wire to the KV servers?
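For readers less familiar with the API, the per-slot pattern behind this question looks roughly like the sketch below (inside an async context). All URLs, origins, and signal contents are placeholders, and real integrations carry far more configuration; the point is only that each slot triggers its own auction, its own worklets, and its own key-value fetches:

```js
// Minimal sketch: one Protected Audience auction per ad slot.
// All URLs and origins here are placeholders, not real endpoints.
const slots = ['slot-1', 'slot-2', 'slot-3']; // news pages can have 10+

const auctions = slots.map((slot) =>
  navigator.runAdAuction({
    seller: 'https://ssp.example',
    decisionLogicURL: 'https://ssp.example/decision-logic.js',
    trustedScoringSignalsURL: 'https://ssp.example/scoring-signals',
    interestGroupBuyers: ['https://dsp.example'],
    auctionSignals: { slot },
    // Each buyer's generateBid() also pulls real-time data from the
    // trustedBiddingSignalsURL set when the interest group was joined.
  })
);

// N slots means N sets of bidding/scoring worklets plus N rounds of
// KV-server round trips, all sharing one device's CPU and network.
const winningAds = await Promise.all(auctions);
```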

Finally, I can see a response here being: yep, that should all come out in the metrics in Mode B...I will think about this, but as long as we analyze that, then good.

** I would be very curious to understand the CMA's null hypothesis here, i.e. does it see this as "this is anti-competitive and we need to see proof it's not" vs "we want to take a hands-off approach but are open to proof that it's a problem".

alextcone-google commented 1 year ago

@thegreatfatzby, your comment appears to be directed at the CMA's approach. The only question I see that's directed at Chrome is whether we can point to performance metrics for on-device auctions. You answered that question for yourself. I appreciate that you looked into the commitments and CMA's testing guidance. We encourage anyone who is considering testing to use the CMA's guidance when designing their test plans.

jonasz commented 11 months ago

I'd like to follow up, and to keep the discussion alive. At RTB House, we think that knowing the schedule ahead of time, and the shape of the schedule itself, are among the most important factors in the success of the 3pc phase-out. We'd like to make the discussion a bit more concrete, and propose a potential next step after the 1% deprecation: increasing the fraction of cookieless users to 10%. If interested, please see some more thoughts on this here: https://blog.rtbhouse.com/privacy-sandbox-whats-next-after-1-cookie-deprecation/. Would be great to hear your thoughts on this!
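As one illustration of what "knowing the schedule ahead of time" could mean in practice, a published ramp plan could be as small as a list of cohort sizes and hold periods. The numbers below are invented for this sketch; they are not part of the RTB House proposal or any announced timeline:

```js
// Hypothetical published-in-advance ramp schedule (all values illustrative).
// Each step holds long enough to assess ecosystem impact before the next.
const rampSchedule = [
  { cookielessShare: 0.01, holdDays: 180 }, // today's 1% testing phase
  { cookielessShare: 0.10, holdDays: 120 }, // the proposed next step
  { cookielessShare: 0.50, holdDays: 90 },
  { cookielessShare: 1.00, holdDays: 0 },   // full deprecation
];
```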

JoelPM commented 11 months ago

I'd like to voice support for the idea of ramping to a 10% test following the 1% test. @jonasz does an excellent job of articulating the challenges of a 1% test and the benefits that a 10% test could have.

drpaulfarrow commented 10 months ago

I'd like to also publicly voice support for this 10% proposal. Totally agree. Having a 10% sample for an extended period of time would greatly enhance our ability to measure meaningful metrics, as @jonasz articulates very well.

lbdvt commented 9 months ago

Thanks for this proposal.

I'd like to support it with two additions:

drpaulfarrow commented 9 months ago

Totally agree with Lionel.

bmayd commented 9 months ago

My sense is there is general agreement, certainly among engineers and builders, that ramping cookie deprecation in steps, with pauses to assess and understand impacts, is prudent and the preferred approach; it gives us time to assess the impacts on AdTech use cases and, more importantly, impacts on the broader web ecosystem. There will likely be impacts we have not anticipated, because our focus has principally been on AdTech and arriving at a model that works; as far as I'm aware, we haven't given a lot of thought to how the new model scales, what knock-on effects scaling it may have, or what the impacts of scaling cookie deprecation will be more generally.

Although we can make some assumptions about the general impacts cookie deprecation will have, based upon what we've learned from browsers which currently limit cookies, like Safari, one of the key learnings from our experience with them is that a percentage of the web doesn't work without cookies, and in those cases the general solution has historically been to fall back to a browser that isn't cookie-restricted. Given that removal of cookies from Chrome represents the removal of the only reasonable fallback for most of us, we would do well to move at a pace that affords adequate opportunity for websites and their service providers to remedy discovered problems and minimizes the scale of adverse impacts.

It is also very likely that on the way from 1% to 100% we're going to hit tipping points where ecosystem dynamics change radically. If we move too quickly, we increase the risk of crossing multiple thresholds concurrently and introducing confounding chaos that will make regaining balance and equilibrium much more difficult than it would be if we crossed thresholds individually.

I strongly support the RTB House call for action on developing a phase-out plan and suggest making it a top priority so we can properly anticipate when changes will happen and plan accordingly. This is particularly important given that the current phase-out timeline posted on the Privacy Sandbox site, which I believe assumes only a single 60-day standstill period, suggests cookie deprecation will go from 1% to 100% between the beginning of September and the end of October. This leaves us extremely little time to fix problems ahead of the critical holiday advertising and shopping season, with its attendant Q4 code freezes, and even less time to address potential impacts on a number of significant elections happening around the globe in the latter part of this year.

This also assumes there is only a single standstill period; if the optional additional 60-day standstill is imposed, the currently expressed plan would push the beginning of deprecation to November, and the ramp-down would entirely overlap the most important portion of the Q4 season, at a time when most of the web goes to great lengths to minimize change and maintain stability.

I think it would be extremely helpful to all concerned to have a clear and relatively detailed idea of what we should expect the cookie landscape to look like through the latter half of this year, and what the criteria would be for deciding when, how fast, and by how much the ramp-up would proceed, both for the single-standstill scenario and for the case where the second standstill period is invoked.

remysaissy commented 9 months ago

Hello, I would like to note Teads' interest in a more progressive approach to the ramp-up as well.

lknik commented 4 months ago

@michaelkleber Considering the latest developments, a new approach may need to be devised. How about increasing the batch to 100% in a month? =)