FLEDGE auctions e2e latency (on-device)

At RTB House, we are aware that higher ad latency in Fledge compared to the Classic RTB system could have a significant impact on final metrics and on the overall success of the transition to Fledge. This topic was previously discussed in these issues: LINK#215 and LINK#385.

Today we are able to compare Fledge and Classic end-to-end bidding latency, thanks to the disabling of third-party cookies for a portion of Chrome browsers (Mode B testing), the availability of event-level reporting, and the integration of our Fledge implementation with various SSPs. In this issue, we would like to share our internal findings.

Scope:

On-device only
Bidding and Auction Services are out of scope
We're interested in various environments, various devices and various network types

We define end-to-end bidding latency as the time elapsed from the bid (start of bid request processing) to the impression (start of ad rendering). This means that we compare:

Fledge: rendering request ts (server-side) - contextual bid request ts (server-side)
Classic: rendering request ts (server-side) - classic RTB bid request ts (server-side)
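
As an illustration of the definition above, here is a minimal sketch of how such a latency could be derived from two server-side logs. The `Event` record, the `auction_id` join key and the millisecond timestamps are assumptions made for this example; the issue does not describe how the two requests are actually matched.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    auction_id: str  # hypothetical join key, not specified in this issue
    ts_ms: int       # server-side timestamp in milliseconds


def bid_to_rendering_times(
    bid_requests: List[Event], rendering_requests: List[Event]
) -> Dict[str, int]:
    """Latency per auction: rendering request ts - bid request ts (in ms)."""
    bid_ts = {e.auction_id: e.ts_ms for e in bid_requests}
    latencies: Dict[str, int] = {}
    for render in rendering_requests:
        start = bid_ts.get(render.auction_id)
        if start is None:
            continue  # rendering request without a matching bid request
        latencies[render.auction_id] = render.ts_ms - start
    return latencies


# Auction "a2" never reaches rendering, so it is not an impression.
bids = [Event("a1", 1_000), Event("a2", 2_000), Event("a3", 3_000)]
renders = [Event("a1", 2_500), Event("a3", 3_200)]
latencies = bid_to_rendering_times(bids, renders)
print(latencies)  # {'a1': 1500, 'a3': 200}
```

The same function would apply to both cases; only the source of the bid request timestamp differs (contextual bid request for Fledge, classic RTB bid request for Classic).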

We conduct measurements on the following segment of Internet traffic:

Fledge: treatment traffic with third-party cookies disabled - which means bidding for users with treatment_1.X (X=1,2,3) labels from Mode B
Classic: legacy traffic with third-party cookies enabled - which means bidding for users without Mode A and Mode B labels

Each of the following diagrams consists of:

A histogram showing the distribution of latencies
A table of latency percentiles (p50, p80, p90)
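
For readers who want to produce comparable summaries from their own data, the sketch below builds a histogram and a p50/p80/p90 table from a list of latencies. The nearest-rank percentile method and the 500 ms bucket width are assumptions; the issue does not state which percentile method or binning was used for the diagrams.

```python
from collections import Counter
from typing import Dict, List, Sequence


def percentile(sorted_values: Sequence[int], p: int) -> int:
    """Nearest-rank percentile: smallest value with at least p% of samples <= it."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, (p * len(sorted_values) + 99) // 100)  # integer ceil(p * n / 100)
    return sorted_values[rank - 1]


def percentile_table(latencies_ms: List[int], ps=(50, 80, 90)) -> Dict[str, int]:
    """Build a {'p50': ..., 'p80': ..., 'p90': ...} table in milliseconds."""
    ordered = sorted(latencies_ms)
    return {f"p{p}": percentile(ordered, p) for p in ps}


def histogram(latencies_ms: List[int], bucket_ms: int = 500) -> Dict[int, int]:
    """Count samples per bucket, keyed by the bucket's lower bound in ms."""
    counts = Counter((v // bucket_ms) * bucket_ms for v in latencies_ms)
    return dict(sorted(counts.items()))


sample = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
table = percentile_table(sample)
buckets = histogram(sample)
print(table)    # {'p50': 500, 'p80': 800, 'p90': 900}
print(buckets)  # {0: 4, 500: 5, 1000: 1}
```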

Comparison between Fledge (treatment traffic) and Classic (legacy traffic)

The dataset used to generate the diagram includes impressions (from both Fledge and Classic), meaning winning auctions that resulted in ad rendering, for various SSPs and ad slots.

all-ssps

.	fledge_imps.bid_to_rendering_time	legacy_imps.bid_to_rendering_time
p50	2999 ms	993 ms
p80	5453 ms	2254 ms
p90	6630 ms	4140 ms

Comparison between Fledge and Classic: split by device type

The same dataset, segmented by device type (PC and PHONE).

all-ssps-pc

.	fledge_imps.bid_to_rendering_time	legacy_imps.bid_to_rendering_time
p50	1919 ms	688 ms
p80	4721 ms	1683 ms
p90	5937 ms	3375 ms

all-ssps-phone

.	fledge_imps.bid_to_rendering_time	legacy_imps.bid_to_rendering_time
p50	3833 ms	1053 ms
p80	5745 ms	2336 ms
p90	6954 ms	4224 ms

Comparison between Fledge-over-RTB and Prebid-over-RTB

In this part, we use results from our own tests. For Fledge-over-RTB and Prebid-over-RTB, we buy impressions via direct integration or Classic RTB on real publishers' pages. Then, depending on the test scenario, we perform one of two actions:

run our own Fledge auction (Fledge-over-RTB)
run a Classic auction using Prebid (Prebid-over-RTB)

These tests allow us to compare the latency of Fledge (Fledge-over-RTB) and Classic (Prebid-over-RTB) impressions independently of Fledge integration with SSPs and other buyers, as no other buyers participate in the auction in either scenario.

fledge-over-rtb

.	fledge_over_rtb.bid_to_rendering_time	prebid_over_rtb.bid_to_rendering_time
p50	1562 ms	189 ms
p80	2961 ms	336 ms
p90	4499 ms	531 ms

To eliminate the possibility that our implementation of bid logic in Fledge is the source of the problem, we repeated the Fledge-over-RTB experiment, this time completely removing the part responsible for model evaluation, both server-side (in the processing of contextual and trusted bidding signals (TBS) requests) and client-side (in the bidding function). In our case, this reduces the size of contextual and TBS responses by more than half and cuts server-side and client-side computation by more than half. Even after this intervention, latency did not decrease at any of the measured percentiles.

dummy-fledge-over-rtb

.	fledge_over_rtb_dummy.bid_to_rendering_time	fledge_over_rtb_regular.bid_to_rendering_time
p50	1576 ms	1562 ms
p80	3056 ms	2961 ms
p90	4928 ms	4499 ms

Conclusions

To sum up our results: in Classic, 50% of auctions complete in less than 1 second, while in Fledge the median for Mode B users is about 3 seconds, roughly three times longer. The situation is even worse on mobile devices: when we restrict the comparison to phones, the median latency gap grows from about 3x to about 3.6x.

Results from our internal tests comparing Fledge-over-RTB and Prebid-over-RTB point to overhead in the Fledge stack itself. In a very simple setup, where we are the only buyer in the Classic system, 50% of auctions take less than 200 ms. In Fledge with the same setup, auctions take about 8 times longer (p50 of 1562 ms versus 189 ms). Additionally, two observations suggest that neither processing nor fetching the bidding function is the bottleneck in this case: latency did not decrease after we significantly reduced computation and the size of contextual and TBS responses, and, as we verified, our bidding function is fetched from the cache in 95% of auctions.
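
The slowdown factors discussed in these conclusions can be recomputed directly from the percentile tables in this issue. The sketch below copies the published values and divides the Fledge latency by the corresponding baseline; the dictionary layout and the `slowdown` helper are illustrative only.

```python
from typing import Dict, Tuple

# (fledge_ms, baseline_ms) per percentile, copied from the tables in this issue.
# The baseline is legacy_imps for the all-ssps tables and prebid_over_rtb
# for the fledge-over-rtb table.
TABLES: Dict[str, Dict[str, Tuple[int, int]]] = {
    "all-ssps": {"p50": (2999, 993), "p80": (5453, 2254), "p90": (6630, 4140)},
    "all-ssps-pc": {"p50": (1919, 688), "p80": (4721, 1683), "p90": (5937, 3375)},
    "all-ssps-phone": {"p50": (3833, 1053), "p80": (5745, 2336), "p90": (6954, 4224)},
    "fledge-over-rtb": {"p50": (1562, 189), "p80": (2961, 336), "p90": (4499, 531)},
}


def slowdown(table: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Ratio of Fledge latency to baseline latency for each percentile."""
    return {p: round(fledge / base, 2) for p, (fledge, base) in table.items()}


ratios = {name: slowdown(table) for name, table in TABLES.items()}
for name, per_percentile in ratios.items():
    print(name, per_percentile)
```

At the median this gives about 3.0x for all traffic, 2.8x for PC, 3.6x for phones, and 8.3x for the single-buyer Fledge-over-RTB test. The relative gap narrows at p90 in the Mode B comparison, where Classic latency also has a long tail.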

At RTB House, we still believe that migrating to the Protected Audience API is feasible without losing retargeting potential. The remaining concern is the current on-device implementation, where the resources dedicated to Fledge auctions are limited and shared by multiple buyers and SSPs, which significantly impacts e2e latency. It is important to resolve such fundamental concerns before removing support for third-party cookies. Additionally, we believe that transitioning early to Bidding and Auction Services could be a solution, although we have not yet been able to perform similar measurements due to insufficient traffic.

barteklos / turtledove