WICG / turtledove

TURTLEDOVE
https://wicg.github.io/turtledove/

Trusted Server - caching / Trusted Daily Update #333

Open jonasz opened 2 years ago

jonasz commented 2 years ago

Hi,

At RTB House we are working on reducing the end-to-end FLEDGE auction latency. One of the elements we're investigating is the Trusted Buyer Server (TBS) round trip.

From our perspective, the TBS will be used to serve different types of data: some are latency-sensitive (like budgeting / campaign config), others less so (product availability).

The latter use case - product metadata - will also use up significantly more network bandwidth, which may contribute to greater latency. (Think of ~100 products per IG, times multiple IGs - this quickly adds up.)

In https://github.com/WICG/turtledove/issues/290 it was also reported that budgeting data is characterised by stricter latency requirements than other types of data stored in the TBS.

This observation may lead to certain optimization ideas:

a) Caching TBS responses on the device. (Perhaps allowing the TBS to determine which key/value pairs may be cached, and for how long.)

b) A more general mechanism: "Trusted Daily Update". If we had a periodically scheduled call to the TBS, this could be very useful for updating other parts of the IG, like adComponents and userBiddingSignals. (The current daily update is not feasible for this use case, as it is subject to a k-anonymity threshold.)
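To make option a) more concrete, here is a minimal sketch of a device-side cache in which the server chooses a time-to-live for each key. It is only an illustration: `TtlCache`, `lookup`, the `fetch` callback, and the convention that a TTL of 0 means "never cache" are all hypothetical names and assumptions, not part of any FLEDGE proposal or browser API.

```python
import time
from typing import Callable, Dict, Iterable, List, Optional, Tuple


class TtlCache:
    """Hypothetical on-device cache for trusted-server key/value pairs."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (value, expiry time in seconds on the clock's timeline)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, key: str, value: str, ttl_seconds: float) -> None:
        # Assumption: a TTL <= 0 means the server does not allow caching
        # (for example, latency-sensitive budgeting data).
        if ttl_seconds <= 0:
            self._entries.pop(key, None)
            return
        self._entries[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._entries[key]
            return None
        return value


def lookup(
    cache: TtlCache,
    keys: Iterable[str],
    fetch: Callable[[List[str]], Dict[str, Tuple[str, float]]],
) -> Dict[str, str]:
    """Return values for keys, calling fetch only for cache misses.

    fetch(missing_keys) stands in for the TBS round trip and is assumed
    to return {key: (value, ttl_seconds)}.
    """
    result: Dict[str, str] = {}
    missing: List[str] = []
    for key in keys:
        cached = cache.get(key)
        if cached is None:
            missing.append(key)
        else:
            result[key] = cached
    if missing:
        for key, (value, ttl) in fetch(missing).items():
            cache.put(key, value, ttl)
            result[key] = value
    return result
```

With a design along these lines, slowly changing product metadata could be served from the cache with a long TTL, while budgeting keys would still be fetched on every auction.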

Option b) is actually much more powerful than a), so it may deserve a separate issue. If we have b), however, a) may not be needed anymore.

We are working on a separate issue, describing our understanding of the latency bottlenecks of the FLEDGE auction, but I think it would make sense to discuss the TBS optimizations separately. I'm very curious how the general idea sounds to other participants.

Best regards, Jonasz

palenica commented 1 year ago

Hi Jonasz,

thank you for your comment (which was pointed out to me recently). I agree that a trusted daily update mechanism would be more powerful than the current FLEDGE proposal that requires k-anonymity for daily updates. I'd be open to further discussion on this topic.

If you don't mind, I'd like to ask you to elaborate: why do you believe (b) is more powerful than (a)? It seems you need similar data in either case; the only question is whether the data is sent to the client in real time or in a background sync. To me, this seems like a tradeoff between real-time latency and bandwidth (if you prefetch, you are likely prefetching a lot of data you ultimately won't need).
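One rough way to frame this tradeoff is a back-of-the-envelope model of daily download volume under each approach. The sketch below is illustrative only: the function names and every parameter (auctions per day, keys per auction, bytes per key, syncs per day) are assumptions introduced here, not figures from this thread.

```python
def realtime_bytes_per_day(
    auctions_per_day: int, keys_per_auction: int, bytes_per_key: int
) -> int:
    """Bytes downloaded if every auction fetches only the keys it needs."""
    return auctions_per_day * keys_per_auction * bytes_per_key


def background_sync_bytes_per_day(
    syncs_per_day: int, keys_stored: int, bytes_per_key: int
) -> int:
    """Bytes downloaded if every stored key is refreshed on each sync."""
    return syncs_per_day * keys_stored * bytes_per_key


def unused_fraction(keys_stored: int, distinct_keys_used: int) -> float:
    """Share of prefetched keys that no auction read before the next sync."""
    if keys_stored == 0:
        return 0.0
    return 1.0 - min(distinct_keys_used, keys_stored) / keys_stored
```

The model leaves out the other side of the tradeoff: real-time bytes are paid for on the auction's critical path, while background sync bytes are not.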

As an aside, would it be at all interesting to discuss whether the amount of data sent to the client could be reduced (e.g. if you had better real-time filtering capabilities in a trusted KV server, perhaps using targeting signals that are currently not available to KV servers)?