ampproject / amphtml

The AMP web component framework.
https://amp.dev
Apache License 2.0

I2I: Make RTC’s timeout mechanism more flexible #38434

Open zshnr opened 2 years ago

zshnr commented 2 years ago

Summary

Permutive is an Audience Platform for publishers and advertisers. We have a strong focus on privacy and first-party data: enabling publishers to use their data to power advertising and personalization in real-time, across every device, browser and environment, while respecting user privacy.

One of the essential capabilities we provide publishers is the ability to dynamically target their ads, based on a user's behaviour.

Our previous proposals for amp-script, amp-ad, and amp-analytics have allowed publishers to personalise the user experience on AMP as they would on regular websites.

This proposal builds upon our previous proposal for amp-ad, #33581, which was implemented in this PR #33872, and puts forward a use case for users of RTC config to have more granular control over timeout as well as RTC config having better support for amp-script URIs.

Design Document

  1. We will need to introduce a polling function with a timeout that waits until the amp-script in question has exposed the function that RTC needs to call to get targeting data.
  2. Changing the timeout behaviour might be as simple as respecting the value that has been passed in, as opposed to enforcing a maximum of 1000ms.
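
As a rough illustration of the second point, the sketch below contrasts clamping a requested timeout to the current 1000ms ceiling with honouring a larger, configurable ceiling. The names `CURRENT_MAX_MS` and `resolveTimeout` are hypothetical and are not taken from the amphtml codebase; only the 1000ms figure comes from this proposal.

```typescript
// Hypothetical sketch: these names are illustrative, not AMP's real RTC code.
// The 1000ms ceiling is the maximum described in this proposal.
const CURRENT_MAX_MS = 1000;

/**
 * Returns the timeout to use for an RTC call.
 * - A missing or invalid request falls back to the current ceiling.
 * - A valid request is honoured up to `maxMs`.
 * Passing `Infinity` as `maxMs` leaves the limit entirely to the publisher.
 */
function resolveTimeout(
  requestedMs?: number,
  maxMs: number = CURRENT_MAX_MS
): number {
  if (
    requestedMs === undefined ||
    !Number.isFinite(requestedMs) ||
    requestedMs < 0
  ) {
    return Math.min(CURRENT_MAX_MS, maxMs);
  }
  return Math.min(requestedMs, maxMs);
}
```

With the default ceiling, `resolveTimeout(1500)` is clamped to 1000, whereas `resolveTimeout(1500, 3000)` respects the requested 1500.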

Motivation

The Permutive SDK enables publishers to target ads in real-time without having to send data back to our servers for processing.

To facilitate the way our SDK works and is deployed for publishers, we worked with the AMP team to bring a new mode for amp-script called sandboxed mode, which was implemented here #33643.

This allows our SDK to run client-side on AMP, segmenting users in real-time without their personal data ever leaving their device.

One key feature of the changes we worked on with the AMP team, outlined in the issue mentioned above, was the ability for amp-ad components to fetch targeting information from a function exposed by an amp-script script. For our SDK, this meant that we can take advantage of our real-time segmenting technology to pass on up-to-date segment data to amp-ad that are all evaluated on the client.

However, since trialling this with a few publishers, we have run into performance issues.

Because our SDK is served via our CDN, the time needed to fetch it, combined with the time it takes to bootstrap a sandboxed amp-script component, means that amp-ad RTC calls are often made before our SDK is ready to provide information to external services and components.

This is especially troublesome on low-powered devices and low-bandwidth connections, where our SDK is usually too late to pass on targeting information to amp-ad.

Our proposed solution to this is two-fold:

  1. As with HTTP requests to fetch targeting data, the timer for amp-script URIs should start once the call to the exported function is made.
    1. This would inevitably involve having a polling function that keeps on checking until the amp-script component in question is ready and has exported the required function.
    2. Once the function is ready, RTC will call that function and only then start the timeout timer while it waits for a response.
  2. A more customisable RTC timeout. Currently, RTC timeout has a maximum value of 1000ms. We appreciate that this is to keep amp-ads as performant as possible and is a reasonable maximum timeout for HTTP RTC calls.
    1. We propose making this more customisable, either by increasing the maximum limit or by leaving the maximum limit open for publishers to decide.
    2. We propose allowing the publisher to make the trade-off between better targeting performance and a longer maximum delay to triggering the ad request in this case.
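
The first part of the proposal could be sketched roughly as follows: poll until the amp-script has exported the targeting function, and start the RTC timeout only once that function is actually called. All names here (`TargetingFn`, `waitForExport`, `withTimeout`, `fetchTargeting`) are hypothetical and do not correspond to existing AMP APIs; the `lookup` callback stands in for however RTC would locate the exported function.

```typescript
// Hypothetical sketch: none of these names are part of AMP's real API.
type TargetingFn = () => Promise<Record<string, unknown>>;

/** Polls `lookup` every `intervalMs` until it returns a function or `maxWaitMs` elapses. */
function waitForExport(
  lookup: () => TargetingFn | undefined,
  intervalMs: number,
  maxWaitMs: number
): Promise<TargetingFn | undefined> {
  return new Promise((resolve) => {
    const start = Date.now();
    const check = (): void => {
      const fn = lookup();
      if (fn || Date.now() - start >= maxWaitMs) {
        resolve(fn); // undefined if the export never appeared in time
        return;
      }
      setTimeout(check, intervalMs);
    };
    check();
  });
}

/** Rejects if `promise` does not settle within `timeoutMs`. */
function withTimeout<T>(promise: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error('RTC timeout')), timeoutMs);
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      }
    );
  });
}

/** Waits for the export, then starts the RTC timer only when the call is made. */
async function fetchTargeting(
  lookup: () => TargetingFn | undefined,
  intervalMs: number,
  maxWaitMs: number,
  rtcTimeoutMs: number
): Promise<Record<string, unknown> | undefined> {
  const fn = await waitForExport(lookup, intervalMs, maxWaitMs);
  if (!fn) {
    return undefined; // the script was never ready; proceed without targeting
  }
  return withTimeout(fn(), rtcTimeoutMs);
}
```

In this sketch the polling window (`maxWaitMs`) and the response timeout (`rtcTimeoutMs`) are separate budgets, which is the trade-off the second part of the proposal would leave to the publisher.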

Alternative Solutions

  1. Changing the localStorage.getItem proxy in amp-script to become an async function that fetches data from the actual LocalStorage. It will have to be async because it will be crossing thread boundaries in JS (from worker to iframe to the main page and then back).
    1. How this helps: Any amp-script can now get whatever is the latest data for a given key from localStorage.
      1. We can keep our inline script as is.
      2. Customers will not need to change anything in their deployment.
    2. Caveats:
      1. No longer has the same signature as the browser’s getItem implementation
      2. It might take a while for data to cross boundaries (worker + iframe + page)
  2. amp-ad gets direct access to LocalStorage to get data directly
    1. Takes away the middleman
    2. Can always query the latest values from LS
    3. Won’t need to play with waiting times
    4. Will need a mechanism to format data object according to what amp-ad expects

Launch Tracker

No response

Notifications

/cc @ampproject/wg-performance /cc @ampproject/wg-ads-reviewers

newmuis commented 2 years ago

cc @ampproject/wg-monetization @ampproject/wg-ads-reviewers

zshnr commented 2 years ago

After further discussions within our engineering team we have made some significant changes to this I2I, essentially changing what we're proposing but for the same purpose.

Looking forward to discussing this in two days' time :) Also happy to answer any queries before then.

I am available on AMP's Slack as well for any questions :)

Thanks!

newmuis commented 2 years ago

This was discussed in today's design review. Broadly, we discussed two solutions:

Solutions discussed

1. Extend the timeout

As proposed above, we can either simply increase the timeout for amp-script or we can have amp-script send a signal when it's ready and only start the existing 1000ms timeout once amp-script is ready.

Caveats:

Due to these caveats, we'd need to make these changes behind a document-level opt-in or origin trial, to be able to measure the effects of the changes and ensure that they do not cause ecosystem-wide regressions on the above metrics.

We would need to consider how we might measure the performance and revenue metrics during the implementation phase, as we would need to:

  1. Partner with publishers and/or ad networks for the revenue data (which they may or may not be willing to provide), and
  2. Have the ability to slice revenue and performance metrics by experiment, which is not currently possible.

This does also widen the API surface of amp-script overall, beyond this particular use case. We would need to evaluate what other ways the extended timeout could be used, misused, or abused that might need to be maintained longer-term.

2. Implement a 3p ad network

If this content is served in a third-party frame, the Permutive SDK is free to use whatever technologies it likes to implement functionality inside the frame.

Caveats:

Resources:

Conclusions

Solution 1 is the closest to what was originally proposed, but we cannot say definitively that it can launch. If we were to prioritize and implement this work (which is not a foregone conclusion), we would still need an extended period of design, implementation, and experimentation, after which (depending on the metrics) we still may not be able to launch.

Solution 2 is supported with infrastructure that AMP has today, so it can be implemented at any time by Permutive. That said, it does come with the potential effects on publisher revenue caused by the third-party iframe.

Action Items

zshnr commented 1 year ago

Thanks again for your time and that of your team @newmuis.

Sorry for the radio silence, I have been on holiday and have just caught up with things after returning.

We are prioritising this work internally and will let you know soon which solution we want to explore.

Thanks!

zshnr commented 1 year ago

Hi @newmuis,

I was looking into scoping out work for implementing Permutive as an ad vendor. One question I have is:

If there are multiple <amp-ad type="permutive"> elements on the page loading ads, will each of them have their own iframe context?

If so, it raises a huge problem for us because our library will essentially be loaded individually with each ad whereas it is designed to be loaded once for the whole page.

Am I correct in assuming this will be the case? Is there a mechanism to share any resources between the amp-ad elements?

Thanks!