WICG / turtledove

TURTLEDOVE
https://wicg.github.io/turtledove/

Preventing multiple automatic beacons from being sent out #743

Open blu25 opened 1 year ago

blu25 commented 1 year ago

An ad frame can contain multiple links, only a subset of which should trigger an automatic beacon. Right now, automatic beacons don't distinguish between what was clicked: if a navigation originates in the frame, a beacon is sent.

This is problematic for ads that have a "why this ad?" feature. Clicks on that link should not be counted as reportable clicks, but right now they are.

blu25 commented 1 year ago

Our current proposal to solve this is to introduce a new parameter to window.fence.setReportEventDataForAutomaticBeacons() called once. If set to true, the automatic beacon data is cleared after the next top-level navigation that results in a beacon being sent.

This can be used in a click handler for a link, so that the beacon is effectively sent only for the navigation that results from that click.
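
A minimal sketch of how the proposed once flag might be used from a click handler. The once parameter is the one proposed above, not a confirmed part of the API at this point in the thread; the helper name armBeaconForClick, the element id, the payload, and the destination list are all hypothetical. The fence object is passed in as an argument so the helper can be exercised outside a fenced frame.

```javascript
// Hypothetical helper: arm the automatic beacon only when `link` is clicked.
// `once: true` is the parameter proposed in this thread; the eventData and
// destination values are placeholders.
function armBeaconForClick(link, fence, eventData) {
  link.addEventListener('click', () => {
    fence.setReportEventDataForAutomaticBeacons({
      eventType: 'reserved.top_navigation',
      eventData: eventData,
      destination: ['buyer'],
      once: true, // clear the data after the next beacon-producing navigation
    });
  });
}

// In an ad frame this might be wired up as:
// armBeaconForClick(document.getElementById('ad_link'), window.fence, 'click');
```

Because the data is set only inside the handler and cleared after one beacon, a later click on an unrelated link in the same frame would not find any beacon data to send.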

keweigegoog commented 1 year ago

The once proposal requires the setReportEventDataForAutomaticBeacons() API to be called in a click handler. However, this seems to be a problem for third-party ad serving (3PAS) ads, since we don't have control over their click handlers.

We propose to add another new parameter to setReportEventDataForAutomaticBeacons(), called domains_allowed. The parameter specifies an allowlist of domains: automatic beacons are sent out only when their registered URLs are from a domain on the allowlist.

The new parameter can work with or without the once parameter. If once is set to true, the automatic beacon will still be sent only once.

blu25 commented 1 year ago

I have a few clarifying questions:

When the list is non-empty, only URLs with allowed domains registered in registerAdBeacon({'reserved.top_navigation': 'https://adtech.example/click?buyer_event_id=123', ...}) will be sent out when the next top-level navigation happens.

Can you clarify what the allowlist is for? Are you proposing that the domain allowlist be for the recipients of automatic beacons? Or that they be for the URLs that a top-level navigation is navigating to?

Do you anticipate that the pages you embed in your 3PAS ads will want to call setReportEventDataForAutomaticBeacons()? If they do that in addition to your setReportEventDataForAutomaticBeacons() call, whichever of the calls comes second will overwrite the data set in the first call.

What exactly is the setup for the 3PAS ad? When an ad auction results in a 3PAS ad being displayed, what URL does the config object map to? Is the 3PAS ad itself cross-origin to the frame that loads the ad? Is it its own subframe? I'm asking this because we have a restriction that reserved.top_navigation beacons can only be sent if the frame that initiates navigation is same-origin with FencedFrameConfig's mapped URL. If the navigation initiates from a subframe that's cross origin to the mapped URL, no beacon will be sent.

keweigegoog commented 1 year ago

Can you clarify what the allowlist is for? Are you proposing that the domain allowlist be for the recipients of automatic beacons? Or that they be for the URLs that a top-level navigation is navigating to?

The allowlist was proposed to restrict domains on the URLs that a top level navigation is navigating to.

On second thought, we realized that an allowlist cannot fully solve our problem because we do not always know the redirect URLs for 3PAS ads. We propose to add a reverse list domains_denied param that skips sending out automatic beacons when the URL of the top-level navigation contains any domain in the deny list.
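
To make the intended deny-list behavior concrete, here is a small model of the check, written as a plain function. It is only a sketch of the proposal: domains_denied is not an existing parameter, the function name is hypothetical, and the matching rule (exact host or any subdomain) is an assumption, since the proposal does not define how domains are compared.

```javascript
// Hypothetical model of the proposed domains_denied check. The matching rule
// (exact host or any subdomain) is an assumption; the thread does not define it.
function shouldSendAutomaticBeacon(navigationUrl, domainsDenied) {
  const host = new URL(navigationUrl).hostname;
  const denied = domainsDenied.some(
    (domain) => host === domain || host.endsWith('.' + domain));
  return !denied;
}

// Example: a "why this ad?" link hosted on a denied domain sends no beacon.
const denyList = ['adinfo.example'];
console.log(shouldSendAutomaticBeacon('https://adinfo.example/why', denyList));   // false
console.log(shouldSendAutomaticBeacon('https://shop.example/product', denyList)); // true
```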

Do you anticipate that the pages you embed in your 3PAS ads will want to call setReportEventDataForAutomaticBeacons()? If they do that in addition to your setReportEventDataForAutomaticBeacons() call, whichever of the calls comes second will overwrite the data set in the first call.

No, we do not anticipate that third-party ad techs will need to call setReportEventDataForAutomaticBeacons() in their scripts, and we are aware of the data-overwrite issue.

What exactly is the setup for the 3PAS ad? When an ad auction results in a 3PAS ad being displayed, what URL does the config object map to? Is the 3PAS ad itself cross-origin to the frame that loads the ad? Is it its own subframe?

The 3PAS ad would be rendered in the same frame, i.e., cross-origin to the frame that loads the ad.

blu25 commented 1 year ago

We propose to add a reverse list domains_denied param...

We have a few concerns with this approach. If we add domains_denied, we would also want to add domains_allowed, but that feature will most likely end up being unused. We're also worried about adding clutter to the FenceEvent interface for corner cases; we want to keep the interface as minimal as possible.

The 3PAS ad would be rendered in the same frame, i.e., cross-origin to the frame that loads the ad.

When you say they're rendered in the same frame, do the contents of the 3PAS ad still live side-by-side in the same frame as your code that calls window.fence.setReport...()? Or does your code that calls window.fence.setReport...() also embed the 3PAS ad in a new child iframe?

Also, where does the "why this ad" button live in relation to your code? How much control do you have over that button? Does it live in the same frame/origin as your code? That will determine what solutions, if any, we can come up with.

If everything lives in the same frame like this:

Main frame that runs ad auction (a.com)
└── Ad loaded with a FencedFrameConfig that calls setReport..() (b.com)
    + Contents of 3PAS ad (b.com)
    + Contents of "why this ad" button (b.com)

then we might be able to append a new click handler for the "why this ad" button (targeting the id of the link) that clears out previously set automatic beacon data. Then, after a predetermined amount of time, have it set the data again so that a click on the ad itself will result in a beacon being sent.

If the "why this ad" button and the ad contents live in different frames like this:

Main frame that runs ad auction (a.com)
└── Ad loaded with a FencedFrameConfig that calls setReport..() (b.com)
    ├── 3PAS ad in an iframe (c.com)
    └── "why this ad" button in an iframe (d.com)

then automatic beacons won't send on navigation. The frame that originates the navigation (c.com) is cross-origin to the initial rendered ad document (b.com), so it won't be allowed to send an automatic beacon.
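
The same-origin restriction behind this can be modeled with a short origin comparison. This is only an illustration of the rule as described in this thread, not browser code; the function name and the page paths are hypothetical.

```javascript
// Sketch of the restriction described above: a reserved.top_navigation beacon
// is only sent when the navigating frame is same-origin with the URL the
// FencedFrameConfig maps to. This models the rule; it is not browser code.
function canSendAutomaticBeacon(initiatorFrameUrl, mappedUrl) {
  return new URL(initiatorFrameUrl).origin === new URL(mappedUrl).origin;
}

const mappedUrl = 'https://b.com/ad.html';
console.log(canSendAutomaticBeacon('https://b.com/ad.html', mappedUrl));   // true
console.log(canSendAutomaticBeacon('https://c.com/3pas.html', mappedUrl)); // false
```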

If your setup is like the former tree, could the solution mentioned there work for what your team needs?

shivanigithub commented 1 year ago

Main frame that runs ad auction (a.com)
└── Ad loaded with a FencedFrameConfig that calls setReport..() (b.com)
    + Contents of 3PAS ad (b.com)
    + Contents of "why this ad" button (b.com)

then we might be able to append a new click handler for the "why this ad" button (targeting the id of the link) that clears out previously set automatic beacon data. Then, after a predetermined amount of time, have it set the data again so that a click on the ad itself will result in a beacon being sent.

Liam: to clarify, this doesn't need to be a change in API, right? Do you mean the click handler code for "Why this ad" could invoke setReportEventDataForAutomaticBeacons() with an empty destination? Also, the part about resetting it after a predetermined period of time will be indeterministic, so that might not work.

blu25 commented 1 year ago

to clarify, this doesn't need to be a change in API, right?

Correct. This can be done with the current API shape.

the part about resetting it after a predetermined period of time will be indeterministic

As an alternative, you might be able to add a click listener for the whole document, and then based on what was clicked choose to set the automatic beacon data or not.

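// Decide per click whether the upcoming navigation should produce a beacon.
document.addEventListener('click', function(event) {
  if (event.target.id === 'whythisad_button') {
    // Empty destination list, so no automatic beacon goes out for this click.
    window.fence.setReportEventDataForAutomaticBeacons({
      eventType: 'reserved.top_navigation', eventData: '', destination: []});
  } else {
    // Placeholder payload and destination; substitute the real values.
    window.fence.setReportEventDataForAutomaticBeacons({
      eventType: 'reserved.top_navigation', eventData: 'click', destination: ['buyer']});
  }
});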