WICG / turtledove

TURTLEDOVE
https://wicg.github.io/turtledove/

Addition of an analytics/reporting entity to enable centralised reporting #1115

Open warrrrren opened 2 months ago

warrrrren commented 2 months ago

Context

Publishers today can rely on their SSP/analytics partners to provide detailed reporting on several metrics, along with analytical tooling, beyond the reporting provided by their ad server. For example, Prebid supports analytics adapters that receive events notifying them of bid initiation, bidder adapter bids, timeouts, etc., allowing for data analysis in the analytics vendor's interface. The current PA-API implementation doesn't allow more than one entity (the top-level seller) to view, and thus report on, all the component and top-level auctions.

Proposal

The PA-API adds an analyticsURL key to the auction config whose value is a URL to a hosted JavaScript file. The hosted file can contain a function that is called whenever reportResult() is called for a participating component auction or for the top-level auction.

This function's input schema and restrictions can be similar to the existing reportResult().
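
A minimal sketch of how this could look, under stated assumptions. Only the analyticsURL key comes from the proposal above; the function name reportAnalytics, the subset of browserSignals fields, and all example.* hosts are hypothetical and are not part of the shipped API. The function returns the report URL instead of sending it, so the sketch stays runnable outside a browser.

```typescript
// Sketch only: `analyticsURL` is the key proposed in this issue, not a shipped field.
interface ComponentAuctionConfig {
  seller: string;
  decisionLogicURL: string;
  interestGroupBuyers: string[];
}

interface TopLevelAuctionConfig {
  seller: string;
  decisionLogicURL: string;
  componentAuctions: ComponentAuctionConfig[];
  analyticsURL?: string; // proposed key (assumption)
}

// Returns a copy of the config with the proposed key set; rejects non-HTTPS URLs.
function withAnalytics(
  config: TopLevelAuctionConfig,
  analyticsURL: string
): TopLevelAuctionConfig {
  if (new URL(analyticsURL).protocol !== "https:") {
    throw new Error("analyticsURL must be an https URL");
  }
  return { ...config, analyticsURL };
}

// Hypothetical subset of signals, assumed to mirror what reportResult() receives.
interface AnalyticsBrowserSignals {
  topWindowHostname: string;
  bid: number;
}

// Hypothetical function the hosted analytics file might define.
// It builds the report URL that the analytics vendor would receive.
function reportAnalytics(
  auctionConfig: { seller: string },
  browserSignals: AnalyticsBrowserSignals
): string {
  const url = new URL("https://analytics.example/report");
  url.searchParams.set("seller", auctionConfig.seller);
  url.searchParams.set("host", browserSignals.topWindowHostname);
  url.searchParams.set("bid", browserSignals.bid.toFixed(2));
  return url.toString();
}
```

In this sketch the publisher would pass the result of withAnalytics() to the auction, and the browser would invoke reportAnalytics() alongside each reportResult() call.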

warrrrren commented 2 months ago

Reference to the relevant section of the Chrome response to the IAB fit gap analysis: https://docs.google.com/document/d/10608Tp57alonCiBN9D2-0UfV_C6FFLsV5nh6sBaA5rA/preview#heading=h.uf1ddn79iizl

myonlinematters commented 2 months ago

Specific item on page is "Publisher Revenue Accrual and Impression Validation"

MattMenke2 commented 2 months ago

If we do this, we may need to introduce some way for auction participants to indicate they are OK sharing information with the publisher. For the bidder/seller relation, participating in the auction is considered implicit permission to share the bid, but this change would share information that was previously private in Protected Audience with a third party, which very much changes who can view it. We'd also need to extend the aggregate reporting to include the publisher.

I think we'd only want to call this new putative method once during the reporting phase, not once per reportResult() call? Also note that reportResult() is only called on the component seller with the winning bid (and on the top-level seller), not for each of the other component sellers.

michaelkleber commented 2 months ago

Hi Warren, could you please add your name and affiliation to your GitHub profile?

I suspect we need a little more work to define what this reporting entity is trying to do and what information it should get. As Matt says, only the winning component auction gets to run reportResult() now. What kind of reporting goals are you looking for in multi-seller auctions? What about cases where the whole PA auction does not produce any winner at all? And crucially: Are some of these goals better met with more data but only aggregate reporting?

For the permissioning issue that Matt brings up, it seems to me that it's sufficient for the seller to be able to see the auction config (including any new reporting endpoint) in scoreAd. If the seller doesn't agree to disclose information to the publisher's choice of reporting endpoint, they can of course decline to have any winner from the auction. But I don't think the seller should be able to say "I will run this auction but refuse this reporting"; the publisher should be in the position of making it a take-it-or-leave-it kind of deal.

(We would need to decide what happens if the reporting endpoint fails enrollment and attestation: does that make the whole auction fail, or does the auction happen without the 3p reporting?)

I'm worried about a different permissioning question, though: does the presence of a reporting endpoint in the auction config constitute a good enough proof that that really is the publisher's intent? Warren is asking for "an analytics/reporting entity", so not necessarily something going to the domain of the publisher page itself. I don't think we want anyone who gets a chance to run JS on the publisher page to be able to install themselves as if they were a publisher-blessed reporting endpoint. Seems like we should ask for something like a .well-known file on the publisher domain that explicitly lists blessed reporting endpoints.
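
To make the .well-known idea concrete, here is a rough sketch of the check it implies. The file path, the JSON shape (a reportingOrigins array), and the function names are all assumptions for illustration; nothing in this thread or in the spec defines them.

```typescript
// Hypothetical shape of a publisher-hosted file listing blessed reporting origins.
interface BlessedReportingFile {
  reportingOrigins: string[];
}

// Hypothetical location of that file on the publisher's own origin.
function wellKnownURLFor(publisherOrigin: string): string {
  return new URL("/.well-known/auction-reporting.json", publisherOrigin).toString();
}

// True only if the reporting endpoint's origin is explicitly listed by the publisher.
function isBlessedEndpoint(file: BlessedReportingFile, analyticsURL: string): boolean {
  const origin = new URL(analyticsURL).origin;
  return file.reportingOrigins.some((listed) => new URL(listed).origin === origin);
}
```

With a check like this, a script that merely runs on the publisher page could put any endpoint into the auction config, but the endpoint would be ignored unless the publisher domain itself lists that origin.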

MattMenke2 commented 2 months ago

I'm skeptical that adding the information to the auctionConfig passed to the seller is enough here. I think it's reasonable to require that only the seller provide consent (the seller already has that information, so can decide whether or not it's OK to pass it on), but we're basically passing information solely available to scripts with their own memory spaces to another origin in a way that may come as a surprise to sellers. While I defer to more security-minded folks on whether we need explicit opt-in here, I think we do (just an extra "yes, it's ok to send info to this publisher origin" field would be enough, and we would ignore the bid otherwise).
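
One possible reading of that opt-in field, sketched under assumptions: the field name allowReportingTo and the bid shape are hypothetical, and this models only the "ignore the bid otherwise" rule, not any real browser behavior.

```typescript
// Hypothetical bid shape carrying an explicit opt-in list of reporting origins.
interface BidWithConsent {
  bid: number;
  allowReportingTo?: string[]; // assumed field name, not part of any spec
}

// Keeps only bids that explicitly opted in to sharing with the given origin;
// bids without the field, or without that origin listed, are ignored.
function bidsEligibleForReporting(
  bids: BidWithConsent[],
  reportingOrigin: string
): BidWithConsent[] {
  return bids.filter((b) =>
    (b.allowReportingTo ?? []).some((origin) => origin === reportingOrigin)
  );
}
```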

gpolaert commented 2 months ago

Publisher API

part 1: Publisher Monitoring API

I would like to share some ideas and thoughts on the current issue. I have attempted to make my writing brief, concise, and precise. My goal is to contribute meaningfully to the discussion. Unfortunately, it's a bit long.

TL;DR

This issue discusses the Protected Audience API (PAAPI), focusing on the needs of publishers and offering a solution for better transparency and control. The problem statements cover a lack of transparency, of real-time detection, of context/data, and of control. The proposed solution is a two-part API: a Publisher Monitoring API for collecting and analyzing performance data while ensuring user privacy, and a Publisher Configuration API for setting rules related to monetization. The aim is to align with the Privacy Sandbox's objectives and set higher industry standards based on transparency and trust.

Overview

The Protected Audience API (PAAPI) is designed around two primary concepts: buyers and sellers. The term "seller" can represent various stakeholders or interests, such as an SSP, an ad server when acting as a top seller, or a property owner (i.e., the publisher).

The points that follow are focused solely on the publisher's needs and aim to provide additional context to the current GitHub issue.

⚠️ This issue doesn't focus on browser-side troubleshooting, nor is it a monitoring API for sellers or buyers, or an event-based JS API like Googletag's SlotRenderedEvent.

Problem statement

(1) Lack of transparency. There's no clear source-of-truth in the PAAPI that allows publishers to independently verify and control that everything functions as expected. The most common use cases include:

(2) Lack of real-time detection. For property owners, swiftly detecting issues and bugs in their current setup is crucial to avoid significant revenue losses. The most common use cases include:

(3) Lack of context/data. From a publisher's perspective, audiences are their most valuable assets — they are what publishers sell to the buy side.

(4) Lack of control. The publisher has responsibility for the content, user experience, and ads displayed to the users. Concurrently, there are growing CSR initiatives to limit and optimize the supply path. Publishers need appropriate controls and tools to ensure everything functions as intended (such as type of ads, allowed brand domains, floor price, etc.).

Additional context

(1) Sellers ≠ Publishers

It's important to understand that publishers, buyers, and sellers (including top sellers) have different business interests. Buyers aim to purchase the right inventory or audience at the lowest price. In contrast, sellers and publishers strive to sell their inventory at the highest price. Sellers compete with each other to capture a larger portion of the inventory and have no incentive to optimize the entire publisher ad stack because it doesn't align with their interests.

While publishers are worried about the impact of advertising on their page, both buyers and sellers are less concerned with overall page performance or SEO challenges.

(2) Privacy Sandbox + User Privacy + Publishers

The Privacy Sandbox has two core aims:

  • Phase out support for third-party cookies when new solutions are in place.
  • Reduce cross-site and cross-app tracking while helping to keep online content and services free for all.

At the heart of their operations, publishers produce content for audiences. Their business thrives on understanding and valuing these audiences. When a page on a specific site is loaded, it's crucial for publishers to comprehend their users and the associated performance metrics. This understanding informs various decisions, including monetization strategies and product adjustments (content, subscriptions, user experience, etc.). However, typically, the user's existence is not recognized beyond the site.

It's crucial to understand that protecting user privacy requires adaptation among buyers, sellers, and publishers.

Lastly, publishers tend to be permissive, often implementing minimal checks on third-party partners, which can lead to data leaks (social pixels, user analytics, last fencing technology publishers should test, etc.).

Why it’s an opportunity to provide a solution

What could the Privacy Sandbox offer?

💡 This section compiles ideas that could potentially form a solution. A limitation is my lack of knowledge about the PAAPI. I willingly admit that some use cases described above might be addressed by current APIs such as the Private Aggregation API and functions like reportWin or reportResult.

IMO, the issue can be split into 2 parts, which can be covered by an API.

Some thoughts about building a solution

Requirements and constraints

I'm summarizing the requirements and constraints that need to be addressed to ensure that the new API doesn't compromise the promises of the Privacy Sandbox:

Key capabilities

patmmccann commented 2 months ago

#1035 was a less elegant attempt at this same feature request.

warrrrren commented 2 months ago

Followup detailed proposal: https://docs.google.com/document/d/1dmtOXo1WAWmPXOoi0B1CR8RfqcAek0nU3kgcFyyDT8Y/edit?usp=sharing

gpolaert commented 1 month ago

Hello,

I am seeking feedback from @MattMenke2 and @michaelkleber regarding this issue, particularly to gain insight into what is feasible or what may not align with the PAAPI objectives.

While awaiting their input, I have begun sharing the issue with top publishers in the EU. If necessary, I can ask them to share their current challenges and concerns, to gather additional feedback from the sell side and publishers.

michaelkleber commented 1 month ago

Thank you for writing up your proposal. I look forward to talking about it on our weekly call, where this issue is the first item on the agenda!

rdgordon-index commented 1 month ago

Linking https://github.com/WICG/turtledove/issues/430 which is also related.

warrrrren commented 1 month ago

@michaelkleber and @gpolaert Following up on the conversation we had last week, here's a scoped-down version of the proposal that only calls for aggregated reporting and reserved fenced frame events: Aggregate-Only Proposal

Note: This proposal doesn't directly support the real-time monitoring use case.

end0cr1ne commented 1 month ago

@michaelkleber Following up on our discussion a couple of weeks ago, I've created an updated proposal that utilizes the Shared Storage API: Shared Storage Based Proposal

warrrrren commented 3 weeks ago

@michaelkleber and @JensenPaul Following up on our conversation this past Wednesday:

  1. I looked at #1190 and it doesn't seem strongly related to this issue.
  2. This is the updated proposal I mentioned: Shared Storage Based Proposal