patcg-individual-drafts / topics

The Topics API
https://patcg-individual-drafts.github.io/topics/

Providing Buyers Access to their Topics #82

Open av-sherman opened 2 years ago

av-sherman commented 2 years ago

The current Topics API provides callers (third parties on the page) with Topics associated with previous pages or domains on which the API has been previously called by the given third party. This means that, broadly, SSPs can receive Topics associated with their publishers via their tagging on publisher websites, while DSPs can receive Topics associated with their advertisers from tagging on advertiser websites.

In an ecosystem with distinct roles of publisher platforms (SSPs) and advertiser platforms (DSPs), one way Topics API could be employed would be SSPs calling the Topics API during an ad request and passing their retrieved Topics to DSPs for bidding. In such a scenario, DSPs would be bidding based on the SSP’s Topics footprint (based on publisher websites) and not their own (based on advertiser websites).

However, DSPs may find increased commercial value in targeting and bidding using Topics observed on their partner advertiser websites’ footprint than from a seller’s publisher websites.

The current Topics API supports this use case, albeit awkwardly. A possible flow:

  1. Code running on a publisher’s page can create an iframe for a specific buyer
  2. Within the iframe, the buyer calls the Topics API
  3. The buyer can postMessage the Topics result back to the publisher
  4. The code running on the publisher’s site can send the buyer’s Topics information on an ad request to an SSP
  5. The SSP can forward the Topics signals to the given buyer.
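
The publisher-page side of this flow (steps 3 and 4) could be sketched roughly as follows. This is only an illustration under assumptions: the `buyer-topics` message shape and the function names are hypothetical and not part of the Topics API. Inside its iframe, the buyer would get its entries from `document.browsingTopics()` and pass the resulting message to `window.parent.postMessage`; the publisher-page code would call `acceptBuyerMessage` from a `message` event listener with `event.origin` and `event.data`.

```typescript
// Hypothetical message a buyer's iframe posts to the publisher page (step 3).
interface BuyerTopicsMessage {
  type: "buyer-topics";
  topics: number[]; // topic IDs from the buyer's own Topics API call
}

// Buyer side: reduce the entries returned by the Topics API to the message above.
function toBuyerMessage(entries: { topic: number }[]): BuyerTopicsMessage {
  return { type: "buyer-topics", topics: entries.map((e) => e.topic) };
}

// Publisher side: check that untrusted message data has the expected shape.
function isBuyerTopicsMessage(data: unknown): data is BuyerTopicsMessage {
  if (typeof data !== "object" || data === null) return false;
  const msg = data as { type?: unknown; topics?: unknown };
  return (
    msg.type === "buyer-topics" &&
    Array.isArray(msg.topics) &&
    msg.topics.every((t) => typeof t === "number")
  );
}

// Publisher side: record Topics only from buyer origins the page invited.
// `collected` maps buyer origin -> topic IDs and can be attached to the
// ad request sent to the SSP (step 4).
function acceptBuyerMessage(
  collected: Record<string, number[]>,
  allowedOrigins: string[],
  origin: string,
  data: unknown
): boolean {
  if (allowedOrigins.indexOf(origin) === -1 || !isBuyerTopicsMessage(data)) {
    return false;
  }
  collected[origin] = data.topics.slice();
  return true;
}
```

Checking the sender's origin against an allow-list matters here because any frame on the page can post a message; without it, an uninvited party could inject Topics under a buyer's name.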

Likely, publishers will delegate their role to SSP tagging code, which could potentially coordinate retrieving Topics on behalf of multiple buyers interested in bidding using their own Topics.

Notably, depending on the collaboration between DSPs and SSPs, this might give buyers access to more Topics than they would otherwise receive from an SSP, in particular if a single call on behalf of an SSP’s origin would have returned fewer than three Topics.

Further, SSPs or publishers might want a mechanism to stop buyers from contributing to the set of observed Topics from the publisher’s page view in order to not expand the buyer’s footprint beyond what it already is. Issue #54, which offers splitting Topics retrieval from Topics attribution, in conjunction with permission policy headers could be one method of achieving this.

Some areas for discussion:

  1. Do DSPs find this flow of accessing topics they observed (on sites where they have third-party presence, such as on advertiser websites) via their supply partners (SSPs and exchanges) potentially attractive and valuable?
  2. Are there any incremental privacy risks/concerns in facilitating buyers accessing Topics they observed from a context in which they would not otherwise have a presence?
  3. Using iframes for this (or potentially headers, see issue #7) is awkward and would incur nontrivial latency overhead client-side – perhaps the API could support some more direct access flow, possibly via a JavaScript API call gated by some .well-known file that would indicate DSP-SSP collaboration intent.

dmarti commented 2 years ago

Another related issue is https://github.com/patcg-individual-drafts/topics/issues/73. Both the SSP and the DSP could obtain Topics from a caller that is present on more sites than either one is (or the Topics API may help to encourage consolidation of callers).

dmarti commented 2 years ago

For a partial answer to question 2, see https://github.com/patcg-individual-drafts/topics/issues/11. In practice, topics data is likely to pass among SSPs, DSPs, and other third parties.

michaelkleber commented 2 years ago

Thanks for filing this issue. As Don says, I think this has a lot of overlap with the observations in Lionel's #73.

I wonder what folks would think of replacing your "albeit awkwardly" workflow with a flow where the DSP-SSP cooperation happens on the buy-side instead of the sell-side version that you describe above. Advertisers have one or more DSPs that do buying on their behalf, and those DSPs each have various SSPs whose inventory they buy. So here's a very natural flow:

  1. DSP code running on the advertiser's page can create an iframe for each SSP that the DSP buys on.
  2. Within that iframe, the SSP calls Topics API.
  3. Later, on a publisher page, the SSP calls Topics themselves, and includes the result in a bid request.

Let's call this kind of operation "Topics-matching", since it is reminiscent in some ways of "cookie-matching". In a sense I'm proposing Topics-matching on advertiser pages, while you described it happening on publisher pages.

Whether Topics-matching happens on publisher or advertiser pages, it retains the key privacy goal that led to per-caller filtering: The only way for a 3p to get access to some topic is for that 3p to be invited onto a page about that topic. If the ad tech community decides some party is a bad actor, they can just stop inviting them onto pages, and that party won't get access to the data any more.
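
To make the per-caller filtering property concrete, here is a toy model. It is a sketch under stated assumptions, not the browser's actual algorithm: it ignores epochs, top-topic selection, and noise, and the `Observation` type, the `topicsVisibleTo` function, the caller names, and the topic IDs are all made up for illustration.

```typescript
// One record per Topics API call: `caller` invoked the API on a page
// that the browser associates with `topic`.
interface Observation {
  caller: string;
  topic: number;
}

// Return the subset of the user's current topics that `caller` may receive:
// only topics that this caller has itself observed on some page.
function topicsVisibleTo(
  caller: string,
  userTopics: number[],
  observations: Observation[]
): number[] {
  return userTopics.filter((topic) =>
    observations.some((o) => o.caller === caller && o.topic === topic)
  );
}

// Hypothetical history. The SSP was invited onto an advertiser page via the
// DSP's Topics-matching iframe (topic 57) and onto a publisher page (topic 123).
const history: Observation[] = [
  { caller: "dsp.example", topic: 57 },
  { caller: "ssp.example", topic: 57 },
  { caller: "ssp.example", topic: 123 },
];
```

With this history and a user whose current topics are 57 and 200, `topicsVisibleTo("ssp.example", [57, 200], history)` yields `[57]`, while a caller that was never invited onto any page, such as `"uninvited.example"`, gets an empty list.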

But I think the advertiser-pages approach offers lots of advantages. Now there is no need for postMessaging around within pages, and all of the cross-actor coordination happens off the critical path, in advance of the publisher page visit. Moreover there are many fewer iframes here than in the publisher-page version, because there are many fewer SSPs than DSPs. (To make things even more efficient, each DSP could include a particular SSP iframe only once a day; all that matters is that it happens at least once during the weekly Topic calculation epoch.)
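
The once-a-day idea could look something like the sketch below. It assumes the DSP's script keeps, per browser, a record of when each SSP iframe was last embedded; where that record is stored is left open, and the `sspFramesDue` name and its parameters are hypothetical.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Return the SSP origins whose Topics-matching iframe should be embedded now:
// those never embedded for this browser, or last embedded at least a day ago.
// `lastIncluded` maps SSP origin -> timestamp in milliseconds.
function sspFramesDue(
  sspOrigins: string[],
  lastIncluded: Record<string, number>,
  nowMs: number
): string[] {
  return sspOrigins.filter((ssp) => {
    const last = lastIncluded[ssp];
    return last === undefined || nowMs - last >= DAY_MS;
  });
}
```

After embedding an iframe, the caller would set `lastIncluded[ssp]` to the current time so that later page views within the same day skip it.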

I think advertiser-side Topics-matching also does a better job with the "integrate a new DSP" worry that Lionel brought up in https://github.com/patcg-individual-drafts/topics/issues/73#issuecomment-1166982969. If a new small DSP shows up in the ad tech world, they already need to make deals with various SSPs to get opportunities to buy; part of that deal could be that the DSP includes the SSP's Topics-matching iframe. And if the new DSP's advertisers were already working with some existing DSPs that bought on those SSPs, then the new DSP starts off with access to all the signals they would want, out of the gate; it's only if they bring that advertiser to a new SSP that they would need to wait through a three-epoch ramp-up period to get full signal.

AramZS commented 2 years ago

@michaelkleber I think this is an interesting way to create a flow that has an easier time for new entrants and I appreciate that. But I also think this sounds like a lot of network requests and frames etc...! I saw there are some other conversations about ways to handle the potential performance impact of the 'include HTML and/or JS' problems that Topics raises, and I think it would be really important to solidify that before going down a route where publisher pages end up with not just SSP code, but also DSP+SSP combinations for each possible combination. The scale of DSP participants in the system for most publishers would create a LOT of events and code and, I think, a pretty big performance impact.

dmarti commented 2 years ago

@AramZS It seems like the one "best" caller on the page (whatever 3rd party has an iframe likely to also be present on the most sites likely to have been visited by the same user) could do the actual Topics API call, and then post the result to JS on the containing page, which could populate a user data object that is available to other parties. (Discussion at the Prebid project: https://github.com/prebid/Prebid.js/issues/7968 )

AramZS commented 2 years ago

@dmarti if we're just delegating prebid's (or some other bidder wrapper's) domain to handle all the gathering of Topics for all the websites... isn't the access just as broad as FLoC's then in terms of wide identical access... only now publisher domains are cut out?

AramZS commented 2 years ago

I mean, what's the point of restricting it to domains that have seen the user if functionally the practice will be to just make sure a single domain sees every user everywhere?

dmarti commented 2 years ago

@AramZS That's a good question. Issue #11 covers the Topics syncing complexity problem. Sites might test Topics syncing among multiple callers at first, and then, based on testing, eliminate the less useful callers. After a caller is eliminated from some sites, other sites will get fewer good Topics from it too, then also eliminate it. Eventually all the sites will get down to the one caller that is best at returning high-value Topics in a high-performance way.

dmarti commented 2 years ago

@michaelkleber For many advertisers, users rarely visit the advertiser's site. ("got milk?"). Advertisers who don't expect web visits are still going to look for ways to get access to more and better Topics.