WICG / privacy-preserving-ads

direct sold ads and header bidding #4

Closed pranay-prabhat closed 3 years ago

pranay-prabhat commented 3 years ago

Typically a publisher's page does header bidding to collect bids from multiple SSPs --> the bids pass into the ad server --> the ad server runs a competition between these bids and direct ads before the final ad is rendered on the page. The publisher page calling the ad server typically passes 100s of key-value pairs, including a few like browser, viewport size, ad position, etc., which are very specific to the publisher. How do we expect all of this to work in this proposal?

mehulparsana commented 3 years ago

This is somewhat similar to https://github.com/WICG/privacy-preserving-ads/issues/5#issue-822291741. In particular, option 1 (floor-based setup) allows the publisher to fetch a contextual ad as part of page load. The PARAKEET flow is necessary if user features/interest groups are required for ad selection.

pranay-prabhat commented 3 years ago

So let me confirm that I understand option 1. As soon as the page loads --> the publisher passes contextual and first-party data to the ad server and fetches ads --> once the round trip with the ad server is complete and, say, a direct sold ad is available --> a call to the Parakeet server is made with a price floor determined from the direct ad --> Parakeet talks to ad networks, collects their bids, checks whether the Parakeet ad beats the price floor from the direct ad, and returns the response to the page, where the final ad is eventually rendered.

I have a question then:

Isn't there a timing attack issue when a realtime call with full contextual, device, and first-party data is made to the ad server and another realtime call reaches ad networks via the Parakeet service at the same time?

mehulparsana commented 3 years ago

The flow you have shared is one potential flow for the floor-based setup. It can be further optimized for latency if the publisher knows the floor upfront.

If I understand your timing attack question correctly, time-based linkage leverages <time window, IP, UA, publisher context> to link both requests. We have introduced two constructs in the flow to prevent this:

  1. Context anonymization for the PARAKEET request: a construct that reduces the granularity of contextual information and passes only signals that pass an anonymity test within a time epoch. User features will not be provided in the request if the publisher context is too unique within a time window. The privacy test for contextual signals will be an ongoing test to prevent adversarial attacks.
  2. Proxying the ad request: a construct that rotates IP and UA to reduce fingerprinting. We have explained how to preserve location and device semantics at some granularity.
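
To make the anonymity test in construct 1 concrete, here is a minimal sketch of one way a per-epoch check could gate user features. It is not taken from the PARAKEET proposal: `ContextSignals`, `EpochAnonymityTest`, `buildParakeetRequest`, and the threshold `K_THRESHOLD` are hypothetical names and values chosen only for illustration, and a real service would also need to coarsen the signals and resist adversarial probing.

```typescript
// Hypothetical sketch: type names, the key format, and K_THRESHOLD are
// illustrative assumptions, not values from the PARAKEET proposal.
type ContextSignals = { site: string; adSlot: string; region: string };
type ParakeetRequest = { context: ContextSignals; userFeatures?: string[] };

const K_THRESHOLD = 3; // assumed minimum number of distinct users per epoch

class EpochAnonymityTest {
  // Context key -> distinct users seen with that context in the current epoch.
  private seen = new Map<string, Set<string>>();

  // Record one request and report whether its context has been seen
  // from at least K_THRESHOLD distinct users so far in this epoch.
  observe(userId: string, context: ContextSignals): boolean {
    const key = [context.site, context.adSlot, context.region].join("|");
    let users = this.seen.get(key);
    if (users === undefined) {
      users = new Set<string>();
      this.seen.set(key, users);
    }
    users.add(userId);
    return users.size >= K_THRESHOLD;
  }

  // Called at each epoch boundary so counts never span epochs.
  resetEpoch(): void {
    this.seen.clear();
  }
}

// Attach user features only when the context passes the anonymity test.
function buildParakeetRequest(
  test: EpochAnonymityTest,
  userId: string,
  context: ContextSignals,
  userFeatures: string[],
): ParakeetRequest {
  return test.observe(userId, context) ? { context, userFeatures } : { context };
}
```

In this sketch the first requests for a rare context go out without user features, which mirrors the "too unique within a time window" rule above.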

pranay-prabhat commented 3 years ago

Here is a proposed flow -

  1. Publisher page calls its direct ad server and any header bidding partners (using first-party data) to collect the direct ad as well as bids from header bidding partners.
  2. Browser calls the Parakeet service with a predefined floor provided by the publisher and collects the winning bid from Parakeet. This Parakeet service call happens in parallel with the direct ad server and header bidding calls in point 1.
  3. Publisher page should be allowed to run JavaScript code in an event-based way so that, if the publisher is happy with the ad from the direct ad server or the bids from header bidding partners, the publisher can tell the browser to ignore any response from Parakeet or preemptively cancel the Parakeet request if it is still in flight. This way the direct ad or header bidding ad renders in an iframe/SafeFrame the way it does today.
  4. If the publisher is not happy with the response from the direct ad server or the header bidding bids, the publisher tells the browser the realtime floor value it obtained from the direct ad server / header bidding, and the browser renders the Parakeet ad in a fenced frame if it beats the realtime floor, OR Parakeet informs the publisher that the Parakeet ad does not beat the floor and the publisher ultimately renders its own ad in a normal iframe/SafeFrame.

I propose that the direct ad server call and any header bidding calls happen in parallel with the Parakeet call to save time to render an ad. Not doing so adds serious ad rendering latency, which is harmful not just to publishers but to all the players responsible for showing the ad to the user.

This also assumes that Parakeet service response time is within reasonable limits and on par with header bidding or direct ad server latency today.
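
The event-based decision in steps 3 and 4 could be sketched roughly as below. This is only an illustration under stated assumptions: `SlotDecision`, `Bid`, `Outcome`, the `goodEnoughCpm` threshold, and the `cancelParakeet` callback are hypothetical names, not a browser or Parakeet API. The sketch also folds the browser-side floor comparison and the publisher-side logic into one class for brevity, and it assumes the realtime floor is the larger of the predefined floor and the best direct/header bidding price.

```typescript
// Hypothetical sketch of steps 1-4; none of these names are a real browser API.
type Bid = { source: "direct" | "headerBidding"; cpm: number };
type Outcome =
  | { kind: "pending" }                           // still waiting on a response
  | { kind: "iframe"; bid: Bid }                  // direct/HB ad rendered as today
  | { kind: "fencedFrame"; parakeetCpm: number }  // Parakeet ad beat the floor
  | { kind: "unfilled" };

class SlotDecision {
  private predefinedFloor: number;
  private goodEnoughCpm: number;
  private cancelParakeet: () => void;
  private firstParty: Bid | null | undefined = undefined;   // undefined = not back yet
  private parakeetCpm: number | null | undefined = undefined;
  private settled: Outcome | null = null;

  constructor(predefinedFloor: number, goodEnoughCpm: number, cancelParakeet: () => void) {
    this.predefinedFloor = predefinedFloor;
    this.goodEnoughCpm = goodEnoughCpm;
    this.cancelParakeet = cancelParakeet;
  }

  // Step 3: best direct/header bidding result arrives (null = no ad).
  onFirstPartyResult(best: Bid | null): Outcome {
    if (this.settled !== null) return this.settled;
    this.firstParty = best;
    if (best !== null && best.cpm >= this.goodEnoughCpm) {
      this.cancelParakeet(); // publisher is happy; stop or ignore Parakeet
      return this.settle({ kind: "iframe", bid: best });
    }
    return this.tryResolve();
  }

  // Parakeet response arrives (null = no bid). Ignored once the slot is settled.
  onParakeetResult(cpm: number | null): Outcome {
    if (this.settled !== null) return this.settled;
    this.parakeetCpm = cpm;
    return this.tryResolve();
  }

  // Step 4: compare the Parakeet bid against the realtime floor once both sides are back.
  private tryResolve(): Outcome {
    const firstParty = this.firstParty;
    const parakeetCpm = this.parakeetCpm;
    if (firstParty === undefined || parakeetCpm === undefined) return { kind: "pending" };
    const floor = Math.max(this.predefinedFloor, firstParty === null ? 0 : firstParty.cpm);
    if (parakeetCpm !== null && parakeetCpm > floor) {
      return this.settle({ kind: "fencedFrame", parakeetCpm });
    }
    return this.settle(firstParty !== null ? { kind: "iframe", bid: firstParty } : { kind: "unfilled" });
  }

  private settle(outcome: Outcome): Outcome {
    this.settled = outcome;
    return outcome;
  }
}
```

Because both handlers return the settled outcome once a decision exists, a late Parakeet response after a "good enough" direct ad is simply ignored, which covers both the cancel and the ignore variants described in step 3.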

peligio commented 3 years ago

Just adding a potential scenario here: there may be cases where a PARAKEET ad is more desirable for the publisher to run than the direct and header bidding ads from step 3 in the flow presented above. Not sure how realistic this is, but it could be a possibility.

Anyway, would the proposed javascript code on the publisher page be one applying an overall higher priority for direct/hb ads as a rule, or is it more like a timeout with the assumption PARAKEET ad response may take longer than desirable?

+1 for direct, header bidding and PARAKEET service call happening in parallel not only for latency reasons, but for completeness of decisioning for the publisher.

pranay-prabhat commented 3 years ago

> Anyway, would the proposed javascript code on the publisher page be one applying an overall higher priority for direct/hb ads as a rule, or is it more like a timeout with the assumption PARAKEET ad response may take longer than desirable?

The proposed JavaScript could do both, i.e. decide whether the direct/hb ad is good enough and then either tell the browser to cancel the Parakeet request (if it is still in flight) OR ignore the ad from Parakeet if it has already come back to the browser.

pranay-prabhat commented 3 years ago

@KeldaAnders I will propose adding this to one of the subsequent Parakeet calls. I think it's important that we all get in sync on what flexibility we need on the client side (i.e. the browser) before the final ad is rendered on the page.

mehulparsana commented 3 years ago

Proposing the following flow for directsold/1P ad integration:

  1. The rtbAds response from programmatic/RTB ad serving through PARAKEET, the directSoldAds from the publisher/SSP server, and publisher.js are posted into the Fenced Frame. rtbAds will follow the standard RTB ad response format.
  2. publisher.js can score and finalize ads between [rtbAds, directSoldAds] using arbitrary logic without any network access.
  3. After selecting one or more ads, publisher.js can call the reportResult(finalizedAds) method, which will register for impressions with the browser. We can consider additional rendering capabilities.
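
As a rough illustration of steps 2 and 3, publisher.js logic might look something like the sketch below. The `Ad` shape, the `priority` field, the scoring rule, and the exact `reportResult` signature are all assumptions made for this example. The thread only names `rtbAds`, `directSoldAds`, and `reportResult(finalizedAds)`, and `reportResult` is passed in as a parameter here because its real binding is not specified.

```typescript
// Hypothetical sketch: the Ad shape and scoring rule are illustrative only.
type Ad = { id: string; source: "rtb" | "directSold"; cpm: number; priority?: number };

// Step 2: arbitrary, network-free publisher logic. Here a direct-sold
// priority (e.g. a sponsorship) outranks price, then higher CPM wins.
function selectAds(rtbAds: Ad[], directSoldAds: Ad[], slots: number): Ad[] {
  const score = (ad: Ad): number => (ad.priority ?? 0) * 1000 + ad.cpm;
  return [...rtbAds, ...directSoldAds]
    .sort((a, b) => score(b) - score(a))
    .slice(0, slots);
}

// Step 3: hand the finalized ads to the browser-provided reportResult.
function runPublisherLogic(
  rtbAds: Ad[],
  directSoldAds: Ad[],
  reportResult: (finalizedAds: Ad[]) => void,
): Ad[] {
  const finalizedAds = selectAds(rtbAds, directSoldAds, 1);
  reportResult(finalizedAds);
  return finalizedAds;
}
```

The selection function is pure, so it can run inside the Fenced Frame without network access, as step 2 requires.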

Additional details:

pranay-prabhat commented 3 years ago

@mehulparsana as discussed on the call, I would like to drill down more on what happens when, per publisher.js, the directSoldAd is the winner.

It is OK for the logic itself not to access any server, but a direct sold ad could make additional network calls for various reasons on both the sell side and the buy side. Rendering such ads in a FencedFrame leaves it unclear how the many creatives that require additional network calls at render time will work, how notifications like ad-render/viewability will be handled, and what kind of impression reporting publishers will get for direct sold ads.

Currently we get extensive log-level data on direct sold ads, which we certainly cannot afford to lose.

pranay-prabhat commented 3 years ago

@darobin and I discussed this, and we still think rendering all ads in a fenced frame is a blanket solution that will force major shifts in how ad creatives work. Creatives making additional network calls on user interaction, verification vendors, the SafeFrame API, and streaming will all have to be considered, which feels disruptive given the timeframe we have to make this solution work.

We understand that a 1-bit leak in the browser from the final auction between the direct sold winner and the Parakeet winner could lead to collusion risk in very rare scenarios, but we recommend treating that as a problem to be solved rather than requiring all possible ads to go into a fenced frame.

erik-anderson commented 3 years ago

Thanks for the feedback. We understand the concerns and want to find solutions that meet your needs while retaining the privacy properties we’re aiming for.

There are two paths we want to look at in parallel:

  1. Expose to the publisher site a real-time signal about whether the PARAKEET ad met a direct-sold floor. The service will need to evaluate how it can watch for flows that appear to be highly unique to a user and, in instances that look non-privacy-preserving, the browser may need to lie about the outcome (e.g. perhaps showing the below-the-floor ad or simply not filling the ad slot).

    Our expectation would be that good actors would not generally be impacted by such a check and the worst case scenario is limited. If we end up seeing evidence that it’s happening more frequently than expected, we can explore aggregate reporting to enable the ecosystem to detect and self-remediate in terms of adjusting their inputs into the system.

    We will loop back on this issue when we have a more formal write-up.

  2. Many of your concerns appear to be around general functionality that may be desirable even for ads rendered in a fenced frame. We would like to continue to look at those holistically so that we can explore new functionality to serve the same use cases where appropriate. Please continue to file issues for use cases that are not yet reasonably handled by these proposals so we can explore solutions for each of them.