privacycg / private-click-measurement

Private Click Measurement
https://privacycg.github.io/private-click-measurement/

Concerns from adserver perspective #20

Closed janwinkler closed 4 years ago

janwinkler commented 5 years ago

Hello everyone,

As an adserving company, we develop software that allows our clients to track conversions. We understand that privacy is important and would therefore like to contribute to this initiative. However, the proposal does not seem to cover how tracking is currently done in the market. Here are the main issues we see with the proposal:

  1. In the example it is outlined that a website would trigger the click to a shop and the browser would later report the conversion back to the website. That is not how the advertising industry works. In a normal advertising scenario there will be a construct like:

    • Website implements adcode of its adserver
    • AdServer delivers creative code from SSP
    • SSP delivers creative code from DSP
    • DSP delivers creative code from Agency Adserver
    • Agency AdServer delivers creative code from advertiser adserver
    • Advertiser adserver delivers the actual ad

     In short: To display an ad, there will always be (many) more than two parties involved. In most cases, the website itself doesn't care about the conversions. It is usually the agency and the advertiser, and sometimes the DSP, who want to track the conversion. Hence: The attribution needs to be able to cover more than one conversion target.

  2. To track a conversion, at least two pieces of information are always necessary: a) where the click happened (the placement on the website, e.g. whether it is a banner at the top of the page, on the left side, on the right side, a native ad, ...) and b) which creative was clicked (in most cases a campaign includes many creatives). In most cases a third identifier will even be necessary (e.g. for the advertiser to identify which website the click was on). Hence: A single campaign identifier is not sufficient for tracking, as it would only tell which campaign triggered the conversion, but not which creative, which placement or which website.

  3. In most cases, adtech companies work with incrementing numbers for all identifiers. This means that a normal adtech solution uses IDs that are much higher than 64. Limiting the identifier to 64 values is not sufficient for tracking. We recommend increasing the range to at least 65k (16 bits). We understand that numbers this high are a potential privacy risk and are open to discussing lower numbers. However, a limit of 6 bits is far too little for any tracking logic, as most adtech companies deal with thousands of live campaigns, websites, placements and creatives.

  4. Conversion tracking is not limited to clicks but also covers impressions. It is important to be able to track whether a user who saw an ad later converted into a sale (even though they did not click on the ad). In many cases we see that users see an ad and then later search for the brand name and buy something. Advertisers need to be able to attribute this sale to the impression.

  5. Using HTML attributes seems inflexible and involves many changes on the publisher side. At the same time, it does not allow for click conversions via JavaScript and multiple redirects.

  6. In order to prevent conversion fraud, the advertiser needs to report order numbers or similar identifiers to its reporting system so that it can later check which orders were cancelled by the buyer and which were paid.

  7. In some cases it is essential for an advertiser to count only one conversion; in other cases it is essential to count all conversions of the same user. Hence it should not be up to the browser to decide which conversion is fired and which is not.

  8. It is essential for the advertiser to understand what kind of conversion happened, e.g. whether it was a sale, a form submission or any other event. Therefore the conversion needs a "type" or some kind of identifier that tells the advertiser about the conversion itself.
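To make the identifier sizes in point 3 concrete, here is a minimal sketch of the arithmetic. The helper names `id_capacity` and `bits_needed` are made up for illustration and are not part of the proposal or of any browser API.

```python
def id_capacity(bits: int) -> int:
    """Number of distinct identifier values representable in `bits` bits."""
    if bits < 0:
        raise ValueError("bits must be non-negative")
    return 1 << bits


def bits_needed(distinct_ids: int) -> int:
    """Smallest bit width that can represent `distinct_ids` distinct values."""
    if distinct_ids < 1:
        raise ValueError("distinct_ids must be at least 1")
    return (distinct_ids - 1).bit_length()


# Compare the 6-bit limit with the 16-bit range suggested in point 3.
for width in (6, 16):
    print(f"{width} bits -> {id_capacity(width)} distinct IDs")
```

Under these assumptions, a system with, say, 5,000 live campaigns would need `bits_needed(5000)` bits for the campaign identifier alone, which is more than the 6 bits discussed above.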

Our proposal: In order to cover the above issues, we propose to go in a different direction:

The advantage of this proposal is that it would be fully compatible with all browsers, whether they support it or not: A browser that does not support this feature will still fire the normal pixels, and tracking will still work as before. A browser that supports the feature will block the tracking pixels and use the privacy-tracking logic.

Additional privacy considerations: In order to prevent fingerprinting, we recommend limiting the use of subdomains. An adserver could use the subdomain on a short-term conversion to transfer user identification, e.g. if the click URL includes the domain user8122531.ads.adserver.com. On the conversion page, the adserver would typically use a script to write the tracking pixel into the page. The adserver could associate the IP of the user with the domain, and the script could then write a tracking pixel with the same domain into the page. A later call to this tracking pixel would still reveal the user's full ID and must therefore be blocked to preserve privacy. Hence we recommend allowing only a certain number of characters per subdomain, or even whitelisting only certain subdomains (in this case ads.adserver.com).
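The two subdomain restrictions suggested above could look roughly like the following sketch. The function names, the registrable domain `adserver.com` and the length limit are illustrative assumptions, not anything defined by the proposal. A real implementation would also need a proper way to determine the registrable domain.

```python
from typing import Collection


def allowed_by_whitelist(host: str, whitelist: Collection[str]) -> bool:
    """Accept only hosts that exactly match a whitelisted name."""
    return host.lower().rstrip(".") in whitelist


def allowed_by_label_length(host: str, registrable_domain: str,
                            max_label_length: int) -> bool:
    """Accept hosts whose subdomain labels are all short enough."""
    host = host.lower().rstrip(".")
    if host == registrable_domain:
        return True
    suffix = "." + registrable_domain
    if not host.endswith(suffix):
        return False
    # Everything in front of the registrable domain, split into labels.
    labels = host[: -len(suffix)].split(".")
    return all(0 < len(label) <= max_label_length for label in labels)


# Hypothetical policy values, chosen only for this example.
WHITELIST = {"ads.adserver.com"}
MAX_LABEL_LENGTH = 3

for candidate in ("ads.adserver.com", "user8122531.ads.adserver.com"):
    print(candidate,
          allowed_by_whitelist(candidate, WHITELIST),
          allowed_by_label_length(candidate, "adserver.com", MAX_LABEL_LENGTH))
```

Note that a length limit only reduces how much information a subdomain can carry, while the whitelist variant removes it entirely for unknown hosts.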

johnwilander commented 5 years ago

Hi and thanks for filing your concerns!

First, there are reasons for not sending attribution data to third parties:

Second, addressing some of the things you bring up:

janwinkler commented 5 years ago

@johnwilander While I understand most of your points from a data protection perspective, I don't see how a mechanism that is designed to be "too hard/strict/inflexible" will get any market adoption. Instead, marketers will look for and find other ways to track the same data they are already tracking (e.g. server-side first-party tracking, which browsers are not able to block). Trying to change a multi-billion-dollar industry can only work if the alternative still provides the same minimum of flexibility and features that marketers already have. The ONLY way to get to more privacy is to give marketers tools they can use to get the same or similar results as before, but with the benefit of "integrated" data protection.

Regarding the things you mention:

  1. User perspective: Yes the user does not understand that there are 50 other parties on the website. But if these parties do not process any personal data, the user doesn't care about them. The only concern for the user is when his/her personal data is processed and that can be limited.

  2. First party control: As written before, that is not how it works, for several reasons. Apart from the really big players (Google, Facebook, ...), NO publisher has direct contact with its advertisers; there is always some third party in between. Just take the biggest advertising network, Google AdSense, as an example: a) It is simply not possible for every publisher to know every advertiser (there are too many advertisers). b) The same goes for advertisers: it is simply not possible for an advertiser to work with publishers directly (there are too many publishers). c) It is not practical for a publisher to have direct links to an advertiser (publishers want ads to rotate, campaigns to start and stop automatically, volume and frequency capping to be applied, etc.). There is always a need for some tech in between (adserver, SSP, ad network, ...). Hence third-party tracking is essential.

  3. More identifiers / 12 bits / Entropy / HTTP headers: These all basically touch on the same issue: "how much data is needed vs. how much data is possible without being able to find out which user it was". As written before, 12 bits are not sufficient for tracking the necessary data and would therefore push marketers to use other ways to get the same old data. As a bare minimum we see 2x12 bits, allowing marketers to have 4k active creatives on 4k active placements.

  4. "The browser will block the pixel on the site" - how is this possible without collateral real image and ad blocking? How can the browser know what is a tracking pixel and what is an image without making the request? --> This is set with the HTML attribute ad-tracking-campaign="...". Hence the browser does not need to fire the pixel; it only takes the content of the attributes and saves it along with the domain.

  5. Subdomain: Unfortunately I can't find anything about it. Can you point me to the corresponding section?
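To illustrate the "2x12 bits" minimum from point 3: two 12-bit identifiers give 4,096 creatives times 4,096 placements and fit into one 24-bit value. The following is only a sketch of that arithmetic. The packing layout and the names `pack_ids` and `unpack_ids` are assumptions made for this example, not a format defined by the proposal.

```python
ID_BITS = 12
ID_MASK = (1 << ID_BITS) - 1  # 4095, the largest 12-bit value


def pack_ids(creative_id: int, placement_id: int) -> int:
    """Combine two 12-bit IDs into a single 24-bit value."""
    for name, value in (("creative_id", creative_id),
                        ("placement_id", placement_id)):
        if not 0 <= value <= ID_MASK:
            raise ValueError(f"{name} must fit in {ID_BITS} bits")
    return (creative_id << ID_BITS) | placement_id


def unpack_ids(packed: int) -> tuple:
    """Split a 24-bit value back into (creative_id, placement_id)."""
    if not 0 <= packed < (1 << (2 * ID_BITS)):
        raise ValueError("packed value must fit in 24 bits")
    return packed >> ID_BITS, packed & ID_MASK


packed = pack_ids(1234, 567)
print(packed, unpack_ids(packed))
```

The privacy question raised in the thread is about exactly this total: 24 bits distinguish roughly 16.7 million combinations, compared with 64 for a single 6-bit identifier.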

Best regards, Jan

johnwilander commented 4 years ago

Thank you for sharing your thoughts! I'm sorry I didn't get back to you earlier. However, this issue touches on a large number of concerns. Please file individual concerns for consideration.