disconnectme / disconnect-tracking-protection

Canonical repository for the Disconnect services file

Consider removing OpenX as a Fingerprinter #137

Closed: JoelPM closed this issue 4 years ago

JoelPM commented 4 years ago

Please consider removing OpenX as a Fingerprinter.

https://github.com/disconnectme/disconnect-tracking-protection/blob/master/descriptions.md#OpenX

The justification references the /tq/pi beacon that we were using. That script was a traffic quality (tq) post-impression (pi) beacon that was added two years ago to help us verify that an impression was generated by the same user who made the request. That beacon was only deployed on a uniform random sample of the traffic, and the data was never used to generate a fingerprint. In fact, the data was never used at all, so we have disabled that beacon and it should no longer be deployed on any traffic.

For what it's worth, the justification also references the /ri beacon, which is the "record impression" beacon. It actually fires before the (no longer in use) tq/pi beacon. Its parameters are "ph", which stands for platform hash and is not user related at all (it's how we do multi-tenancy), and "ts", which is data specific to the ad request and doesn't include a computed fingerprint.

Thanks, Joel

ryanbr commented 4 years ago

This is still tracking/fingerprinting, note the resolution checks.

As seen on https://www.wp.pl/

https://wirtualn-d.openx.net/w/1.0/arj?ju=https%3A%2F%2Fwww.wp.pl%2F&ch=UTF-8&res=2560x1440x24&ifr=false&tz=-780&tws=2560x1301&be=1&bc=hb_pb_3.0.1&dddid=10667fe1-dae1-4b6e-b5a3-69e72d3c4c0b%2Cedb15289-8ef4-4ac1-b977-0e611182294c%2C2121d2cc-e538-4754-be5c-ebcca4cb7ca3%2C40534dd1-d050-44fb-8379-659dbcae809c&nocache=1583918779462&gdpr_consent=BOtaEGuOtaE2vBIABCPLC3-AAAAthr_7__7-_9_-_f__9uj3Or_v_f__30ccL59v_h_7v-_7fi_20nV4u_1vft9yfk1-5ctDztp205iakivHmqNeb9v_mz1_5pRP72k89r7337Ew_v8_v-b7JCON_Ig&gdpr=1&x_gdpr_f=1&pubcid=298a1e03-0365-4bce-b28c-2555d4162a0c&aus=300x250%7C300x600%2C300x250%7C750x200%2C940x200%7C300x250&divIds=slot11%2Cslot91%2Cslot62%2Cslot4&auid=538653917%2C540212188%2C540593450%2C540625842&
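To make the client attributes in that request easier to see, here is a minimal Python sketch that pulls them out of the query string. The URL is shortened to a few of the parameters quoted above. Reading `res` as screen resolution follows the comment above; treating `tz` as a time zone offset and `tws` as a window size is an assumption, and `client_attributes` is a hypothetical helper, not anything from OpenX or Disconnect.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the request URL quoted above (only a few parameters kept).
URL = (
    "https://wirtualn-d.openx.net/w/1.0/arj"
    "?ju=https%3A%2F%2Fwww.wp.pl%2F&ch=UTF-8&res=2560x1440x24"
    "&ifr=false&tz=-780&tws=2560x1301"
)

# Assumption: these are the parameters that describe the client device.
CLIENT_ATTRIBUTE_KEYS = ("res", "tz", "tws")


def client_attributes(url):
    """Return the device-describing query parameters found in `url`."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs maps each key to a list of values; keep the first one.
    return {key: query[key][0] for key in CLIENT_ATTRIBUTE_KEYS if key in query}


attrs = client_attributes(URL)
for name, value in attrs.items():
    print(f"{name} = {value}")
```

This only shows which attributes are transmitted; it says nothing about how the receiving server uses them.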

JoelPM commented 4 years ago

@ryanbr, thanks for your response. What's your definition of fingerprinting? My definition is using attributes of a client to identify them persistently (and with statistical significance) across sites. Resolution can be used to do fingerprinting, but getting screen resolution doesn't necessarily imply fingerprinting.

I'm not disputing our classification as a tracker. We do track using our cookie, for the purpose of helping monetize content on the web. I AM disputing our classification as a fingerprinter - which we do NOT do.

To truly identify a fingerprinter you'd have to show that they were able to uniquely identify the same entity across sites in the absence of 3p cookies. Disconnect.me is not doing anything nearly so sophisticated (near as I can tell). They are simply looking at what attributes are passed and the code that's available and doing their best based on that. Unfortunately their best isn't good enough and they're getting it wrong in some cases.

ryanbr commented 4 years ago

Is identifying the display resolution through the user's browser not fingerprinting them? That is unique to the end user.

JoelPM commented 4 years ago

Hi @ryanbr,

Screen resolution alone is not enough to uniquely identify a user. If you haven't seen it already, I recommend checking out Panopticlick by the EFF. It provides an analysis of the uniqueness of different attributes of a client.
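As a rough illustration of why one attribute is not enough, the uniqueness of a value can be expressed in bits of identifying information. The sketch below uses made-up numbers: the share of browsers with a given resolution and the population size are assumptions chosen for easy arithmetic, not measurements from Panopticlick or OpenX.

```python
import math


def surprisal_bits(share):
    """Bits of identifying information for a value seen in `share` of browsers."""
    return -math.log2(share)


# Hypothetical: 1 in 16 browsers reports this exact screen resolution.
share_resolution = 1 / 16
bits_from_resolution = surprisal_bits(share_resolution)

# Hypothetical population of about one million browsers (2**20).
population = 2 ** 20
bits_needed = math.log2(population)

print(f"resolution reveals {bits_from_resolution:.1f} bits")
print(f"singling out one browser needs about {bits_needed:.1f} bits")
```

Under these assumed numbers, resolution contributes only a small part of the information needed to single out one browser.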

Also, as I noted above, looking at attributes of the client (to help prevent fraud and/or provide context in the ad request) does not imply fingerprinting. Buying Sudafed (pseudoephedrine) doesn't automatically mean a person is manufacturing methamphetamine - it's more likely they just have a cold.

Joel

liamengland1 commented 4 years ago

But can't that specific combination of attributes be used to identify/fingerprint?

JoelPM commented 4 years ago

@llacb47 - which specific combination of attributes are you talking about?

liamengland1 commented 4 years ago

Any client attributes that you (might) send to your server for anti-fraud purposes... such as screen size, time zone, user agent, user fonts, user language, screen orientation, etc.
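For illustration only, here is a minimal Python sketch of how a combination of attributes like these could be reduced to a single identifier. It is not OpenX's code or Disconnect's detection logic; the `fingerprint` function, the attribute names, and the language values are hypothetical.

```python
import hashlib
import json


def fingerprint(attributes):
    """Hash a dict of client attributes into a stable hex identifier."""
    # Sort keys so the same attributes always serialize the same way.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute sets; screen and time zone values echo the URL above.
visit_site_a = {"screen": "2560x1440x24", "timezone_offset": "-780", "language": "pl-PL"}
visit_site_b = dict(reversed(list(visit_site_a.items())))  # same data, other order
other_client = {**visit_site_a, "language": "en-US"}

print(fingerprint(visit_site_a) == fingerprint(visit_site_b))
print(fingerprint(visit_site_a) == fingerprint(other_client))
```

The same attributes produce the same identifier on every site, with no cookie involved. Whether that identifier is unique depends on how many clients share the same attribute values.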

JoelPM commented 4 years ago

@llacb47, if a website collects enough specific attributes about the client, it is possible to construct a fingerprint that is unique. As noted above, Panopticlick is a good source for understanding the attributes that would be needed and how each contributes to the overall uniqueness. I'm not disputing that fingerprinting is possible.
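A small sketch of how each attribute adds to the overall uniqueness, assuming the attributes are statistically independent (real attributes are often correlated, so this overstates the total). The shares and population below are hypothetical values, not figures from Panopticlick.

```python
import math


def combined_bits(shares):
    """Total identifying bits for a set of attribute values, assuming independence."""
    return sum(-math.log2(share) for share in shares.values())


# Hypothetical share of browsers reporting each observed value.
shares = {"screen": 1 / 16, "timezone": 1 / 8, "language": 1 / 4}
total_bits = combined_bits(shares)

# Hypothetical population of about one million browsers (2**20).
bits_needed = math.log2(2 ** 20)
likely_unique = total_bits >= bits_needed

print(f"combined: {total_bits:.1f} bits, needed: {bits_needed:.1f} bits")
print(f"enough to single out one browser: {likely_unique}")
```

With these assumed numbers the three attributes fall well short of the bits needed, but adding more attributes raises the total, which is why the number and kind of attributes collected matters.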

The attributes we collect are for the purpose of preventing fraud and communicating context to digital advertisers who may wish to show an ad. That being said, I don't think we collect enough attributes to uniquely identify someone.

I opened this issue because OpenX is not fingerprinting users and we are therefore incorrectly categorized.

disconnectme commented 4 years ago

Temporarily removed based on removal of fingerprinting code. https://github.com/disconnectme/disconnect-tracking-protection/commit/45154a89b88f5057ac973637043afe6320b7bf66