HTTPArchive / custom-metrics

Custom metrics to use with WebPageTest agents
Apache License 2.0

Privacy 2024: fingerprinting test #121

Closed by bstandaert-wustl 3 months ago

bstandaert-wustl commented 3 months ago

Implements a test that looks for fingerprinting API strings in response bodies. The strings are derived manually from FingerprintJS.

It reports a count of API usages for each site and a list of "likely fingerprinting scripts", meaning any script with >= 5 API usages. @umariqbal, do you have any recommendations on what this threshold should be?
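For readers who want a concrete picture of the counting and threshold logic described above, here is a minimal sketch. It is not the code from this PR: the string list is a small hypothetical subset, the `analyzeResponses` name and the `{ url, body }` input shape are assumptions, and it assumes a "usage" means a response body containing the string at least once.

```javascript
// Hypothetical subset of API strings; the PR derives its full list manually from FingerprintJS.
const FINGERPRINTING_STRINGS = [
  'todataurl', 'getchanneldata', 'maxtouchpoints',
  'availwidth', 'availheight', 'indexeddb', 'devicememory',
];

// Threshold from the PR description: ">= 5 API usages".
const LIKELY_THRESHOLD = 5;

// responses: array of { url, body } objects (assumed shape, for illustration only).
function analyzeResponses(responses) {
  const counts = {};
  const likelyFingerprintingScripts = [];

  for (const { url, body } of responses) {
    const haystack = body.toLowerCase();
    let matched = 0;

    for (const api of FINGERPRINTING_STRINGS) {
      // Assumption: each string is counted at most once per response body.
      if (haystack.includes(api)) {
        counts[api] = (counts[api] || 0) + 1;
        matched += 1;
      }
    }

    if (matched >= LIKELY_THRESHOLD) {
      likelyFingerprintingScripts.push(url);
    }
  }

  return { counts, likelyFingerprintingScripts };
}
```

Lowercasing the body before matching is a design choice in this sketch; it mirrors the all-lowercase keys that appear in the reported `counts` object.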


Test websites:

UmarIqbal commented 3 months ago

What is the goal behind measuring the number of APIs used in a script? If it is to detect fingerprinters, then I think we settled on Disconnect's list, no?

Or is the goal to extend the detections by Disconnect by using the API presence threshold as a heuristic? If that is the case, then we can also look at the combination of APIs to pinpoint the exact fingerprinting method, e.g., by extending the heuristics proposed in this paper: https://www.cs.princeton.edu/~arvindn/publications/OpenWPM_1_million_site_tracking_measurement.pdf

Here are some combinations for prominent fingerprinting methods:

For CanvasFont:

CanvasRenderingContext2D.font & CanvasRenderingContext2D.measureText

For Canvas:

(CanvasRenderingContext2D.fillText | CanvasRenderingContext2D.strokeText | CanvasRenderingContext2D.fillStyle | CanvasRenderingContext2D.strokeStyle) & HTMLCanvasElement.toDataURL

With no calls to any of the following:
CanvasRenderingContext2D.save
CanvasRenderingContext2D.restore
HTMLCanvasElement.addEventListener

For AudioContext:

OfflineAudioContext.createOscillator | OfflineAudioContext.createDynamicsCompressor | OfflineAudioContext.destination | OfflineAudioContext.startRendering | OfflineAudioContext.oncomplete

For WebRTC:

(RTCPeerConnection.createDataChannel | RTCPeerConnection.createOffer) & (RTCPeerConnection.onicecandidate | RTCPeerConnection.localDescription)
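The combinations above can be read as boolean rules over the set of APIs observed for one script. The sketch below transcribes them literally, under stated assumptions: the `detectFingerprintingMethods` name is hypothetical, the input is a plain list of `Interface.member` names, and how that list is collected (runtime instrumentation, string matching, or something else) is left open.

```javascript
// Rules transcribed from the combinations listed above. The input is a list of
// API names seen for one script; how that list is collected is an assumption.
function detectFingerprintingMethods(observedApis) {
  const seen = new Set(observedApis);
  const any = (...names) => names.some((name) => seen.has(name));
  const all = (...names) => names.every((name) => seen.has(name));

  const ctx = 'CanvasRenderingContext2D.';
  const audio = 'OfflineAudioContext.';
  const rtc = 'RTCPeerConnection.';
  const methods = [];

  // CanvasFont: font & measureText
  if (all(ctx + 'font', ctx + 'measureText')) {
    methods.push('CanvasFont');
  }

  // Canvas: (fillText | strokeText | fillStyle | strokeStyle) & toDataURL,
  // with no calls to save, restore, or addEventListener.
  const drawsOrStyles = any(ctx + 'fillText', ctx + 'strokeText', ctx + 'fillStyle', ctx + 'strokeStyle');
  const excluded = any(ctx + 'save', ctx + 'restore', 'HTMLCanvasElement.addEventListener');
  if (drawsOrStyles && seen.has('HTMLCanvasElement.toDataURL') && !excluded) {
    methods.push('Canvas');
  }

  // AudioContext: any one of the listed OfflineAudioContext members.
  if (any(audio + 'createOscillator', audio + 'createDynamicsCompressor',
          audio + 'destination', audio + 'startRendering', audio + 'oncomplete')) {
    methods.push('AudioContext');
  }

  // WebRTC: (createDataChannel | createOffer) & (onicecandidate | localDescription)
  if (any(rtc + 'createDataChannel', rtc + 'createOffer') &&
      any(rtc + 'onicecandidate', rtc + 'localDescription')) {
    methods.push('WebRTC');
  }

  return methods;
}
```

For example, a script observed using `CanvasRenderingContext2D.fillText` and `HTMLCanvasElement.toDataURL` would be labeled `Canvas`, but the label is dropped if `CanvasRenderingContext2D.save` also appears.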

The other fingerprinting analysis is to look at the prevalence of known FP APIs, for which relying on APIs used by FPJS2 makes sense.

bstandaert-wustl commented 3 months ago

> Or is the goal to extend the detections by Disconnect by using the API presence threshold as a heuristic?

Yes. The Disconnect list only covers known fingerprinters; this would let us find unknown ones as well.

> The other fingerprinting analysis is to look at the prevalence of known FP APIs, for which relying on APIs used by FPJS2 makes sense.

The idea is to extend this and ask which scripts are using these APIs for fingerprinting and how prevalent they are, similar to how we identify tracking network prevalence. Many usages of these APIs serve legitimate, non-fingerprinting purposes, but a script that uses many of them across different categories is more likely to be fingerprinting, which is where the threshold comes in.

The four you list make sense as a separate analysis, but don't cover the many other methods used by FPJS.
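As an illustration of the prevalence analysis described above, the following sketch counts on how many sites each script host shows up in `likelyFingerprintingScripts`. The `scriptHostPrevalence` name and the `metricsBySite` input shape (page URL mapped to that page's `fingerprinting` object) are assumptions made for the example; the real analysis would more likely be a query over the HTTP Archive dataset.

```javascript
// metricsBySite: hypothetical map of page URL -> the "fingerprinting" object
// from the _privacy custom metric for that page.
function scriptHostPrevalence(metricsBySite) {
  const sitesByHost = new Map();

  for (const fingerprinting of Object.values(metricsBySite)) {
    // Deduplicate hosts per site so a host counts once per site, not once per script.
    const hosts = new Set(
      fingerprinting.likelyFingerprintingScripts.map((url) => new URL(url).hostname)
    );
    for (const host of hosts) {
      sitesByHost.set(host, (sitesByHost.get(host) || 0) + 1);
    }
  }

  // Most prevalent hosts first; ties broken alphabetically for stable output.
  return [...sitesByHost.entries()].sort(
    (a, b) => b[1] - a[1] || a[0].localeCompare(b[0])
  );
}
```

Grouping by hostname rather than full URL is a choice made here so that versioned or cache-busted script URLs from the same provider are counted together.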

bstandaert-wustl commented 3 months ago

@SaptakS Nurullah mentioned you were managing PRs - are you able to do a review on this?

tunetheweb commented 3 months ago

Can you resolve the merge conflicts, @bstandaert-wustl?

bstandaert-wustl commented 3 months ago

Done.

tunetheweb commented 3 months ago

The latest tests show _null for both?

tunetheweb commented 3 months ago

OK, the custom metric is returning null not because of the changes in this PR, but because the existing a11y custom metric is failing for https://fingerprintjs.github.io/fingerprintjs/

@pmeenan is working on a fix.

github-actions[bot] commented 3 months ago

Custom metrics for https://almanac.httparchive.org/en/2022/

WPT test run results: http://webpagetest.httparchive.org/results.php?test=240612_PQ_T

Custom metrics for https://fingerprintjs.github.io/fingerprintjs/

WPT test run results: http://webpagetest.httparchive.org/results.php?test=240612_26_V

Changed custom metrics values:

```json
{
  "_privacy": {
    "privacy_wording_links": [],
    "iab_tcf_v1": { "present": false, "data": null, "compliant_setup": null },
    "iab_tcf_v2": { "present": false, "data": null, "compliant_setup": null },
    "iab_usp": { "present": false, "privacy_string": null },
    "navigator_doNotTrack": false,
    "navigator_globalPrivacyControl": false,
    "document_permissionsPolicy": false,
    "document_featurePolicy": false,
    "referrerPolicy": {
      "entire_document_policy": null,
      "individual_requests": null,
      "link_relations": null
    },
    "media_devices": {
      "navigator_mediaDevices_enumerateDevices": false,
      "navigator_mediaDevices_getUserMedia": false,
      "navigator_mediaDevices_getDisplayMedia": false
    },
    "geolocation": {
      "navigator_geolocation_getCurrentPosition": false,
      "navigator_geolocation_watchPosition": false
    },
    "fingerprinting": {
      "counts": {
        "getchanneldata": 1,
        "todataurl": 2,
        "prefers-contrast": 1,
        "forced-colors": 1,
        "dynamic-range": 1,
        "indexeddb": 1,
        "inverted-colors": 1,
        "navigator.oscpu": 1,
        "prefers-reduced-motion": 1,
        "prefers-reduced-transparency": 1,
        "availwidth": 1,
        "availheight": 1,
        "maxtouchpoints": 1,
        "webgl_debug_renderer_info": 1,
        "getshaderprecisionformat": 1
      },
      "likelyFingerprintingScripts": [
        "https://fingerprintjs.github.io/fingerprintjs/main.js?f495f35ba65f05e50da4"
      ]
    },
    "request_hostnames_with_cname": {},
    "ccpa_link": { "hasCCPALink": false, "CCPALinkPhrases": [] }
  }
}
```

Custom metrics for https://nytimes.com

WPT test run results: http://webpagetest.httparchive.org/results.php?test=240612_EA_W

Changed custom metrics values:

```json
{
  "_privacy": {
    "privacy_wording_links": [ { "text": "Privacy Policy" } ],
    "iab_tcf_v1": { "present": false, "data": null, "compliant_setup": null },
    "iab_tcf_v2": { "present": false, "data": null, "compliant_setup": null },
    "iab_usp": { "present": false, "privacy_string": null },
    "navigator_doNotTrack": true,
    "navigator_globalPrivacyControl": true,
    "document_permissionsPolicy": false,
    "document_featurePolicy": true,
    "referrerPolicy": {
      "entire_document_policy": null,
      "individual_requests": null,
      "link_relations": null
    },
    "media_devices": {
      "navigator_mediaDevices_enumerateDevices": false,
      "navigator_mediaDevices_getUserMedia": false,
      "navigator_mediaDevices_getDisplayMedia": false
    },
    "geolocation": {
      "navigator_geolocation_getCurrentPosition": false,
      "navigator_geolocation_watchPosition": false
    },
    "fingerprinting": {
      "counts": {
        "indexeddb": 6,
        "localstorage": 242,
        "opendatabase": 3,
        "gettimezoneoffset": 18,
        "navigator.vendor": 10,
        "todataurl": 1,
        "navigator.platform": 2,
        "prefers-reduced-motion": 9,
        "ontouchstart": 3,
        "forced-colors": 1,
        "resolvedoptions().timezone": 4,
        "maxtouchpoints": 2,
        "navigator.language": 4,
        "navigator.userlanguage": 2,
        "sessionstorage": 37,
        "screen.width": 9,
        "screen.height": 9,
        "devicememory": 1,
        "availwidth": 1,
        "availheight": 1
      },
      "likelyFingerprintingScripts": [
        "https://www.nytimes.com/",
        "https://www.datadoghq-browser-agent.com/us1/v5/datadog-rum.js",
        "https://www.nytimes.com/vi-assets/static-assets/home-2b223b859fe3ba339ee6.js",
        "https://www.nytimes.com/vi-assets/static-assets/main-0ac8d87d0abf4c43ba1a.js",
        "https://www.googletagmanager.com/gtm.js?id=GTM-P528B3&gtm_auth=tfAzqo1rYDLgYhmTnSjPqw&gtm_preview=env-130&gtm_cookies_win=x",
        "https://c.amazon-adsystem.com/aax2/apstag.js",
        "https://www.nytimes.com/ads/prebid8.25.0.js",
        "https://securepubads.g.doubleclick.net/pagead/managed/js/gpt/m202406120101/pubads_impl.js?cb=31084553",
        "https://launchpad.privacymanager.io/latest/launchpad.bundle.js",
        "https://ads.pubmatic.com/AdServer/js/user_sync.html?p=156011&s=165626&predirect=https%3A%2F%2Fs.amazon-adsystem.com%2Fecm3%3Fex%3Dpubmatic.com%26id%3DPM_UID",
        "https://eus.rubiconproject.com/usync.js",
        "https://accounts.google.com/gsi/client",
        "https://cdn.brandmetrics.com/scripts/bundle/65568.js?sid=4486dfe2-780e-4dfa-a60a-2a948887658f&toploc=www.nytimes.com",
        "https://ats-wrapper.privacymanager.io/ats-modules/087423d8-0b74-47f4-a79b-03ddad419681/ats.js",
        "https://www.google.com/recaptcha/api2/aframe"
      ]
    },
    "request_hostnames_with_cname": {
      "www.nytimes.com": [ "nytimes.map.fastly.net" ],
      "g1.nyt.com": [ "nytimes.map.fastly.net" ],
      "samizdat-graphql.nytimes.com": [ "nytimes.map.fastly.net" ],
      "a.et.nytimes.com": [ "region-selector.nyti.nytimes.com" ],
      "als-svc.nytimes.com": [ "region-selector.nyti.nytimes.com" ],
      "rumcdn.geoedge.be": [ "d1bqktvj79b0wh.cloudfront.net" ],
      "c.amazon-adsystem.com": [ "d1ykf07e75w7ss.cloudfront.net" ],
      "static01.nytimes.com": [ "nytimes.map.fastly.net" ],
      "static01.nyt.com": [ "nytimes.map.fastly.net" ],
      "aax.amazon-adsystem.com": [ "d1jvc9b8z3vcjs.cloudfront.net" ],
      "dd.nytimes.com": [ "detpi4b5pore2.cloudfront.net" ],
      "hbopenbid.pubmatic.com": [ "gob-pairb-nje1.pubmnet.com" ],
      "tlx.3lift.com": [ "us-east-tlx.3lift.com" ],
      "ib.adnxs.com": [ "ib.anycast.adnxs.com" ],
      "fastlane.rubiconproject.com": [ "tagged-by.rubiconproject.net.akadns.net" ],
      "purr.nytimes.com": [ "region-selector.nyti.nytimes.com" ],
      "a.nytimes.com": [ "region-selector.nyti.nytimes.com" ],
      "ads.pubmatic.com": [ "e6603.g.akamaiedge.net" ],
      "eus.rubiconproject.com": [ "e8960.b.akamaiedge.net" ],
      "83d8835bb9dd0ae7681fc4e317560c89.safeframe.googlesyndication.com": [ "pagead-googlehosted.l.google.com" ],
      "ads.stickyadstv.com": [ "fp4.ads.stickyadstv.com.akadns.net" ],
      "image6.pubmatic.com": [ "pugm-nje1.pubmnet.com" ],
      "static.chartbeat.com": [ "d3f7zc5bbfci5.cloudfront.net" ],
      "a1.nyt.com": [ "nytimes.map.fastly.net" ],
      "token.rubiconproject.com": [ "pixel.rubiconproject.net.akadns.net" ],
      "pr-bh.ybp.yahoo.com": [ "ds-pr-bh.ybp.gysm.yahoodns.net" ],
      "simage2.pubmatic.com": [ "pug-vac.pubmnet.com" ],
      "i6.liadm.com": [ "idaas6.cph.liveintent.com" ],
      "eb2.3lift.com": [ "us-east-eb2.3lift.com" ],
      "image2.pubmatic.com": [ "pug-vac.pubmnet.com" ],
      "image4.pubmatic.com": [ "spug-vac.pubmnet.com" ],
      "image8.pubmatic.com": [ "imagesync-vac.pubmnet.com" ],
      "pixel.rubiconproject.com": [ "pixel.rubiconproject.net.akadns.net" ],
      "px.ads.linkedin.com": [ "l-0005.l-msedge.net" ],
      "match.sharethrough.com": [ "match-us-east-1-ecs.sharethrough.com" ],
      "ce.lijit.com": [ "raptor-prd-ue1-alb-1693497337.us-east-1.elb.amazonaws.com" ],
      "prebid.a-mo.net": [ "dc13-prebid.a-mx.net" ],
      "5290727.fls.doubleclick.net": [ "dart.l.doubleclick.net" ],
      "simage4.pubmatic.com": [ "spug-vac.pubmnet.com" ],
      "collector.brandmetrics.com": [ "waws-prod-dm1-179-pork.centralus.cloudapp.azure.com" ],
      "match.deepintent.com": [ "m.deepintent.com" ],
      "sync-tm.everesttech.net": [ "h2.shared.global.fastly.net" ],
      "sync.bfmio.com": [ "io-cookie-sync-1725936127.us-east-1.elb.amazonaws.com" ],
      "ups.analytics.yahoo.com": [ "ats-eks.us-east-1.dcs-online-targeting-prd.aws.oath.cloud" ],
      "pm.w55c.net": [ "cdn.w55c.net" ],
      "b1sync.zemanta.com": [ "chidc2.outbrain.org" ],
      "pixel-us-east.rubiconproject.com": [ "pixel-us-east.rubiconproject.net.akadns.net" ],
      "i.liadm.com": [ "idaas-ext.cph.liveintent.com" ],
      "cms.quantserve.com": [ "global.px.quantserve.com" ],
      "x.bidswitch.net": [ "user-data-us-east.bidswitch.net" ],
      "live.primis.tech": [ "d2wcz8sc48ztgm.cloudfront.net" ],
      "pubmatic-match.dotomi.com": [ "bfp.global.dual.dotomi.weighted.com.akadns.net" ],
      "sync.technoratimedia.com": [ "adserver.technoratimedia.com" ],
      "rtb-csync.smartadserver.com": [ "rtb-csync-use1.smartadserver.com" ],
      "bh.contextweb.com": [ "lga-direct-bgp.contextweb.com" ]
    },
    "ccpa_link": { "hasCCPALink": false, "CCPALinkPhrases": [] }
  }
}
```