EFForg / privacybadgerfirefox-legacy

LEGACY Privacy Badger for Firefox SEE README

Missing some ad-related requests #5

Closed garrettr closed 10 years ago

garrettr commented 10 years ago

STR: with Privacy Badger (2ff970f88f02ca38f21309ca9bc3b81971ab3f3c), visit cnn.com. Examine console log output.

One message stands out (I'm not sure how long after loading it takes to appear):

console.log: privacybadgerfirefox: No window associated with request: http://dt.adsafeprotected.com/dt?asId=f6fd74ed-84a6-11e3-ab91-0025903700d2&tv={c:29mUo0,pingTime:-1,time:73742,type:u,fc:0,rt:0,cb:0,np:1,em:false,fr:true,slTimes:{i:20428,o:53314,n:0,pp:0,pm:0,gpp:0,gpm:0,gi:20428,go:53314,gn:0,fi:73429,fo:0,fn:313},slEvents:[{sl:i,fsl:fn,gsl:gi,t:11,wc:0.0.1200.539,ac:,am:s,cc:,piv:100,obst:na,th:0,reas:},{sl:o,fsl:fi,gsl:go,t:8543,wc:0.0.1200.539,ac:,am:s,cc:,piv:100,obst:na,th:1,reas:f},{sl:i,fsl:fi,gsl:gi,t:61857,wc:0.0.1200.539,ac:,am:s,cc:,piv:100,obst:na,th:0,reas:}],slEventCount:3,uf:0,tt:jss,fm:otPshhJ+1*.23421-1878490|11|12,dtt:37,pc:0,ov:0}&br=g&adsafePrivacyPolicy=http://integr.al/privacy-policy

Clearly advertising-related, yet it was ignored (not processed by the heuristic blocker) because it errored out of getWindowForChannel. My hunch is that this was an XHR from an advertiser's script, and that the "windowForChannel" code doesn't resolve XHRs correctly.

garrettr commented 10 years ago

Talked to Peter about this. The general problem is that nsIXMLHttpRequest objects don't QI to nsIDOMWindow the same way nsIChannel objects do. That means we can't use the same logic to determine whether they're first or third party and to extract the origin info. One solution might be to detect the type of request and use a different QI chain if it's XHR (I'm not sure if that's possible).

Another solution might be to do XHR-related updates from the nsIContentPolicy, rather than in the http-on-examine-response observer. In that context, it is easy to determine whether a request is XHR (aContentType == nsIContentPolicy.TYPE_XMLHTTPREQUEST) and whether it is 3rd party (getBaseDomain(aContentLocation) != getBaseDomain(aRequestOrigin)). The only problem is that I'm not sure how to access any related cookies. In the original example, the contents of the request itself are relevant as well.
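
A minimal sketch of that content-policy check, written as plain JavaScript so it can run outside Firefox. Everything here is an assumption for illustration: `TYPE_XMLHTTPREQUEST` is a local stand-in for the nsIContentPolicy constant, and `naiveBaseDomain` is a hypothetical helper that just keeps the last two host labels (real code would need the Public Suffix List, e.g. via the effective TLD service, to handle hosts like `example.co.uk`).

```javascript
// Local stand-in for nsIContentPolicy.TYPE_XMLHTTPREQUEST (assumed value),
// defined here so the sketch does not depend on Components.interfaces.
const TYPE_XMLHTTPREQUEST = 11;

// Hypothetical helper: approximate the base domain by keeping the last two
// labels of the hostname. This is wrong for multi-label public suffixes.
function naiveBaseDomain(url) {
  const host = new URL(url).hostname;
  return host.split(".").slice(-2).join(".");
}

// Returns true when the request is an XHR whose base domain differs from
// the base domain of the page that issued it.
function isThirdPartyXHR(contentType, contentLocation, requestOrigin) {
  if (contentType !== TYPE_XMLHTTPREQUEST) {
    return false;
  }
  return naiveBaseDomain(contentLocation) !== naiveBaseDomain(requestOrigin);
}
```

In a real shouldLoad implementation the three arguments would correspond to aContentType, aContentLocation.spec, and aRequestOrigin.spec. This still leaves the cookie question open, since no channel is available at that point.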

I've run into this issue before, while working on logging blocked cross-site XHR for Firefox. I've talked to bz about it and IIRC there's a way to get the window from an XMLHttpRequest, although he did not encourage me to use it.

garrettr commented 10 years ago

On the other hand, it might be ok to ignore these, assuming that our heuristics eventually block the loading of the 3rd party scripts entirely. These scripts must be storing local state somewhere (either in page cookies or localStorage), so as long as we can detect that then we might be ok.

garrettr commented 10 years ago

Here's another example (from mozilla.org):

console.log: privacybadgerfirefox: No window associated with request: https://246059135.log.optimizely.com/event?a=246059135&d=25134714&y=false&s245617832=none&s245875585=direct&s245677587=ff&s246048108=false&s237061344=none&s237321400=ff&s237335298=search&s237485170=false&n=https%3A%2F%2Fwww.mozilla.org%2Fen-US%2Fabout%2F&u=oeu1390506273861r0.06090165024070571&wxhr=true&t=1390533849524&f=543162200,553600251

garrettr commented 10 years ago

Looks like web fonts also have this problem:

console.log: privacybadgerfirefox: No window associated with request: https://mozorg.cdn.mozilla.net/media/fonts/OpenSans-Bold-webfont.woff?2013

garrettr commented 10 years ago

More examples:

A tracking beacon (same domain as the original example)?

console.log: privacybadgerfirefox: No window associated with request: http://static.adsafeprotected.com/detector3.pix

The Rubicon Project tracks you with buggy JS (from theguardian.com):

JavaScript strict warning: http://ads.rubiconproject.com/ad/7845.js, line 1: anonymous function does not always return a value
JavaScript strict warning: http://ads.rubiconproject.com/ad/7845.js, line 1: test for equality (==) mistyped as assignment (=)?
JavaScript strict warning: http://ads.rubiconproject.com/ad/7845.js, line 1: reference to undefined property this.context.rp_tracking
JavaScript strict warning: http://b.scorecardresearch.com/beacon.js, line 1: assignment to undeclared variable ns_p
JavaScript strict warning: http://b.scorecardresearch.com/beacon.js, line 1: assignment to undeclared variable COMSCORE
JavaScript strict warning: http://tap2-cdn.rubiconproject.com/partner/scripts/rubicon/emily.html?rtb_ext=1&pc=7845/13215&geo=na&co=us, line 196: function shuffle does not always return a value
JavaScript strict warning: http://tap2-cdn.rubiconproject.com/partner/scripts/rubicon/emily.html?rtb_ext=1&pc=7845/13215&geo=na&co=us, line 236: function doPixels does not always return a value
JavaScript strict warning: http://tap2-cdn.rubiconproject.com/partner/scripts/rubicon/emily.html?rtb_ext=1&pc=7845/13215&geo=na&co=us, line 221: function doPixels does not always return a value
JavaScript strict warning: http://tap2-cdn.rubiconproject.com/partner/scripts/rubicon/emily.html?rtb_ext=1&pc=7845/13215&geo=na&co=us, line 258: reference to undefined property expiration_info[nid]
JavaScript strict warning: http://tap2-cdn.rubiconproject.com/partner/scripts/rubicon/emily.html?rtb_ext=1&pc=7845/13215&geo=na&co=us, line 212: assignment to undeclared variable e
JavaScript strict warning: http://tap2-cdn.rubiconproject.com/partner/scripts/rubicon/emily.html?rtb_ext=1&pc=7845/13215&geo=na&co=us, line 164: reference to undefined property pixel.info.sample
console.log: privacybadgerfirefox: No window associated with request: http://assets.rubiconproject.com/campaigns/100/11/64/32/1338569818Guardian_TheWholePic_728X90_Protest.swf?clickTAG=http%3A%2F%2Foptimized-by.rubiconproject.com%2Ft%2F7845%2F13215%2F25061-2.3317248.3372862%3Furl%3Dhttp%3A%2F%2Fwww.guardian.co.uk%2Fcommentisfree%2Fus-edition&clickTag=http%3A%2F%2Foptimized-by.rubiconproject.com%2Ft%2F7845%2F13215%2F25061-2.3317248.3372862%3Furl%3Dhttp%3A%2F%2Fwww.guardian.co.uk%2Fcommentisfree%2Fus-edition

More (very likely) XHR:

console.log: privacybadgerfirefox: No window associated with request: http://api.nextgen.guardianapps.co.uk/most-read/us.json?_edition=us

Looks like objects (plugins) might have this problem too:

console.log: privacybadgerfirefox: No window associated with request: http://z.cdn.turner.com/xslo/cvp/assets/container/

garrettr commented 10 years ago

Ok, so far the 3 categories of things that will probably need special handling are:

  1. XHR
  2. Favicons
  3. Web fonts

garrettr commented 10 years ago

Re-running the STR today (on dev/yan), I only see errors for favicons, so I'm going to focus on those first.

monicachew commented 10 years ago

Why can't this be done in nsIContentPolicy for everything?

garrettr commented 10 years ago

Why can't this be done in nsIContentPolicy for everything?

That is an option I am considering.

garrettr commented 10 years ago

Going to revisit this once the updating/blocking architecture has settled.

Why can't this be done in nsIContentPolicy for everything?

We don't get a channel in shouldLoad, so it might be harder to get the request and look at the cookies. I'll see if I can reliably QI to one (maybe from aContext?).

This does make a lot of sense because otherwise we duplicate a bunch of preliminary work, i.e. the checks to see if the request should be considered at all.

diracdeltas commented 10 years ago

I'm seeing a lot of advertising-related Flash files (.swf) that have this problem:

console.log: privacybadgerfirefox: No window associated with request: https://static.doubleclick.net/3464050/HSBC_BAU_300x250.swf

diracdeltas commented 10 years ago

It turns out that channel.notificationCallbacks is no longer recommended for getting the window associated with a request (https://bugzilla.mozilla.org/show_bug.cgi?id=457153#c16). Will try replacing with http://dxr.mozilla.org/mozilla-central/source/docshell/base/nsILoadContext.idl.
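
A rough sketch of what an nsILoadContext-based lookup could look like. This is an assumption about the shape of the replacement, not the code that landed: it asks the channel's callbacks (and then the load group's callbacks) for nsILoadContext and reads associatedWindow, instead of asking for nsIDOMWindow directly. `Ci` is passed in as a parameter (normally Components.interfaces) purely so the function can be exercised with mock objects outside Firefox.

```javascript
// Sketch: find the window for a channel through nsILoadContext.
// `Ci` is expected to expose an `nsILoadContext` interface token.
function getWindowForChannel(channel, Ci) {
  // Candidate sources of notification callbacks, tried in order.
  const sources = [
    () => channel.notificationCallbacks,
    () => channel.loadGroup.notificationCallbacks,
  ];
  for (const getCallbacks of sources) {
    try {
      const loadContext = getCallbacks().getInterface(Ci.nsILoadContext);
      if (loadContext && loadContext.associatedWindow) {
        return loadContext.associatedWindow;
      }
    } catch (e) {
      // Missing callbacks, or no load context available; try the next source.
    }
  }
  // Nothing resolved; the caller has to treat the request as window-less.
  return null;
}
```

Requests that return null here (favicons, for example) would still need separate handling.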