arkenfox / user.js

Firefox privacy, security and anti-tracking: a comprehensive user.js template for configuration and hardening
MIT License

Request Control Filters #149

Closed crssi closed 7 years ago

crssi commented 7 years ago

This is Request Control | GitHub This is our Request Control Filters Wiki

Post your Request Control Filter-fu here

Note: Examples below are for discussion / testing - use/test at your leisure, post feedback. As really cool filters emerge, we will put them in the wiki

crssi commented 7 years ago

NOTE: ga_ stuff might need the XHR type as well.

Replacement for the "Pure URL" extension:

Pattern: Any URL
Types: Document, Embedded document (not sure for second one)
Action: Filter
Filter URL: Off
Trim URL Parameters: ref_, utm_source, utm_medium, utm_term, utm_content, utm_campaign, utm_reader, utm_place, ga_source, ga_medium, ga_term, ga_content, ga_campaign, ga_place, yclid, _openstat, fb_action_ids, fb_action_types, fb_ref, fb_source, action_object_map, action_type_map, action_ref_map, ws_ab_test, btsid, algo_expid, algo_pvid, sid, utm_name, utm_cid, utm_reader, utm_viz_id, utm_pubreferrer, utm_swu, icid, _hsenc, _hsmi, mkt_tok, sr_share, vero_conv, vero_id, nr_email_referer, ncid

Samples: http://bigpicture.ru/?p=431513&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+bigpictures+%28%D0%9D%D0%9E%D0%92%D0%9E%D0%A1%D0%A2%D0%98+%D0%92+%D0%A4%D0%9E%D0%A2%D0%9E%D0%93%D0%A0%D0%90%D0%A4%D0%98%D0%AF%D0%A5%29

UPDATE: Added @Atavic suggested "Trim URL Parameters"
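For testing the parameter list outside the browser, here is a minimal Python sketch of what "Trim URL Parameters" appears to do. It assumes exact name matching, which may differ from how Request Control matches names internally. `trim_params` and `TRACKING_PARAMS` are names made up for illustration, and only a subset of the list above is included.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Subset of the "Trim URL Parameters" list above (illustration only).
TRACKING_PARAMS = {
    "ref_", "utm_source", "utm_medium", "utm_term",
    "utm_content", "utm_campaign", "yclid", "fb_ref",
}

def trim_params(url, names=TRACKING_PARAMS):
    """Return url with every query parameter whose name is in `names` removed.

    Assumption: names are compared exactly. Remaining parameters are
    re-encoded by urlencode, so their escaping may be normalized.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in names
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))

if __name__ == "__main__":
    print(trim_params(
        "http://bigpicture.ru/?p=431513&utm_source=feedburner"
        "&utm_medium=feed&utm_campaign=Feed"
    ))
```

The `p` parameter survives because it is not in the set; everything else in that shortened sample is dropped.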

crssi commented 7 years ago

Redirect to URL without REF tracking:

Pattern: Any URL
Types: Document, Embedded document (not sure for second one)
Action: Redirect
Redirect To: {href/(.?)ref=./$1} {href/(.*\/)ref=.*/$1}

Samples: https://www.amazon.com/AmazonBasics-Type-C-USB-Male-Cable/dp/B01GGKYQ02/ref=sr_1_1?s=amazonbasics&srs=10112675011&ie=UTF8&qid=1489067885&sr=8-1&keywords=usb-c
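To see what the `{href/(.*\/)ref=.*/$1}` pattern does to a URL, here is a rough Python equivalent. It assumes the `{href/pattern/replacement}` syntax behaves like a plain regex substitution on the full URL, and it uses Python's `re` module in place of the add-on's own regex engine. `strip_ref` is a made-up name.

```python
import re

# Same regex as in the Redirect To field: keep everything up to and
# including the slash right before "ref=", drop the rest.
REF_PATTERN = re.compile(r"(.*/)ref=.*")

def strip_ref(url):
    """Cut a URL at the last '/ref=' segment; return it unchanged if there is none."""
    return REF_PATTERN.sub(r"\1", url)

if __name__ == "__main__":
    print(strip_ref(
        "https://www.amazon.com/AmazonBasics-Type-C-USB-Male-Cable/dp/"
        "B01GGKYQ02/ref=sr_1_1?s=amazonbasics&sr=8-1&keywords=usb-c"
    ))
```

Because the pattern requires a slash directly before `ref=`, a URL such as `http://foo.bar/redirect?href=bar.foo/index.html` is left alone.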

crssi commented 7 years ago

Remove the crap and possible tracking over URL manipulation after images and CSS:

Pattern scheme: http/https
Pattern host:
Pattern path: *.jpg?*, *.gif?*, *.png?*, *.svg?*, ~~*.css?*~~
Types: Document, Embedded document (not sure for second one), Stylesheet, Image
Action: Filter
Filter URL: Off
Trim URL Parameters: Trim all

UPDATE: Added Pattern path: *.css?* Stylesheet breaks this page https://technet.microsoft.com/en-us/library/2009.07.cableguy.aspx

Added Types: Document, Embedded document

Samples: http://www.24ur.com/

Sample that doesn't work, but it should... don't know why, yet: UPDATE: It works now. https://thechive.files.wordpress.com/2017/06/dd0523324267789bb8beecc5d7914970.jpg?quality=85&strip=info&w=600

BRAINSTORMING: maybe the same method could also be applied to the Font and Media types... will see.
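A quick way to check the effect of "Trim all" on image URLs outside the browser is a sketch like the one below. It approximates the `*.jpg?*` style path patterns with a simple suffix check on the path, which is an assumption and not how Request Control matches. `trim_all_params` and `IMAGE_SUFFIXES` are made-up names.

```python
from urllib.parse import urlsplit, urlunsplit

# Extensions from the Pattern path above (the struck-out *.css?* is left out).
IMAGE_SUFFIXES = (".jpg", ".gif", ".png", ".svg")

def trim_all_params(url):
    """Drop the whole query string when the path ends in an image extension."""
    parts = urlsplit(url)
    if parts.query and parts.path.endswith(IMAGE_SUFFIXES):
        return urlunsplit(parts._replace(query=""))
    return url

if __name__ == "__main__":
    print(trim_all_params(
        "https://thechive.files.wordpress.com/2017/06/"
        "dd0523324267789bb8beecc5d7914970.jpg?quality=85&strip=info&w=600"
    ))
```

Sites that use the query string to select an image size or format will break with this kind of rule, so test before relying on it.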

crssi commented 7 years ago

UPDATE: Use "Skip Redirect" from AMO... it's way better.

Facebook redirect without visiting Facebook:

Pattern scheme: http/https
Pattern host: l.facebook.com
Pattern path: *u=*
Types: Document, Embedded document (not sure for second one)
Action: Filter
Filter URL: On

Sample: https://l.facebook.com/l.php?u=https%3A%2F%2Fwww.fsf.org%2Fcampaigns%2F&h=ATP1kf98S0FxqErjoW8VmdSllIp4veuH2_m1jl69sEEeLzUXbkNXrVnzRMp65r5vf21LJGTgJwR2b66m97zYJoXx951n-pr4ruS1osMvT2c9ITsplpPU37RlSqJsSgba&s=1
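The idea behind this rule (jump straight to the target embedded in the `u` parameter) can be sketched in Python as follows. This is only an illustration of the concept, not Request Control's actual "Filter URL" logic, and `skip_facebook_redirect` is a made-up name.

```python
from urllib.parse import urlsplit, parse_qs

def skip_facebook_redirect(url):
    """Return the URL embedded in the 'u' parameter of an l.facebook.com link.

    Any other URL, or a link without a 'u' parameter, is returned unchanged.
    """
    parts = urlsplit(url)
    if parts.hostname != "l.facebook.com":
        return url
    targets = parse_qs(parts.query).get("u")  # parse_qs also URL-decodes the value
    return targets[0] if targets else url

if __name__ == "__main__":
    # The 'h' value is shortened here; it is not needed to find the target.
    print(skip_facebook_redirect(
        "https://l.facebook.com/l.php?u=https%3A%2F%2Fwww.fsf.org%2Fcampaigns%2F&h=abc&s=1"
    ))
```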

crssi commented 7 years ago

Funny... I have to look into it too. Anyway, I have updated it for CSS too... see the change above. One site is particularly full of this crap to test... see the updated post.

crssi commented 7 years ago

Hah... resolved, done. Have phun

crssi commented 7 years ago

UPDATE: import/export is done. The author is working now on import/export. ;)

@Thorin-Oakenpants - if you wish to clean the thread, you can delete my last 3 posts (including this one) and your "didn't work" too, since it's resolved.

Cheers

Atavic commented 7 years ago

Here are some Google Urchin Tracking Module (UTM) parameters that aren't listed in the Trim URL Parameters list:

utm_name
utm_cid
utm_reader
utm_viz_id
utm_pubreferrer
utm_swu

Also, url-tracking-stripper has a few more for other trackers:

ICID
icid
_hsenc
_hsmi
mkt_tok
sr_share
vero_conv
vero_id
nr_email_referer
ncid

earthlng commented 7 years ago

Thanks @crssi !! :+1:

a couple nits if you don't mind...

to give you an example, this is what I came up with for a bing-redirect: {search/^\?.*&?q=(.+?)(&.*|#.*|$)/https://www.ixquick.eu/do/dsearch?query=$1&cat=web&pl=opensearch&language=english} - it covers everything I could think of, namely:

and even then, I'm sure I missed something - regexes are almost impossible to get right

another one from my small collection:

crssi commented 7 years ago

You are right. Need to rethink the REF one, it was fishy to me also and I don't like it at all.... will put an "Under construction" note on that post.

The ga_ stuff: should we separate it into another filter extended with XHR, or just enable XHR on the current one? I will enable it here now and see if I get any breakage... will also put a note in that post.

I have another "nasty" one, which could replace the "Skip Redirect" extension, but it is also fishy... for now I am using Redirector, since I can use regex (not that I like it, but I can't see a better way).

I can place the nasty one in another post together with the Redirector approach for your critique?

earthlng commented 7 years ago

Have you installed and looked at Redirector yet?

nope

Does Request Control offer anything that Redirector doesn't or vice versa?

IDK, you tell me :) I'll try it someday. My way too complex bing regex is probably not necessary. It looks like the addon strips additional parameters automatically. I'll try to come up with a simpler solution

earthlng commented 7 years ago

I already edited my post dude - chillax mate :) I'll probably end up using both

crssi commented 7 years ago

Redirector has some problems: when clicking a URL from an external program it doesn't always work as intended. Another thing is that Redirector parses the input with only one filter (just my observation) and doesn't pass the result through the filters again until nothing is left to filter (sometimes you want to filter out more than one thing). On the other hand it also has exclusions, which is great, and better string manipulation via regex. Both are very good.

crssi commented 7 years ago

@Thorin-Oakenpants do you have any sample for ga_ stuff, so I can dig in? After tomorrow I will be offline for about 5 days.

crssi commented 7 years ago

Updated RedirectTo

does not cover @Thorin-Oakenpants sample: http://foo.bar/redirect?href=bar.foo/index.html

crssi commented 7 years ago

UPDATE: Use "Skip Redirect" from AMO... it's way better.

Now a nasty one... trying to mimic part of "Skip Redirect".

NOTE: this one can and will cause some login breakage... for tests and brainstorming only. It does not interfere with any of the rules posted on this page.

It consists of 2 filter rules. The first one does the skipping and the second one whitelists login pages to avoid breakage.

Pattern: Any URL
Types: Document
Action: Filter
Filter URL: On

Pattern scheme: http/https
Pattern host:
Pattern path: *signin*, *signout*, *login*, *logout*, *logon*, *logoff*, *auth*, *account*, *eBayISAPI.dll*, *ServiceLogin*, *ServiceLogout*, *AccountChooser* (maybe you can also add *option*, but I don't like it)
Types: Document
Action: Whitelist

Samples: Working redirect https://outgoing.prod.mozaws.net/v1/b928e4237edbbdd2646a3971d2e6b514aee033c10f3f4c49415bf93096405f38/http%3A//www.google.com/chrome/%3Fi-would-rather-use-firefox=http%253A%252F%252Fwww.mozilla.org/

www.google.com/chrome/?i-would-rather-use-firefox=http%3A%2F%2Fwww.mozilla.org/

Working login pages https://signin.ebay.de/ws/eBayISAPI.dll?SignIn&UsingSSL=1&pUserId=&co_partnerId=2&siteid=77&ru=http%3A%2F%2Fmy.ebay.de%2Fws%2FeBayISAPI.dll%3FMyEbayBeta%26MyEbay%3D%26gbh%3D1%26guest%3D1&pageType=3984

https://www.amazon.com/ap/signin?_encoding=UTF8&openid.assoc_handle=usflex&openid.claimed_id=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.identity=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.mode=checkid_setup&openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0&openid.ns.pape=http%3A%2F%2Fspecs.openid.net%2Fextensions%2Fpape%2F1.0&openid.pape.max_auth_age=0&openid.returnto=https%3A%2F%2Fwww.amazon.com%2Fgp%2Fyourstore%2Fcard%3Fie%3DUTF8%26ref%3Dcust_rec_intestitial_signin

https://manage.autodesk.com/service/authentication/login?returnUrl=https%3A%2F%2Fmanage.autodesk.com%2Fcep%2F

https://account.xiaomi.com/pass/serviceLogin?callback=http%3A%2F%2Fglobal.mi.com%2Fen%2Flogin%2Fcallback%3Ffollowup%3Dhttp%253A%252F%252Fwww.mi.com%252Fen%252F%26sign%3DZDk4YmY0MTRkMThmODExYTE0MDljYWRmYzczZmNjOGZjNDAzNzU0Mg%2C%2C&sid=mi_overseaen&_locale=en

https://login.live.com/login.srf?wa=wsignin1.0&rpsnv=13&ct=1495461484&rver=6.7.6643.0&wp=MBI_SSL_SHARED&wreply=https:%2F%2Fmail.live.com%2Fdefault.aspx%26lc%3D1033%26id%3D64855%26mkt%3Den-us%26cbcxt%3Dmai%26lc%3D1033%26id%3D64855%26mkt%3Den-us%26cbcxt%3Dmai%26lc%3D1033%26id%3D64855%26mkt%3Den-us%26cbcxt%3Dmai%26lc%3D1033%26id%3D64855%26mkt%3Den-us%26cbcxt%3Dmai&lc=1033&id=64855&mkt=en-us&cbcxt=mai

Currently broken login pages https://login.leagueoflegends.com/?region=eune&lang=en_PL&redirect_uri=http%3A%2F%2Feune.leagueoflegends.com%2F

I can't figure out how to match this one since "login" is part of the hostname and not the path, and Request Control doesn't cover this... here Redirector is better.

crssi commented 7 years ago

Crap... at least "path" definition is case sensitive. I have placed an issue about that.

UPDATE: Case sensitive "bug" is actually "deal breaker" until corrected. Shame, started to love this one. :(
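To illustrate why case sensitivity matters for the whitelist patterns, here is a small Python sketch that uses `fnmatch` as a stand-in for wildcard matching. Request Control's real matcher is not shown in this thread, so treat this purely as an illustration; both helper names are made up.

```python
from fnmatch import fnmatchcase

def matches_case_sensitive(path, pattern):
    """Wildcard match that distinguishes upper and lower case."""
    return fnmatchcase(path, pattern)

def matches_ignoring_case(path, pattern):
    """Same match after lowercasing both sides."""
    return fnmatchcase(path.lower(), pattern.lower())

if __name__ == "__main__":
    path = "/pass/serviceLogin"    # path part of the xiaomi sample above
    pattern = "*ServiceLogin*"     # one of the whitelist patterns
    print(matches_case_sensitive(path, pattern))
    print(matches_ignoring_case(path, pattern))
```

With case-sensitive matching, every capitalization variant of a login path needs its own pattern, which is what makes the whitelist so fragile.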

Atavic commented 7 years ago

:-1: Redirector always loads the unmodified address.

Maybe we should add Request Control to Appendix B: Firefox-Add-ons?

crssi commented 7 years ago

^^ Ahh, this now perfectly explains my observation about Redirector and what I didn't like about it. But how, then, do uBO, uMatrix and Request Control deal with those cases perfectly?

Atavic commented 7 years ago

URL path definition is case sensitive

The scheme and host of a URL (the http://example.com part of http://example.com/data/) are not case sensitive, as per RFC 3986.

Everything that comes after (the path and query) can be case sensitive, depending on web server settings, OS...

Everything after the RFC Standard is unclear.

earthlng commented 7 years ago

@crssi wrote:

do you have any sample for ga_ stuff

any site that uses GoogleAnalytics, fe ghacks. GA should already be blocked by uBO or uMatrix or whatnot and I'm not sure we really need a filter for that stuff.

earthlng wrote

My way too complex bing regex is probably not necessary. It looks like the addon strips additional parameters automatically

Yeah, it doesn't, at least not for "Action: Redirect". Redirector doesn't make it easier either. But I realized I don't need to account for anchors because they are not part of the "search" parameter, which results in this slightly adjusted bing-redirect:

... pick whichever format you prefer.

edit: fuck! the ^\?.*&?q= part is flawed, for example: ?testq=wrong without a q=right, however it works for ?testq=disney&q=porn but not for ?q=porn&testq=disney, ie if you want porn instead of disney :) got it! (^\?|.*&)q=
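The difference between the flawed prefix `^\?.*&?q=` and the corrected `(^\?|.*&)q=` can be checked with a short Python sketch. It uses Python's `re` module rather than the add-on's own regex engine (the two behave the same for these patterns, as far as I can tell), with neutral sample values. `extract` is a made-up helper. Note that the query lands in group 1 for the flawed pattern and in group 2 for the fixed one.

```python
import re

# Patterns are applied to the "search" part of the URL (the ?... portion).
FLAWED = re.compile(r"^\?.*&?q=(.+?)(&.*|#.*|$)")
FIXED = re.compile(r"(^\?|.*&)q=(.+?)(&.*|#.*|$)")

def extract(pattern, search, group):
    """Return the requested capture group, or None when nothing matches."""
    match = pattern.search(search)
    return match.group(group) if match else None

if __name__ == "__main__":
    for search in ("?testq=wrong", "?testq=wrong&q=right", "?q=right&testq=wrong"):
        print(search, extract(FLAWED, search, 1), extract(FIXED, search, 2))
```

The flawed pattern accepts any parameter name that merely ends in `q`, and its greedy `.*` picks the last such parameter; the fixed one requires `q=` to follow either the leading `?` or an `&`.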

edit2: for Redirector: https?://www.bing.com/search(\?|.*&)q=(.+?)(&.*|#.*|$) - this one "needs" the anchor part again! Redirect to: https://www.ixquick.com/do/dsearch?query=$2&cat=web&pl=opensearch&language=english
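For anyone trying to break the Redirector version, a Python sketch like this makes it easy to throw test URLs at it. It assumes Redirector's `$2` corresponds to the second capture group in a plain regex substitution, and it uses Python's `re` instead of the add-on's engine. `bing_to_ixquick` is a made-up name, and the sample URLs are invented.

```python
import re

# Include pattern and redirect target exactly as given for Redirector.
BING = re.compile(r"https?://www.bing.com/search(\?|.*&)q=(.+?)(&.*|#.*|$)")
TARGET = r"https://www.ixquick.com/do/dsearch?query=\g<2>&cat=web&pl=opensearch&language=english"

def bing_to_ixquick(url):
    """Rewrite a Bing search URL to the ixquick equivalent; leave other URLs alone."""
    return BING.sub(TARGET, url)

if __name__ == "__main__":
    print(bing_to_ixquick("https://www.bing.com/search?q=firefox&form=QBLH"))
    print(bing_to_ixquick("https://www.bing.com/search?q=firefox#top"))
```

The second sample shows why the `#.*` alternative is needed here: Redirector sees the whole URL, so without it the anchor would end up inside the query.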

please try to break this one y'all ;)

earthlng commented 7 years ago

We need some really generic ones, but I have no idea how bad the breakage could get.

I would stay away from generic ones if you care about efficiency and no breakage.

Does Request Control offer anything that Redirector doesn't or vice versa?

Redirector

Redirector's author said the addon is in "maintenance mode" and he won't add new features. RC's author still seems to be very actively maintaining the addon and is open to add new features.

RC does some "magic" in the background which is great (and seems to work) but not very transparent. fe. it strips additional parameters (&xyz=foo&bar=etc) with "Filter" rules and does urldecoding if necessary. I'd personally prefer a bit more control over this, especially the encoding/decoding stuff. Maybe we can lobby him/her to add that (with a default "auto" option for the current behavior)

edit: Redirector allows/requires to order the rules but idk how useful that is. It probably makes it more complicated if you have a lot of rules

GitCurious commented 7 years ago
  1. https://github.com/ghacksuserjs/ghacks-user.js/issues/149#issuecomment-310175387

This appears to be breaking some internal Amazon links (such as "Warehouse Deals" for example), presumably it's the ref= part.

  1. https://github.com/ghacksuserjs/ghacks-user.js/issues/149#issuecomment-310431067

With regard to this, how exactly are these "2" filter rules applied? Separately? Must they be in a specific order?

This stuff gives me a headache at the best of times - intriguing stuff though !

Forsaked commented 7 years ago

OT: @Thorin-Oakenpants Good to know, looks like im not John Cena!

earthlng commented 7 years ago

FYI atm there's a bug that a redirected Document still gets added to the history. However the request doesn't touch the network. (that was fixed in 54 and also backported to ESR here) re: https://www.reddit.com/r/firefox/comments/6j6w0v/how_to_automatically_clean_up_urls_by_removing/djc7w8v/

southwindcg commented 7 years ago

The *.jpg?*, *.png?* ... image filters break images on www.kickstarter.com.

Atavic commented 7 years ago

Try whitelisting https://ksr-ugc.imgix.net/

crssi commented 7 years ago

Will keep an eye on this addon, since it has a lot of potential, but will hold off until the case insensitivity issue is sorted out.

crssi commented 7 years ago

@GitCurious REF is tricky and WILL cause breakage. Rules have no order in RC. The first rule takes the last URL in the URL (damn if this makes sense :)), and the second skips this action when you hit some login pages.

crssi commented 7 years ago

There are also some other issues. I have placed an Issue on RC github.

crssi commented 7 years ago

Sure you can. It's just JSON. This is the same: [{"pattern":{"scheme":"*","host":["www.imdb.com"],"path":["title/*","name/*","character/*"]},"types":["main_frame"],"action":"filter","active":true,"paramsFilter":{"values":["ref_"],"pattern":"ref_"},"skipRedirectionFilter":true}]
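Since an exported rule is plain JSON, it can be inspected with a few lines of Python before importing it. The sketch below only reads the fields visible in the rule above; it makes no claim about other fields the export format may have, and `summarize` is a made-up helper.

```python
import json

# The rule from the post above, verbatim.
RULE_JSON = (
    '[{"pattern":{"scheme":"*","host":["www.imdb.com"],'
    '"path":["title/*","name/*","character/*"]},"types":["main_frame"],'
    '"action":"filter","active":true,'
    '"paramsFilter":{"values":["ref_"],"pattern":"ref_"},'
    '"skipRedirectionFilter":true}]'
)

def summarize(rule):
    """Return (action, hosts, trimmed parameter names) for one exported rule."""
    hosts = rule["pattern"]["host"]
    params = rule.get("paramsFilter", {}).get("values", [])
    return rule["action"], hosts, params

if __name__ == "__main__":
    for rule in json.loads(RULE_JSON):
        print(summarize(rule))
```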

crssi commented 7 years ago

Damn cool... how do you do that?

crssi commented 7 years ago

Ah, ok, I see now... thx :)

Atavic commented 7 years ago

@crssi Google has &biw= and others, see RemoveGoogleTracking.

crssi commented 7 years ago

Thank you @Atavic. Will take a look into it in the next few days.

crssi commented 7 years ago

For redirect skipping, RC is no match for Skip Redirect (WE), especially now that the author has implemented same-domain detection... see https://github.com/sblask/webextension-skip-redirect/issues/30

I will not try to mimic the same functionality on RC anymore.

For basic tracking removal the Link Cleaner does a nice job, but RC can be used as a supplement for more advanced rules or for additional tracking options removal that are not covered in LC.

Atavic commented 7 years ago

@ghacksuserjs from RemoveGoogleTracking:

'biw', // offsetWidth 'bih', // offsetHeight

Apparently related to screen fingerprint.

crssi commented 7 years ago

@Atavic To be sure... should I add the following to https://github.com/ghacksuserjs/ghacks-user.js/issues/149#issuecomment-310173763: biw, bih, ei, sa, ved, source, prmd, bvm, bav, psi, stick, dq, ech, gs_gbg, gs_rn, cp, scroll, vet, yv, ijn, iact, forward, ndsp, csi, tbnid, pbx, dpr, pf, gs_rn, gs_mss, pq, cp, oq, sclient, gs_l, aqs, psi?

Observation: From what I see in the source code of RemoveGoogleTracking, it's more complex than just removing those parameters.

crssi commented 7 years ago

I have redesigned some rules and dropped some, and have filed a few issues with @tumpio... especially this one. When more is known about the issue I will close this topic, open a new topic (referencing this one) and post from scratch.

crssi commented 7 years ago

@Thorin-Oakenpants If you agree I would like to close this topic, since @tumpio has not been active for almost 2 months and there is a better approach to skipping redirects and URL cleaning. I can open a new "post" in the next few days with the complete setup I am using for skip redirect and URL cleaning?

earthlng commented 7 years ago

example to skip youtu.be:

pattern:
scheme: http/https
host: youtu.be
path: *
types: Document
action: Redirect
redirect to: https://www.youtube.com/watch?v={pathname:1}

example url: https://youtu.be/nqbUkThGlCo

Request Control supports direct access to named parameters. The named parameter pathname in the example link is /nqbUkThGlCo, and {pathname:1} strips the slash using substring extraction.
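The substring extraction is the same idea as slicing a string from index 1. A small Python sketch of the equivalent rewrite, with `expand_youtu_be` as a made-up name:

```python
from urllib.parse import urlsplit

def expand_youtu_be(url):
    """Turn a youtu.be short link into the full youtube.com watch URL."""
    parts = urlsplit(url)
    if parts.hostname != "youtu.be":
        return url
    video_id = parts.path[1:]  # same idea as {pathname:1}: drop the leading slash
    return f"https://www.youtube.com/watch?v={video_id}"

if __name__ == "__main__":
    print(expand_youtu_be("https://youtu.be/nqbUkThGlCo"))
```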

securingmom commented 3 years ago

My love to all of you