ghostwords opened this issue 2 years ago (status: Open)
This is not an issue with blocking webRequest because we can write imperative code that correctly decodes URL components. For example:
let url = new URL(details.url).searchParams.get('q');
return { redirectUrl: url };
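For context, a minimal sketch of how the fragment above might sit inside a complete blocking listener. The helper names (`extractRedirectTarget`, `onBeforeRequest`), the URL filter, and the http(s) sanity check are assumptions added for illustration, not part of the original comment; registration is guarded so the helper can also be run outside a browser.

```javascript
// Pure helper: pull the destination out of the `q` query parameter.
// URLSearchParams percent-decodes the value once.
function extractRedirectTarget(requestUrl) {
  const target = new URL(requestUrl).searchParams.get('q');
  // Assumption: only redirect to absolute http(s) destinations.
  return target && /^https?:\/\//i.test(target) ? target : null;
}

// Handler in the shape a blocking webRequest listener returns.
function onBeforeRequest(details) {
  const redirectUrl = extractRedirectTarget(details.url);
  return redirectUrl ? { redirectUrl } : {};
}

// Register only when the extension API is present.
if (typeof chrome !== 'undefined' && chrome.webRequest) {
  chrome.webRequest.onBeforeRequest.addListener(
    onBeforeRequest,
    { urls: ['https://www.google.com/url*'], types: ['main_frame'] },
    ['blocking']
  );
}
```

Keeping the decoding in a pure function means the same logic can be unit-tested without a browser and updated quickly if the redirector changes.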
As noted during today's call, while URL component encoding is the standard approach to preserving special characters in URL components, it's possible for URL redirectors (like https://www.google.com/url?q=SOMEURL) to use or to switch to a custom encoding (such as Base64, or something completely custom). A custom encoding will defeat all DNR-based URL cleaner/privacy extensions until DNR is extended to support that particular encoding. This is an example of https://github.com/w3c/webextensions/issues/151#issuecomment-1018778881.
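To illustrate the point about custom encodings, here is a rough sketch of how imperative code could handle a redirector that Base64-encodes its destination. The redirector URL shape (`https://redirector.example/out?u=...`) and the function name `decodeBase64UrlParam` are hypothetical; the sketch accepts both standard Base64 and the URL-safe variant and assumes the destination is plain ASCII.

```javascript
// Hypothetical redirector: https://redirector.example/out?u=<base64(destination)>
function decodeBase64UrlParam(requestUrl, paramName) {
  const raw = new URL(requestUrl).searchParams.get(paramName);
  if (!raw) return null;
  // URLSearchParams turns '+' into a space, so restore it, then map the
  // URL-safe alphabet back to standard Base64 and re-add padding.
  let b64 = raw.replace(/ /g, '+').replace(/-/g, '+').replace(/_/g, '/');
  while (b64.length % 4 !== 0) b64 += '=';
  try {
    // atob yields one character per byte, which is enough for ASCII URLs.
    return atob(b64);
  } catch (e) {
    return null; // not valid Base64
  }
}
```

A change like this ships with an ordinary extension update, whereas a declarative rule has no way to express the decoding step unless the API itself gains support for it.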
> it's possible for URL redirectors to use or to switch to a custom encoding

That defeatism argument is a recurring one and has often been used to rationalize not trying to defuse, one way or another, some mechanisms in the wild.
By that logic there is no point doing anything since everything can be worked around by websites. Of course we do not give up on the basis of that argument, and it works -- an approach does not have to be guaranteed to work everywhere to be useful, it has to work for enough cases.
If anything, it further shows how useful webRequest is: where one party may not be motivated to extend capabilities, solely on the basis of this defeatism argument, another will be, and the end result may very well be beneficial to end users. With MV3, the parties that are motivated to address these issues lose the ability to be proactive about it.
@gorhill, I tried to point out that this is an example of DNR inherently disadvantaging privacy and security extensions. (With webRequest, any extension can update itself quickly to respond to a change in how a redirector works. With DNR, extensions will have to first convince each browser vendor to update the DNR API.)
I did not mean to suggest there is no point in addressing the most common scenario, namely extracting, URL-decoding and redirecting to some portion of a given URL.
Sorry, I didn't have you in mind when I posted my comment. I actually didn't even see the discussion about this and made assumptions about how it went just from your comment; I should have waited to find out whether my comment applied. I will go read the discussion.
So yeah, after reading the discussion about it, there was no point to my comment, sorry. Next time I will be more careful to avoid pointless noise.
As maintainer of CleanLinks I actually am in this use case. Many other extensions (list dated 2018) rely on the same mechanisms.
I must say real-life use cases are often more complicated and can use several nested redirections, “customised” url-encoded or base64, improperly encoded URLs, embedded URLs in the path instead of the query parameters, in the hash of the URL, etc.
Some examples:
https://l.facebook.com/l.php?u=https%3A%2F%2Fwww.fsf.org%2Fcampaigns%2F&h=ATP1kf98S0FxqErjoW8VmdSllIp4veuH2_m1jl69sEEeLzUXbkNXrVnzRMp65r5vf21LJGTgJwR2b66m97zYJoXx951n-pr4ruS1osMvT2c9ITsplpPU37RlSqJsSgba&s=1
cleaned: https://www.fsf.org/campaigns/
https://forum.donanimhaber.com/externallinkredirect?url=https://www.amazon.com.tr/HP-6MQ72EA-Intel-Diz%C3%BCst%C3%BC-Bilgisayar/dp/B07PYT39WV/ref=sr_1_19?fst=as%3Aoff
cleaned: https://www.amazon.com.tr/HP-6MQ72EA-Intel-Diz%C3%BCst%C3%BC-Bilgisayar/dp/B07PYT39WV?fst=as%3Aoff
https://trackmail.alumnforce.net/?tm_u=https%253A%252F%252Fax.polytechnique.org%252F%2523%252Fgroup%252Fx-alternative%252F211%252Fcalendar%252Fconference-frederic-lordon%252F2020%252F02%252F06%252F724&tm_h=68404220dbcf676445ae9f32a208f6bc
cleaned: https://ax.polytechnique.org/#/group/x-alternative/211/calendar/conference-frederic-lordon/2020/02/06/724
https://www.bing.com/fd/ls/GLinkPing.aspx?IG=9AFD7A29FFBB46F1B9A81FF058C0640E&&ID=SERP,5206.1&url=https%3A%2F%2Fwww.bing.com%2Fck%2Fa%3F!%26%26p%3D0d3eae76a1129f5f677a93348d0d5d6ee2f5906c36c38d0cad86b467db7afa8aJmltdHM9MTY1MjkzNTY0NSZpZ3VpZD05YWZkN2EyOS1mZmJiLTQ2ZjEtYjlhOC0xZmYwNThjMDY0MGUmaW5zaWQ9NTIwNg%26ptn%3D3%26fclid%3Dc74e5d8f-d72e-11ec-a4f0-c181b1119cc1%26u%3Da1aHR0cHM6Ly93d3cueW91dHViZS5jb20vd2F0Y2g_dj1YSHp0a0ZDemJUVQ%26ntb%3D1
cleaned: https://www.youtube.com/watch?v=XHztkFCzbTU
https://www.tripadvisor.com.au/ShowUrl-a_partnerKey.1-a_url.https%3A__2F____2F__play__2E__google__2E__com__2F__store__2F__apps__2F__details__3F__id%3Dcom__2E__tripadvisor__2E__tripadvisor__26__hl%3Den__26__referrer%3Dutm__5F__download__5F__tracking%253DBrand__5F__AppPage__5F__0__5F__18034-a_urlKey.8817ea41f0fea6faa.html
cleaned: https://play.google.com/store/apps/details?id=com.tripadvisor.tripadvisor&hl=en
It seems unlikely to me that the functionality for this use case can be provided without executing a function returning the properly cleaned link.
If security is the main concern here, this function could be executed in a restricted context (for this use case at least). It probably needs some static inputs (the code to run, a set of rules), but after setup it could be prevented from making requests or otherwise communicating with anything other than the URL input/output.
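As a rough illustration of what such a function could look like, the sketch below is a pure URL-in, URL-out cleaner that uses no network or extension APIs. It only covers two of the simpler examples above (the Facebook and trackmail links); the function names, the shape of `rules`, and the bound on decoding rounds are all assumptions for illustration and not how CleanLinks is actually implemented.

```javascript
// Repeatedly percent-decode until the value looks like an absolute URL
// (covers double-encoded parameters), with a small bound on iterations.
function decodeUntilUrl(value, maxRounds = 3) {
  const looksLikeUrl = (s) => /^https?:\/\//i.test(s);
  let current = value;
  for (let i = 0; i < maxRounds && !looksLikeUrl(current); i++) {
    let next;
    try {
      next = decodeURIComponent(current);
    } catch (e) {
      break; // improperly encoded input: stop decoding
    }
    if (next === current) break; // nothing left to decode
    current = next;
  }
  return looksLikeUrl(current) ? current : null;
}

// `rules` maps a hostname to the query parameter holding the destination.
function cleanLink(url, rules) {
  const parsed = new URL(url);
  const param = rules[parsed.hostname];
  if (!param) return url;
  const embedded = parsed.searchParams.get(param);
  if (embedded === null) return url;
  return decodeUntilUrl(embedded) || url;
}

// Static rule set supplied at setup time (hypothetical format).
const rules = {
  'l.facebook.com': 'u',
  'trackmail.alumnforce.net': 'tm_u',
};
```

The remaining examples (a URL embedded in the path, a Base64 payload inside a nested redirect, TripAdvisor's custom escaping) would each need additional imperative logic, which is the point being made here.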
At a high level, I'm supportive of this use case. I think there will be potential challenges and subtlety in the API and implementation, but I think it's something worth looking into.
The Chromium bug for this issue: https://issues.chromium.org/issues/338071843
It does not appear possible to properly extract and redirect to URL-encoded components of URLs with Declarative Net Request.
For example, an extension may want to "clean" https://www.google.com/url?q= redirect URLs by extracting the value of the q parameter and issuing an internal redirect to that destination URL, in order to avoid unnecessary network requests and reduce data leakage to Google.
This is an important use case for URL cleaning extensions specifically, and privacy extensions in general.
Related to #110.
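To make the limitation concrete, here is a sketch of the kind of DNR rule one might try, written as a JavaScript object, together with a rough plain-JavaScript simulation of its capture-group substitution. The rule is an assumption for illustration and is not taken from the demo extension's rules file; `simulateRegexRedirect` is a hypothetical helper that only approximates what the browser does (DNR uses RE2 syntax, which this simple pattern shares with JavaScript).

```javascript
// Hypothetical rule: capture everything after `q=` up to the next `&`
// and redirect to the captured text as-is.
const rule = {
  id: 1,
  priority: 1,
  action: { type: 'redirect', redirect: { regexSubstitution: '\\1' } },
  condition: {
    regexFilter: '^https://www\\.google\\.com/url\\?q=([^&]*).*',
    resourceTypes: ['main_frame'],
  },
};

// Approximate the substitution: fill \1..\9 with the captured groups.
// Because the filter matches the whole URL, the result is the full redirect.
function simulateRegexRedirect(rule, url) {
  const match = new RegExp(rule.condition.regexFilter).exec(url);
  if (!match) return null;
  return rule.action.redirect.regexSubstitution.replace(
    /\\(\d)/g,
    (_, n) => match[Number(n)] || ''
  );
}

const input =
  'https://www.google.com/url?q=https://httpbin.org/anything?test1%3D%25E8%25B1%2586%25E5%25A5%25B6&sa=D';
console.log(simulateRegexRedirect(rule, input));
// prints https://httpbin.org/anything?test1%3D%25E8%25B1%2586%25E5%25A5%25B6
```

The captured text is copied verbatim, so the destination is still percent-encoded; nothing in the rule can express the decoding step that `searchParams.get('q')` performs in imperative code.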
Demo extension (zip):
manifest.json
url_cleaning_rules.json
Example inputs and outputs:
https://www.google.com/url?q=https://httpbin.org/anything?test1%3D%25E8%25B1%2586%25E5%25A5%25B6&sa=D&source=editors&ust=1666207867209386&usg=AOvVaw2aBPOGUUgM54kszb7IMQhM
expected: https://httpbin.org/anything?test1=%E8%B1%86%E5%A5%B6 actual: https://httpbin.org/anything?test1%3D%25E8%25B1%2586%25E5%25A5%25B6
https://www.google.com/url?q=https://httpbin.org/anything?test2%3Dabcd123%2B3cf%3D%3D&sa=D&source=editors&ust=1666207869017685&usg=AOvVaw3BKYCeKLY_kbPqqbuf97nm
expected: https://httpbin.org/anything?test2=abcd123+3cf== actual: https://httpbin.org/anything?test2%3Dabcd123%2B3cf%3D%3D
https://www.google.com/url?q=https://httpbin.org/anything?test3%3Dabcd123%252B3cf%253D%253D&sa=D&source=editors&ust=1666207870909835&usg=AOvVaw0BmWQOYBQUSmkBqfUbh5ix
expected: https://httpbin.org/anything?test3=abcd123%2B3cf%3D%3D actual: https://httpbin.org/anything?test3%3Dabcd123%252B3cf%253D%253D
https://www.google.com/url?q=https://httpbin.org/anything?q%3D%25B6%25B9%25C4%25CC&sa=D&source=editors&ust=1666207872595800&usg=AOvVaw07Q5JDI96UUoiXppYrxdQb
expected: https://httpbin.org/anything?q=%B6%B9%C4%CC actual: https://httpbin.org/anything?q%3D%25B6%25B9%25C4