gbaptista / luminous

Identify, analyze and block code execution and event collection through JavaScript in your browser with code interception.
https://gbaptista.github.io/luminous
GNU General Public License v3.0

Not being able to guarantee interception #55

Open ghostwords opened 6 years ago

ghostwords commented 6 years ago

This is a problem with current WebExtensions APIs: we can either guarantee injection (into a frame) before anything else will happen, or we can inject conditionally/pass configuration to the injected script, but not both. There are workarounds and promising upcoming APIs, however.
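A minimal sketch of the "inject conditionally" side of this trade-off, assuming a Manifest V2 background script that uses webNavigation.onCommitted together with tabs.executeScript (the combination mentioned later in this thread). The helper name buildInjectionDetails and the window.__interceptorSettings property are hypothetical, used only for illustration. The asynchronous storage read is exactly where the guarantee is lost: page scripts may already be running by the time the injection happens.

```javascript
// Hypothetical helper: turn per-site settings into a tabs.executeScript
// details object. The configuration travels inside the injected code.
function buildInjectionDetails(frameId, settings) {
  return {
    frameId: frameId,
    runAt: "document_start",
    matchAboutBlank: true,
    code: "window.__interceptorSettings = " + JSON.stringify(settings) + ";"
  };
}

// Only wire up the listener when a WebExtensions background context exists.
if (typeof browser !== "undefined" && browser.webNavigation) {
  browser.webNavigation.onCommitted.addListener(async (details) => {
    const host = new URL(details.url).hostname;
    if (!host) return; // e.g. about:blank has no hostname

    // Asynchronous lookup: the page may start executing its own scripts
    // before this Promise resolves, so interception is not guaranteed.
    const stored = await browser.storage.sync.get(host);
    const settings = stored[host] || {};

    await browser.tabs.executeScript(
      details.tabId,
      buildInjectionDetails(details.frameId, settings)
    );
  });
}
```

A content script declared in the manifest with run_at set to document_start avoids the delay, but it cannot receive per-site configuration synchronously, which is the other half of the problem described above.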

Workarounds:

Upcoming APIs:

gbaptista commented 6 years ago

@ghostwords This is a major concern, and I have tried several approaches to fix the problem (working with cookies, local storage and other things).

Many thanks for sharing the Workarounds and Upcoming APIs. I will read all the discussions, there is certainly some gold there that I can use to try some solution for this!

gbaptista commented 6 years ago

@ghostwords I believe I have found a solution (after reading all the discussions and references) to ensure the interception. I used the cookie idea, with the onCommitted approach as a fallback.

I wrote some notes about my experience and the result: guides/how-it-works/interception.md
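A rough sketch of the cookie idea, not the actual Luminous code: it assumes the background script appends a short-lived Set-Cookie header from a blocking webRequest.onHeadersReceived listener, and that the content script then reads document.cookie synchronously. The cookie name ext_settings, the in-memory settingsByHost cache and the helper names are all hypothetical.

```javascript
const COOKIE_NAME = "ext_settings"; // hypothetical cookie name

// Background side: serialize settings into a short-lived Set-Cookie value.
function encodeSettingsCookie(settings) {
  const value = encodeURIComponent(JSON.stringify(settings));
  return COOKIE_NAME + "=" + value + "; Path=/; Max-Age=30";
}

// Content-script side: parse settings out of a document.cookie string.
// Returns null when the cookie is missing or malformed, in which case the
// caller would fall back to another strategy (such as onCommitted).
function readSettingsCookie(cookieString) {
  for (const part of cookieString.split("; ")) {
    const eq = part.indexOf("=");
    if (eq !== -1 && part.slice(0, eq) === COOKIE_NAME) {
      try {
        return JSON.parse(decodeURIComponent(part.slice(eq + 1)));
      } catch (e) {
        return null;
      }
    }
  }
  return null;
}

// Content-script side: value to assign to document.cookie to delete it.
function expireSettingsCookie() {
  return COOKIE_NAME + "=; Path=/; Max-Age=0";
}

// Background wiring (assumed design); skipped outside an extension.
if (typeof browser !== "undefined" && browser.webRequest) {
  const settingsByHost = {}; // hypothetical cache, filled elsewhere
  browser.webRequest.onHeadersReceived.addListener(
    (details) => {
      const host = new URL(details.url).hostname;
      const settings = settingsByHost[host] || {};
      details.responseHeaders.push({
        name: "Set-Cookie",
        value: encodeSettingsCookie(settings)
      });
      return { responseHeaders: details.responseHeaders };
    },
    { urls: ["<all_urls>"], types: ["main_frame", "sub_frame"] },
    ["blocking", "responseHeaders"]
  );
}
```

In the content script, the read would look like `const settings = readSettingsCookie(document.cookie);` followed by `document.cookie = expireSettingsCookie();` so the value does not linger. Deleting it too early or too late is one way this approach can go wrong when several frames load at once.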

I'll keep an eye on the Upcoming APIs. Thanks again for sharing all this, it helped a lot to improve the project!

ghostwords commented 6 years ago

Nice, I agree that the cookie approach seems to be the only workaround that provides synchronous messaging. However, the cookie workaround comes with many downsides, some already documented in your guide. One that comes to mind is the uncertainty around multiple frames (https://github.com/snyderp/web-api-manager/issues/14#issuecomment-338842859), another is it breaking in cookie-restricted environments (https://github.com/snyderp/web-api-manager/issues/53).

I'm going to spend a little more time reviewing the options as part of https://github.com/EFForg/privacybadger/pull/1861. Current feeling is to keep complexity low and accept the injection delay, and later switch to using chrome.contentScripts.register() in Firefox and lobby Chrome for a similar API. (Shouldn't Chrome and Firefox work together to keep WebExtensions as cross-browser aligned as possible?)

gbaptista commented 6 years ago

@ghostwords An update on this subject: I created a page with random iframes and random load times to better visualize the interceptions, and I got some curious results: https://gbaptista.github.io/luminous/html/demos/interception/

I just open the page and try to block the execution of addEventListener('click'). As you already pointed out, the cookie strategy has problems: we end up deleting the cookie too early or letting the cookie leak for some reason. When the cookie fails, the results vary (in this multiple-frame scenario): sometimes storage.sync runs faster than onCommitted. This is the Luminous scenario, of course, but other extensions may see similar results.

"Current feeling is to keep complexity low and accept the injection delay."

I understand this approach and it makes sense. In the case of Luminous, though, we end up missing too many blocks because of the delay, so it is harder to accept.

"...and later switch to using chrome.contentScripts.register() in Firefox"

I will do some experiments with contentScripts.register and webRequest.filterResponseData.

"...and lobby Chrome for a similar API"

Could I help in any way in this?

"Shouldn't Chrome and Firefox work together to keep WebExtensions as cross-browser aligned as possible?"

Definitely!

Some screenshots of my experiments:

gbaptista commented 6 years ago

@ghostwords I tried both APIs (contentScripts.register and webRequest.filterResponseData). My impressions and results:

contentScripts.register

It did not work for me.

The extension may have different rules for 100 different domains. Registering all of those rules with matches may create performance problems, and it is tricky to handle this without consuming unnecessary resources. Injecting the rules of all domains into every loaded page did not seem like an efficient option either.

In my case I want to load rules inside iframes according to the domain of the main frame. I did not find a way to do this kind of check with content_scripts, and ended up loading rules only by domain.

So it may be a good choice for those who have only global rules for any domain, but that is not my case.

Warning: using chrome.contentScripts.register instead of browser.contentScripts.register apparently works, but with the chrome alias the register call does not return a Promise. I wasted a lot of time until I figured this out.
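To make the per-domain registration described above concrete, here is a hedged sketch for Firefox. The helper names buildRegistrations and registerAll, the rulesByDomain shape and the window.__interceptorRules property are hypothetical. The api argument is expected to be the Promise-based browser namespace, in line with the warning about the chrome alias.

```javascript
// Build one contentScripts.register() options object per domain.
// Note: "matches" is tested against each frame's own URL, so an iframe
// cannot be matched by the domain of its main frame this way.
function buildRegistrations(rulesByDomain) {
  return Object.keys(rulesByDomain).map((domain) => ({
    matches: ["*://" + domain + "/*"],
    allFrames: true,
    runAt: "document_start",
    js: [{
      code: "window.__interceptorRules = " +
        JSON.stringify(rulesByDomain[domain]) + ";"
    }]
  }));
}

// Register every entry and keep the returned handles so they can be
// unregistered later with handle.unregister() when the rules change.
async function registerAll(api, rulesByDomain) {
  const handles = [];
  for (const options of buildRegistrations(rulesByDomain)) {
    handles.push(await api.contentScripts.register(options));
  }
  return handles;
}

// Usage inside a Firefox background script (illustrative data only):
//   registerAll(browser, { "example.com": { block: ["click"] } });
```

With around 100 domains this creates around 100 registrations, which is the resource concern raised above.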

webRequest.filterResponseData

It worked for me, but with some workarounds that can be tricky.

You need to know exactly where to insert the code: creating a script tag before the DOCTYPE declaration, for example, will break the page. > workaround

Some sites (inbox.google.com, youtube.com, facebook.com...) request new documents and just inject what was received into the current document, which can duplicate your injected code. > workaround

It is not possible (if anyone knows how, I will be very grateful!) to know synchronously, from the perspective of a content_script, whether the browser supports filterResponseData. So deciding whether a browser will use filterResponseData or some other strategy (cookie, onCommitted...) is a problem. Also, when the first content_script is executed it cannot immediately see the code injected by filterResponseData, which makes things even harder... The workaround is to identify the browser in use and pick the strategy from that, something I'm definitely not happy about...
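The first two points can be illustrated with a small sketch, which is not the linked workarounds themselves: the script tag is placed right after the DOCTYPE declaration (if there is one), and the injected code is wrapped in a run-once guard so a duplicated copy becomes a no-op. The flag name __extInjected and the helper names are hypothetical. The stream wiring assumes UTF-8 HTML responses and buffers the whole document, which a real extension would likely need to refine (binary responses are a known problem, for instance).

```javascript
const FLAG = "__extInjected"; // hypothetical run-once flag on window

// Wrap the payload so that a second copy of the script does nothing.
function wrapOnce(scriptBody) {
  return "(function(){if(window." + FLAG + ")return;window." + FLAG +
    "=true;" + scriptBody + "})();";
}

// Insert the script tag after the DOCTYPE, or at the start if none exists.
function injectIntoHtml(html, scriptBody) {
  const tag = "<script>" + wrapOnce(scriptBody) + "</script>";
  const doctype = html.match(/^\s*<!doctype[^>]*>/i);
  const at = doctype ? doctype[0].length : 0;
  return html.slice(0, at) + tag + html.slice(at);
}

// Background wiring, only where filterResponseData exists (Firefox).
// This feature check works in the background script, not synchronously
// from a content script.
if (typeof browser !== "undefined" && browser.webRequest &&
    typeof browser.webRequest.filterResponseData === "function") {
  browser.webRequest.onBeforeRequest.addListener(
    (details) => {
      const filter = browser.webRequest.filterResponseData(details.requestId);
      const decoder = new TextDecoder("utf-8"); // assumption: UTF-8 pages
      const encoder = new TextEncoder();
      let html = "";
      filter.ondata = (event) => {
        html += decoder.decode(event.data, { stream: true });
      };
      filter.onstop = () => {
        html += decoder.decode(); // flush any buffered bytes
        const out = injectIntoHtml(html, "/* interception code here */");
        filter.write(encoder.encode(out));
        filter.close();
      };
      return {};
    },
    { urls: ["<all_urls>"], types: ["main_frame", "sub_frame"] },
    ["blocking"]
  );
}
```

Restricting the filter to main_frame and sub_frame requests is a design choice in this sketch. It does not by itself prevent the duplication described above, which is why the run-once guard is included.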

Results

Workarounds aside, with filterResponseData it was possible to inject the code without any delay and without needing cookies (Firefox only):

(screenshot: selection_372)

Related resources

Atavic commented 6 years ago

"...and lobby Chrome for a similar API"

"Could I help in any way in this?"

https://groups.google.com/a/chromium.org/
http://www.chromium.org/developers/technical-discussion-groups

gbaptista commented 6 years ago

Another issue with filterResponseData: Fix Firefox binary files issue #94

gbaptista commented 6 years ago

@Atavic: Despite the interest, apparently no one is willing to work on this:

I have no idea how difficult it would be to implement this. I'll try to find out how it was done in Firefox... A good knowledge of C++ is certainly necessary, but what scared me was seeing the requirements to build Chromium and try to implement something like this: