
A WebExtension to detect and automatically revert sabotage in web pages #245

KOLANICH commented 4 years ago

Project description

Lots of web pages have built-in sabotage that forces their users to enable JavaScript, which in turn forces them into executing malware. When JS runs, it undoes the sabotage, but the browser then also executes the malware written in JS.

Often this sabotage is easy to revert with the devtools inspector: just "inspect element" and disable the offending property's checkbox. But this sabotage is increasingly often becoming hard to detect and revert with the DevTools inspector alone, because the inspector is not designed with this use case in mind.

So we need a WebExtension designed for this purpose.

  1. One kind of sabotage is hiding elements with `display: none`. It is easy to spot in the inspector: such elements, including nested ones, are shown in a gray font.
  2. Another kind is `visibility: hidden` (the elements are present, but invisible).
  3. `opacity: 0` or some other very small value.
  4. A font color matching the background color.
  5. `overflow: hidden`, often abused to make webpages unscrollable.
  6. Very common: an overlay element with a high z-index and an animated preloader.
  7. Much less common: z-indexes that put elements below others, and positions outside the viewport.
  8. Often webpage content is put into `<script>` tags and emitted with `document.write` or similar functions.

The WebExtension should detect the sabotage present, display it in a GUI with checkboxes (a panel shown when clicking the extension's toolbar button), and allow the user to undo the sabotage using generic code not specific to a particular webpage. It should also be possible to remember, per page and per website, which sabotage must be undone, and to share these lists of sabotages.
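
To make the idea concrete, here is a minimal sketch of what such a generic detection and revert pass could look like in a content script. The function names, thresholds, and heuristics are illustrative assumptions, not a spec:

```js
// Sketch of a generic CSS-sabotage scan (illustrative assumptions throughout).
// Walks the DOM and flags elements whose computed style matches one of the
// patterns listed above, so a panel UI could offer checkboxes to revert them.
function detectCssSabotage(root = document.body) {
  const findings = [];
  for (const el of root.querySelectorAll("*")) {
    const cs = getComputedStyle(el);
    if (cs.display === "none") findings.push({ el, kind: "display" });
    if (cs.visibility === "hidden") findings.push({ el, kind: "visibility" });
    if (parseFloat(cs.opacity) < 0.05) findings.push({ el, kind: "opacity" });
    // Naive invisible-text check: ignores backgrounds inherited from ancestors.
    if (cs.color === cs.backgroundColor) findings.push({ el, kind: "color" });
    // Preloader-overlay check: a positioned element covering most of the
    // viewport with a high z-index.
    const r = el.getBoundingClientRect();
    if ((cs.position === "fixed" || cs.position === "absolute") &&
        parseInt(cs.zIndex, 10) > 1000 &&
        r.width > innerWidth * 0.9 && r.height > innerHeight * 0.9) {
      findings.push({ el, kind: "overlay" });
    }
  }
  if (getComputedStyle(document.body).overflow === "hidden") {
    findings.push({ el: document.body, kind: "overflow" });
  }
  return findings;
}

// Generic revert: naively force a sane value inline with !important.
// A real extension would remember these choices per page/site instead.
const fixes = {
  display: ["display", "block"], // naive default; real code needs smarter choices
  visibility: ["visibility", "visible"],
  opacity: ["opacity", "1"],
  color: ["color", "black"],
  overlay: ["display", "none"], // hide the overlay rather than reveal it
  overflow: ["overflow", "visible"],
};
function revertFinding({ el, kind }) {
  const [prop, value] = fixes[kind];
  el.style.setProperty(prop, value, "important");
}
```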

KaKi87 commented 4 years ago

Do you have examples of websites that do this?

KOLANICH commented 4 years ago

Yes, I do. But I prefer not to name them, to avoid possible lawsuits.

KaKi87 commented 4 years ago

Well, there's no way to truly understand the problem, let alone implement a solution, without having websites to test against.

KOLANICH commented 4 years ago

You can discover such websites yourself. Just disable JavaScript, use the browser in this state for some time, and try to visit some of your favorite websites. Or non-favorite ones. The problem is that if a website takes such measures, its owners have decided it is better for them to keep folks like me away from their website. Fortunately, not everything is that bad. Sometimes the problems are unintended and get fixed by website owners soon after a fix is requested. Sometimes website owners just don't care. But sometimes the intent is clearly malicious, especially when the webpage contains fingerprinting and other tracking scripts and detectors of content blockers.

In those cases the sabotage is often more sophisticated than CSS and cannot be dealt with automatically. For example, I know a fucking webapp that is almost mandatory to use, that does sophisticated browser fingerprinting and just bans (and simultaneously trolls) everybody who appears to have any countermeasures. The ban is done server-side: a malicious obfuscated script (quite large) sends the measurements to the server, and the server then decides whether to allow login or to simulate malfunctioning. And this is NOT reCAPTCHA; it is one of the state-owned websites. Of course such sabotage cannot be dealt with fully automatically.

But small CSS-based sabotage probably can be dealt with automatically. A bit harder is HTML page content inlined into `<script>` tags as text strings, but it is not impossible. A popular technology-related forum uses this technique to keep out those who block ads. A popular website about machine learning also uses it (HTML within script tags). A website of a popular free open source library uses the opacity technique, but I cannot say the use of opacity there is intentional sabotage. They have a lot of accordion widgets, implemented in JS, but to open them without JS one has to go to the document root and inspect and remove the styles of elements one by one. I don't know why they used opacity and visibility and were not satisfied with just display; likely they wanted to achieve fade-in and fade-out animations.
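
To illustrate the script-tag case: one rough heuristic is to scan inline `<script>` bodies for string literals that look like HTML (the kind typically fed to `document.write`) and inject them back into the page as real nodes. The regex and the insertion point below are assumptions and would need per-site tuning:

```js
// Heuristic sketch: recover HTML shipped as string literals inside inline
// <script> tags. Real pages may concatenate, escape, or obfuscate the
// strings, so this is only a starting point.
function extractInlinedHtml() {
  // Matches quoted literals that start with a common block-level tag.
  const htmlLiteral = /(["'])(<(?:div|p|span|a|ul|ol|li|table|article|section)\b[\s\S]*?)\1/g;
  for (const script of document.querySelectorAll("script:not([src])")) {
    for (const m of script.textContent.matchAll(htmlLiteral)) {
      const container = document.createElement("div");
      container.innerHTML = m[2];  // parse the recovered fragment
      script.after(container);     // place it roughly where the script sat
    }
  }
}
```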

KaKi87 commented 4 years ago

I do know websites that do require JavaScript, yet it's not about tracking stuff that uBlock already blocks, but about a functional dependency: for instance, websites developed with front-end frameworks like VueJS (I use it myself), ReactJS, or AngularJS, among others, simply won't work without client-side JS because they are purely based on it. They can still have tracking, and that's uBlock's purpose. JS is also a functional dependency when using lazy loading, for instance, and for webapps in general.

KOLANICH commented 4 years ago

No, the sites I mean don't use these frameworks.

BenjaminHoegh commented 4 years ago

You do know that just because they require you to use JavaScript doesn't mean it's malware? Also, Google already scans websites for malware, and if it finds some, it removes the website from the search results for 30 days and then rescans to see whether the malware is still there.

KOLANICH commented 4 years ago

Browser fingerprinting, tracking, and content blocker detection scripts (which use the fact that scripts/ads are blocked to deny access) are clearly malware: they run automatically and without the user's consent, so their access is unauthorized. As the operator of my computer and browser, I don't want them using my PC's resources, because they are not in my best interest, so I don't authorize them; neither do most users. That users may not even know about these scripts doesn't really matter: an average person may know nothing about worms, but when attacked and infected by one, it is still unauthorized access and worms are still considered malware. These scripts gain unauthorized access to some kind of information: fingerprintable features in the case of fingerprinting; mouse, keyboard, and scroll activity and cookies in the case of tracking scripts; the DOM structure of a page in the case of content blocker detectors. That the ToS says users give consent is worth nothing, because the access happens before a person can give informed consent.

The only reliable way to block such stuff in current browsers is to disable JavaScript.

KaKi87 commented 4 years ago

Just use uBlock.

KOLANICH commented 4 years ago

I already use it, and a bunch of other tools. But that doesn't solve the issue of website owners intentionally or unintentionally counteracting the use of those tools.

KaKi87 commented 4 years ago

Use Fuck FuckAdblock, formerly known as Anti-Adblock Killer.

BurakParsAydin commented 4 years ago

Hey @KOLANICH, may I ask if you're still trying to find a proper extension for this? If not, I can create a Chrome extension for it. I haven't created a Chrome extension before, but I'm sure I'll come up with something.

KOLANICH commented 4 years ago

> May I ask if you're still trying to find a proper extension for this?

I am not trying anymore; I am pretty sure there is none.

> If not, I can create a Chrome extension for it. I haven't created a Chrome extension before, but I'm sure I'll come up with something.

Feel free to do whatever you want; I currently have no time to implement this idea, which is why I posted it here rather than starting it myself.

Please note: I use Firefox, and so do most people who need this kind of extension, because in Firefox it is possible to disable harmful APIs. It may be easy to port, though, since Firefox supports WebExtensions too. But the devtools API may have some specifics.

BurakParsAydin commented 3 years ago

Update: I would have worked on this idea, but I haven't had much time to do it. Anyone can take over the project now. Cheers!

GANES1998 commented 3 years ago

I am interested, but since I am not an expert, I would require guidance. Can you please give me the list of documentation I should refer to for this, and a general overview of how this could be implemented? Should this be a Chrome devtool? I am interested.

KOLANICH commented 3 years ago

@GANES1998

The instructions below are mainly for Firefox, but may work in Chromium-based browsers with some changes. If your OS is Windows, you will need Firefox Developer Edition; on Linux and other free desktop OSes the usual Firefox should usually fit.

  1. Take some boilerplate. Feel free to take https://github.com/KOLANICH-WebExts/experiment-parse.xpi/tree/master/extension

  2. Read Mozilla's docs about the extension manifest. At first you will only need a content script.

  3. Modify the boilerplate and load it as a temporary add-on using about:debugging

  4. Get familiar with writing a content script. It is as easy as writing a userscript. Just get familiar with the following APIs: `<element>.getElementBy*`, `<element>.querySelectorAll`, `Array.map`, `Array.filter`, `Array.forEach`, `document.createElement`, `<element>.style.*`, and, most importantly, `console.log`. And of course JS itself. Axel Rauschmayer's books (freely available on his website) + MDN are the best. All these APIs are available not only to content scripts but also to webpages, so you can play with them in the devtools console.

  5. Read https://developer.mozilla.org/en-US/docs/Web/API/CSSStyleSheet

  6. Install NoScript and set it up to block all JS. Then install uMatrix. Start visiting websites. Your goal at this step is to teach yourself to revert the sabotage manually. Choose websites that contain a lot of text; CSS sabotage is usually found on such websites: docs, manuals, knowledge bases, blogs, and the like. The scope of this project is solely CSS-based sabotage. The sign of it: with JS disabled, when you also disable CSS for the site in uMatrix you see the text you should see, but with CSS enabled (as it should be) and JS disabled you don't.

  7. Learn how to detect and revert the sabotage manually in devtools. Open devtools and go to the Inspector tab. In the top left corner there is a button for selecting an element right on the page. Click it and choose the blank area of the page where the text should be. In the tree, undisplayable (but not transparent!) elements are grayed out. Select the undisplayable element closest to the root of the tree. Then on the right you will see the CSS props. Usually you need opacity, visibility, and display: `opacity: 0` makes an element transparent, and so does `visibility: hidden`; `display: none` just makes an element not be shown. For each CSS property the Firefox devtools show a checkbox allowing you to disable it temporarily. Click the checkbox and check whether the text has appeared.

  8. Learn to revert the sabotage using the JS API.

  9. Using the CSSStyleSheet API, learn to examine stylesheets and write some code that detects CSS rules matching some template, e.g. `opacity: <some low value>`. A rough sketch of this follows the list.
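
For steps 8 and 9, here is a minimal sketch of what such a scan and revert could look like; the threshold, function names, and counter-rule approach are assumptions, not a fixed design. Note that cross-origin stylesheets throw on access and have to be skipped:

```js
// Step 9 sketch: walk every accessible stylesheet and report rules matching
// a template, here "opacity below some low value".
function findLowOpacityRules(maxOpacity = 0.1) {
  const hits = [];
  for (const sheet of document.styleSheets) {
    let rules;
    try {
      rules = sheet.cssRules; // throws SecurityError for cross-origin sheets
    } catch (e) {
      continue;
    }
    for (const rule of rules) {
      if (!(rule instanceof CSSStyleRule)) continue;
      const opacity = rule.style.getPropertyValue("opacity");
      if (opacity !== "" && parseFloat(opacity) < maxOpacity) {
        hits.push({ selector: rule.selectorText, opacity });
      }
    }
  }
  return hits;
}

// Step 8 sketch: revert a matched rule by appending a counter-rule. The
// !important declaration, appearing later in the cascade with the same
// specificity, overrides the sabotage rule.
function neutralizeRule(selector) {
  const style = document.createElement("style");
  style.textContent = `${selector} { opacity: 1 !important; }`;
  document.head.append(style);
}
```

Running `console.log(findLowOpacityRules())` in the devtools console on a sabotaged page is a quick way to sanity-check the scan before wiring it into the extension panel.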