openzim / zimit

Make a ZIM file from any Web site and surf offline!
GNU General Public License v3.0

Remove cookie banners #276

Open kelson42 opened 9 months ago

kelson42 commented 9 months ago

It would be nice to have an option to automatically remove cookie banners. They are annoying and don't make sense offline.

kelson42 commented 9 months ago

Would that be possible by running one of the browser extensions dedicated to that task during the crawl?

kelson42 commented 4 months ago

@benoit74 Some quick feedback on feasibility would be welcome :)

benoit74 commented 4 months ago

I'm not sure how this could work. AFAIK, extensions manipulate the DOM and/or add custom CSS, so this will not help, since the crawler records HTTP responses, not the DOM. What we more probably need is something that rewrites the HTML and/or JS and/or CSS to remove these banners. Or maybe just some additional JS that runs on all pages and does the same as the extensions. I need to spend time looking at how these extensions work in more detail, and how we can integrate this.
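To make the "rewrite the HTML" idea concrete, here is a rough sketch, not the actual zimit or warc2zim rewriting code. It assumes we already have a list of selectors that match banners (the two in `BANNER_SELECTORS` are made up), and the function name `inject_hiding_style` is hypothetical. It only adds a `<style>` tag to a recorded HTML response, so anything the banner script does beyond displaying an element is untouched.

```python
import re

# Hypothetical selectors, for illustration only; a real list would have to
# come from a maintained source rather than being hard-coded.
BANNER_SELECTORS = ["#cookie-banner", ".cookie-consent"]


def inject_hiding_style(html: str, selectors=BANNER_SELECTORS) -> str:
    """Return html with a <style> tag that hides elements matching selectors."""
    css = ", ".join(selectors) + " { display: none !important; }"
    style_tag = f"<style>{css}</style>"
    # Insert just before the first </head> (case-insensitive). A function is
    # used as the replacement so backslashes in the CSS are never treated as
    # regex escapes.
    new_html, count = re.subn(
        r"</head\s*>",
        lambda m: style_tag + m.group(0),
        html,
        count=1,
        flags=re.IGNORECASE,
    )
    # No </head> found: fall back to prepending the style tag.
    return new_html if count else style_tag + html


page = "<html><head><title>t</title></head><body><div id='cookie-banner'>x</div></body></html>"
print(inject_hiding_style(page))
```

The banner markup stays in the stored page and is only hidden at display time, which is the same trade-off as an extension that injects custom CSS.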

ikreymer commented 4 months ago

What you can do is use one of the lists here: https://easylist.to/ (probably the Cookie Privacy List) to inject CSS or exclude certain matching resources outright.

We've started doing that in wabac.js for ads: many are already removed at crawl time, since Brave uses these lists as well, but the replay still attempts to load them and doesn't hide the space reserved for the ads. (Current implementation: https://github.com/webrecorder/wabac.js/blob/main/src/adblockcss.js#L30)

Haven't tested the cookie popups as much as ads so far, though.

The rules are explained here: https://help.adblockplus.org/hc/en-us/articles/360062733293-How-to-write-filters but it's not as complicated as it seems, as most of the rules are either CSS selectors (the ones that contain ##) or URL patterns that should be excluded. Currently, we're only injecting the CSS selectors with display: none.
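As an illustration of the `##` part, a minimal sketch in Python (not the wabac.js code linked above) could look like the following. The function names and the sample list are made up for the example, and it deliberately handles only generic rules that start with `##`, skipping domain-specific rules, exceptions (`#@#`) and URL patterns.

```python
from typing import Iterable, List


def extract_generic_selectors(filter_lines: Iterable[str]) -> List[str]:
    """Return the CSS selectors of generic element-hiding rules ("##selector")."""
    selectors = []
    for raw in filter_lines:
        line = raw.strip()
        # Skip blank lines, comments ("! ...") and the "[Adblock Plus 2.0]" header.
        if not line or line.startswith("!") or line.startswith("["):
            continue
        # Keep only generic rules, which start with "##". Domain-specific
        # rules ("example.com##.x"), exceptions ("#@#") and URL patterns
        # are ignored in this sketch.
        if line.startswith("##"):
            selectors.append(line[2:])
    return selectors


def build_hiding_css(selectors: Iterable[str]) -> str:
    """Emit one rule per selector so a single invalid selector cannot void the rest."""
    return "\n".join(f"{sel} {{ display: none; }}" for sel in selectors)


# Hypothetical sample, written in the filter syntax described above.
sample = """\
[Adblock Plus 2.0]
! Title: made-up sample list
##.cookie-banner
###cookie-consent
example.com##.gdpr-popup
||cdn.example.com/consent.js^
""".splitlines()

print(build_hiding_css(extract_generic_selectors(sample)))
```

Only the two generic rules survive here: `.cookie-banner` and `#cookie-consent` (the third `#` in `###cookie-consent` is the start of an ID selector). The `example.com##...` rule and the `||...^` URL pattern would each need their own handling.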

Would be curious to see how this works for you if you try this out!

benoit74 commented 4 months ago

Thanks a lot, Ilya! I will have a look.