Closed sentialx closed 5 years ago
Hi @sentialx,
Thanks for opening the issue. These timings definitely seem extremely high and that is surprising. The micro-benchmarks that we run before releases do not show any obvious regression so I need to investigate a bit more. Could you point me to the repository and/or some instructions so that I can run your project on my side?
In the future, I'd like to add a simple Electron example here which could be used both as documentation and integration tests to make sure adblocking works as expected in the different environment we aim at supporting. (we already have webextension and puppeteer there).
Best, Rémi
Sure, https://github.com/wexond/wexond You can either download a 2.2.0 release, or run it by yourself following the Running section.
After downgrade to 0.9.1, tested on https://wp.pl (since it's bloated with ads):
Thanks. Sorry I am not very familiar with Electron environment. I was able to start the browser locally but not sure where I can access the logs emitted from the adblock
service. Is there any specific devtool I need to open? Is the adblocker service enabled by default?
I didn't push a commit with the console.time
benchmarks, but logs from main process (that's where the adblock service is) are in terminal where you run npm run dev
.
I added some logs manually. In the terminal I get some exceptions:
(node:5381) UnhandledPromiseRejectionWarning: Error: ENOENT: no such file or directory, scandir '/home/user/.wexond/extensions'
(node:5381) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:5381) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
I assume I might need to create a config in /home/user/.wexond/extensions
before running the project?
It throws the error in the first run, but it does nothing. If you apply some changes to the main process, you need to restart npm run dev
.
Ok if I create this extensions
directory manually I start seeing the logs.
As far as I can see it seems that requests which are not blocked take a huge amount of time to be "processed". I will investigate more to pin-point the exact issue.
@sentialx Here is what is happening. After 0.9.1
, we fixed an issue in the way we dispatch filters (a.k.a. tokenization). Filters such as foo.com
cannot be assumed to be hostnames because we don't know what the intent of the list creator was. It could be that this filter needs to match URLs like https://somed.domain?query=blablafoo.com/
where this sub-string can appear anywhere. To be conservative and make sure we don't miss this match, the adblocker needs to check the appearance of foo.com
substring in all requests. The issue is that the list http://mirror1.malwaredomains.com/files/justdomains
is not making use of optimal syntax for hostname blocking. For example the following filter 5bw.ru
should instead be written as ||5bw.ru^
to specify that 5bw.ru
is in fact a hostname and should match accordingly. This will allow the adblocker to be much more efficient at handling these.
What I would recommend is to either transform this list to use the usual filters syntax, or to stop using this particular list. Maybe the adserver list could be a good replacement. Alternatively, you could adopt the following base lists which I know are well-maintained and offer great coverage:
const lists = {
easyprivacy: 'https://easylist.to/easylist/easyprivacy.txt',
easylist: 'https://easylist.to/easylist/easylist.txt',
'plowe-0': 'https://pgl.yoyo.org/adservers/serverlist.php?hostformat=adblockplus&showintro=0&mimetype=plaintext',
'ublock-badware': 'https://raw.githubusercontent.com/uBlockOrigin/uAssets/master/filters/badware.txt',
'ublock-filters': 'https://raw.githubusercontent.com/uBlockOrigin/uAssets/master/filters/filters.txt',
'ublock-unbreak': 'https://raw.githubusercontent.com/uBlockOrigin/uAssets/master/filters/unbreak.txt',
'ublock-privacy': 'https://raw.githubusercontent.com/uBlockOrigin/uAssets/master/filters/privacy.txt',
'ublock-abuse': 'https://raw.githubusercontent.com/uBlockOrigin/uAssets/master/filters/resource-abuse.txt',
};
The ublock-abuse
fulfills a similar purpose as compared to the nocoin
list.
To benefit from redirect rules you might also need to download the content of ublock-resources
and initialize the engine like so;
engine = FiltersEngine.parse(data); // the code you already have does not need to be changed
// this needs to be added
const resources = (await Axios.get('https://raw.githubusercontent.com/uBlockOrigin/uAssets/master/filters/resources.txt')).data;
engine.updateResources(resources, '' + resources.length);
I hope that helpers!
Also, I would really like the adblocker to offer a more robust support for Electron out of the box. Similarly to what was done recently for puppeteer, we could create a specific wrapper + example. See the puppeteer wrapper and puppeteer example. Since I am not extremely familiar with this ecosystem, I would love to get some outside contribution here; if that is something you would consider doing, I think it would be an awesome addition for the project + would allow me to make sure that Electron support is great for future releases.
Thank you for taking your time for investigating this issue. It would never come to my mind that it's a particular list fault.
Also, I would really like the adblocker to offer a more robust support for Electron out of the box...
Sure, I will take a look at this.
Btw, I'm not sure if I implemented cosmetic filtering correctly. It injects the styles after did-start-navigation
event fired, but as I can see in the puppeteer wrapper it's different.
if (engine && settings.isShieldToggled) {
if (url === '') return;
const { styles, scripts } = engine.getCosmeticsFilters({
url,
...parse(url),
});
this.webContents.insertCSS(styles);
for (const script of scripts) {
this.webContents.executeJavaScript(script);
}
}
To make optimal use of cosmetics (both in terms of performance and coverage), the handling needs to be slightly more involved. The way to do it optimally depends on the capabilities. If we have a way to inject "content scripts" in pages via Electron APIs, then you can implement the full capabilities like in WebExtensions. The ideal scenario with cosmetics is this one:
MutationObserver
in the web extension context and maybe there is a way to do the same for Electron.In puppeteer we only do 1.
and 2.
at the moment. But probably we can do 3.
for Electron.
I can see that there is a way to inject scripts using the setPreloads(...)
API (https://electronjs.org/docs/api/session#sessetpreloadspreloads). If that injects scripts in all frames and if there is a way to send messages from these scripts to the main process, then I believe we should be able to implement the full cosmetics (I think no one did that on Electron before?).
Well, the setPreloads
API doesn't inject scripts in all frames, only in the main frame, however this can be done in the preload
script. Messaging works without a problem.
(I think no one did that on Electron before?)
Correct. The only available adblocker for Electron is Brave/ad-block, but it doesn't support cosmetic filtering.
Correct. The only available adblocker for Electron was Brave/ad-block, but it doesn't support cosmetic filtering.
Then it would be awesome to figure this out!
I suggest to look at the way it's currently done for WebExtensions. Here is the part running in frames: content.ts. It's injected in all frames and communicate with the background.ts using a simple protocol. Some parts could probably be abstracted away so that the same logic can be used also for Electron.
Hi, I noticed after update (0.10 I guess) the
engine.match
method became super-slow. Here's the code I'm using in my Electron app, and it literally hangs up: