Currently, many of the extractors use the Python adblockparser module.
This module has not been updated since 2016, so it would be nice to transition to a more up-to-date package.
In addition, adblockparser is currently configured to use Python's built-in re library. Switching to, e.g., Google's re2 library could also provide a significant speedup. For example, the https://ebay.de advertisement extractor takes ~24 seconds with the re module and only ~6 seconds with the re2 module. I think these numbers are significant even for a single test run.
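A minimal sketch of what enabling re2 could look like, assuming the rules are built through adblockparser's `AdblockRules` constructor and its `use_re2` argument. The helper names `re2_available` and `build_rules` are hypothetical and not part of the existing extractors, and the sample rule is made up.

```python
import importlib.util


def re2_available() -> bool:
    """Return True if a module named `re2` can be imported."""
    return importlib.util.find_spec("re2") is not None


def build_rules(raw_rules):
    """Build an AdblockRules matcher, using re2 only when it is installed."""
    # Imported lazily so this sketch can be loaded without adblockparser.
    from adblockparser import AdblockRules

    return AdblockRules(raw_rules, use_re2=re2_available())


# Example usage (requires adblockparser to be installed):
# rules = build_rules(["||ads.example.com^"])
# rules.should_block("http://ads.example.com/banner.png")
```

Falling back to the built-in re module when re2 is missing keeps the extractors usable on machines where the re2 bindings cannot be installed.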
Finally, the extractors usually only want to know whether there is any match and do not care about the full set of matched (to-be-blocked) links. Hence we can abort the extraction as soon as the first to-be-blocked link is found.
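The early abort could be sketched as follows. This is only an illustration: `has_blocked_link` and the `should_block` callable are hypothetical stand-ins for whatever matcher an extractor uses (for example a bound `should_block` method of an adblockparser rule set).

```python
from typing import Callable, Iterable


def has_blocked_link(
    links: Iterable[str],
    should_block: Callable[[str], bool],
) -> bool:
    """Return True as soon as one link matches; later links are not checked."""
    # any() consumes the generator lazily and stops at the first True value.
    return any(should_block(link) for link in links)


if __name__ == "__main__":
    checked = []

    def fake_matcher(link: str) -> bool:
        # Stand-in matcher that records which links were actually checked.
        checked.append(link)
        return "ads." in link

    links = [
        "https://example.com/",
        "https://ads.example.com/x",
        "https://example.com/about",
    ]
    print(has_blocked_link(links, fake_matcher))
    print(len(checked))  # the third link is never checked
```

Because the generator is evaluated lazily, the cost of a page with an early match no longer depends on the total number of links.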
Alternative package option:
adblock: a Python wrapper around a Rust library that seems to do the same job as adblockparser. I expect a significant speedup, as that library does not use one huge regex for its checks.