essandess / adblock2privoxy

Convert adblock config files to privoxy format
https://hackage.haskell.org/package/adblock2privoxy
GNU General Public License v3.0

New feature: duplicate detection #27

Closed: wmyrda closed this 2 years ago

wmyrda commented 6 years ago

It would be beneficial for privoxy to deal with as few records as possible, which would then lead to faster processing. One way to accomplish this is to limit the number of rules created. In the example below, the rules ||displaymarketplace.com^ and ||disqusads.com^ block those hosts completely under all circumstances, so the remaining rules can safely be skipped: they have no effect beyond what the main rule already does. At the moment the converter processes every entry and adds it to the list. The same goes for whitelisting rules starting with @@||

||displaymarketplace.com^
||displaymarketplace.com^$third-party
||disqusads.com^
||disqusads.com/ads-iframe/adsnative/
||disqusads.com/ads-iframe/prebid/
||disqusads.com^$third-party
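
The idea above could be sketched roughly as follows. This is a minimal illustration in Python, not code from adblock2privoxy (which is written in Haskell); the function name `drop_subsumed` and the regular expression are hypothetical. It only compares rules for exactly the same host and treats blocking and `@@` exception rules separately.

```python
import re

# "||host" followed by the rest of the rule (separator, path, options).
HOST_RULE = re.compile(r"^(@@)?\|\|([a-z0-9.-]+)(.*)$", re.IGNORECASE)

def drop_subsumed(rules):
    """Drop rules already covered by an unconditional ||host^ rule."""
    parsed = [HOST_RULE.match(r) for r in rules]
    # (is_exception, host) pairs for which a bare "||host^" rule exists.
    blanket = {
        (bool(m.group(1)), m.group(2).lower())
        for m in parsed
        if m and m.group(3) == "^"
    }
    kept, seen = [], set()
    for rule, m in zip(rules, parsed):
        if m:
            key = (bool(m.group(1)), m.group(2).lower())
            rest = m.group(3)
            # Only rules that continue with "^" or "/" are treated as
            # narrower versions of the blanket rule (conservative choice).
            if key in blanket and rest != "^" and rest[:1] in ("^", "/"):
                continue
        if rule not in seen:  # also removes exact duplicates
            seen.add(rule)
            kept.append(rule)
    return kept
```

Applied to the six rules listed above, this sketch keeps only `||displaymarketplace.com^` and `||disqusads.com^`. It does not try to detect subdomain coverage (for example `||ads.example.com^` under `||example.com^`).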

A slightly harder example, where combining entries into one would also help reduce the number of similar entries:

||directrev.com^$popup
||directrev.com^$popup,image,media,object,object-subrequest,other,ping,script,stylesheet,subdocument,websocket,xmlhttprequest
||directrev.com^$third-party
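
One possible way to combine such entries is sketched below in Python. It is an assumption-laden illustration, not the converter's behaviour: `merge_type_options` and the `TYPE_OPTIONS` set are hypothetical, and only rules whose options are all plain content types are merged, on the assumption that listing several types in one rule is equivalent to having one rule per type. Rules with other options such as `third-party` are left untouched, because adding them to a type list would narrow the rule instead of widening it.

```python
# Content-type options assumed safe to union (illustrative subset).
TYPE_OPTIONS = {
    "popup", "image", "media", "object", "object-subrequest", "other",
    "ping", "script", "stylesheet", "subdocument", "websocket",
    "xmlhttprequest",
}

def split_types(rule):
    """Return (pattern, options) if the rule has only type options, else None."""
    pattern, sep, opts = rule.partition("$")
    options = opts.split(",") if sep else []
    if options and all(o in TYPE_OPTIONS for o in options):
        return pattern, options
    return None

def merge_type_options(rules):
    """Union the content-type options of rules that share a pattern."""
    types_by_pattern = {}  # pattern -> distinct type options, first-seen order
    for rule in rules:
        parts = split_types(rule)
        if parts:
            known = types_by_pattern.setdefault(parts[0], [])
            for option in parts[1]:
                if option not in known:
                    known.append(option)
    result, emitted = [], set()
    for rule in rules:
        parts = split_types(rule)
        if parts is None:
            result.append(rule)          # not mergeable, keep as-is
        elif parts[0] not in emitted:    # emit the merged rule once
            emitted.add(parts[0])
            result.append(parts[0] + "$" + ",".join(types_by_pattern[parts[0]]))
    return result
```

For the three `directrev.com` rules above, this leaves two: the rule with the long type list (which already includes `popup`) and the separate `$third-party` rule.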

Yet another example, which could be simplified to a single (_|/)dropdown_ad. entry:

_dropdown_ad.
/dropdown_ad.
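
A rough sketch of this kind of merge, in Python and purely for illustration: the hypothetical helper `merge_on_first_char` groups plain patterns that differ only in their first character and emits one regular expression per group. It assumes the inputs contain no Adblock special characters (`|`, `^`, `*`, `$`), and it escapes the literal dot, so the result is written `(_|/)dropdown_ad\.` as a regex. Whether this form suits the converter's generated output is an assumption.

```python
import re

def merge_on_first_char(patterns):
    """Combine plain patterns that differ only in their first character."""
    heads_by_tail = {}  # tail -> distinct first characters, first-seen order
    for pattern in patterns:
        if not pattern:
            continue  # nothing to merge for an empty pattern
        heads = heads_by_tail.setdefault(pattern[1:], [])
        if pattern[0] not in heads:
            heads.append(pattern[0])
    merged = []
    for tail, heads in heads_by_tail.items():
        if len(heads) == 1:
            prefix = re.escape(heads[0])
        else:
            prefix = "(" + "|".join(re.escape(h) for h in heads) + ")"
        merged.append(prefix + re.escape(tail))
    return merged
```

With the two entries above as input, the result is a single expression that matches both `_dropdown_ad.` and `/dropdown_ad.`.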

A less common case, since host rules are expected to end with ^: here, converting only the second entry would be enough. adblock2privoxy currently handles the two entries separately, which increases the number of rules without any actual gain.

||events.reddit.com
||events.reddit.com^
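
As a sketch of the proposal, the hypothetical Python helper below drops a `||host` entry when the same entry terminated with `^` is also present. One caveat worth stating: in Adblock syntax the unterminated form can also match longer hostnames that merely start with the same text, so dropping it narrows the filter slightly. Treating that as acceptable is an assumption.

```python
def drop_unterminated_hosts(rules):
    """Drop '||host' when '||host^' is also present in the list."""
    # Each terminated rule, stored without its trailing '^'.
    terminated = {
        r[:-1] for r in rules if r.startswith("||") and r.endswith("^")
    }
    return [r for r in rules if r not in terminated]
```

For the pair above, only `||events.reddit.com^` remains.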

essandess commented 6 years ago

My preference would be to upstream rule issues, rather than ask a rule translator to handle them.