essandess / adblock2privoxy

Convert adblock config files to privoxy format
https://hackage.haskell.org/package/adblock2privoxy
GNU General Public License v3.0

translated records get dot in front which has adverse effects #23

Closed: wmyrda closed this issue 6 years ago

wmyrda commented 6 years ago

Original: ||www.twojapogoda.pl/templates/rodo/rodo.js
Converted: .www.twojapogoda.pl/templates/rodo/rodo\.js
The converted rule has no effect unless the leading dot is removed.

Original: ||log.
Converted: .log.*.
This was intended to block hosts such as log.mypage.com, but it blocks hosts such as krolowasuperstarblog.wordpress.com.

My idea for fixing both cases is to put ^ in front instead of ., since ^ stands for the start of the string in a regular expression. I tested it, and it works properly in both cases.

wmyrda commented 6 years ago

I think I have actually solved this one in the code. I still have to test it more, but for now ;) the change seems to have no unintended effects. It fixes this bug and even partially fixes https://github.com/essandess/adblock2privoxy/issues/19

diff -Naur adblock2privoxy-9999.old/adblock2privoxy/src/PatternConverter.hs adblock2privoxy-9999/adblock2privoxy/src/PatternConverter.hs
--- adblock2privoxy-9999.old/adblock2privoxy/src/PatternConverter.hs    2018-07-21 10:18:41.934764472 +0200
+++ adblock2privoxy-9999/adblock2privoxy/src/PatternConverter.hs        2018-07-21 13:37:30.146370607 +0200
@@ -47,7 +47,7 @@
                     changeFirst (first:cs)
                         | first == '*'                       =       '.' :  '*'  : cs
                         | bindStart == Hard || proto /= ""   =             first : cs
-                        | bindStart == Soft                  =       '.' : first : cs
+                        | bindStart == Soft                  =       '^' : first : cs
                         | otherwise                          = '.' : '*' : first : cs

         query' = case query of
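
For anyone who wants to try the guard logic outside the project, here is a minimal standalone sketch of the clause the patch touches. The `Bind` type, its `None` constructor, the argument order, and the empty-input case are assumptions made for illustration. Only the four guards mirror the pre-patch lines shown in the diff.

```haskell
-- Hypothetical stand-in for the project's binding type. Only Hard and Soft
-- appear in the diff; None is assumed here to reach the `otherwise` branch.
data Bind = Hard | Soft | None
    deriving (Eq, Show)

-- Prefix the host part of a pattern according to how its start is bound.
-- The guards follow the pre-patch version from the diff.
changeFirst :: Bind -> String -> String -> String
changeFirst _ _ [] = []                      -- assumed: empty input stays empty
changeFirst bindStart proto (first:cs)
    | first == '*'                     = '.' : '*' : cs
    | bindStart == Hard || proto /= "" = first : cs
    | bindStart == Soft                = '.' : first : cs
    | otherwise                        = '.' : '*' : first : cs
```

In this sketch, `changeFirst Soft "" "log."` gives `.log.`, `changeFirst Hard "" "log."` gives `log.`, and `changeFirst None "" "log."` gives `.*log.`. The patch changes only the `Soft` branch, so the first call would give `^log.` instead.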

wmyrda commented 6 years ago

It turns out that putting ^ in front is too drastic a fix for this problem. Yes, the ||log. entry would no longer block hosts such as blog., but hosts under a subdomain, such as test.log., would no longer be blocked either. A prime example is https://github.com/MajkiIT/polish-ads-filter/issues/8816, where ||tvp.pl/video/vod/reklamy/$domain=tvp.pl should work for r.tvp.pl and several similar subdomains, but ^ stops Privoxy from matching them.

Maybe adding \. in front instead would be more appropriate.
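
The trade-off between the prefixes can be checked with a small sketch. It assumes plain regular-expression semantics for the host part (`^` anchors at the start, `\.` matches a literal dot), which may not be how Privoxy itself evaluates host patterns. The function names are hypothetical, and the matchers are hand-written so that only `Data.List` is needed.

```haskell
import Data.List (isInfixOf, isPrefixOf)

-- Behaves like the regex ^label\. : the host must start with the label.
matchesAtStart :: String -> String -> Bool
matchesAtStart label host = (label ++ ".") `isPrefixOf` host

-- Behaves like the regex \.label\. : the label must follow a literal dot.
matchesAfterDot :: String -> String -> Bool
matchesAfterDot label host = ("." ++ label ++ ".") `isInfixOf` host

-- Behaves like (^|\.)label\. : the label is a whole host component,
-- either the first one or one that follows a dot.
matchesComponent :: String -> String -> Bool
matchesComponent label host =
    matchesAtStart label host || matchesAfterDot label host
```

Under these assumptions, `matchesAtStart "log"` accepts `log.mypage.com` but rejects `test.log.mypage.com`, and `matchesAfterDot "log"` does the reverse. `matchesComponent "log"` accepts both and still rejects `krolowasuperstarblog.wordpress.com`, so a prefix that allows either the start of the host or a literal dot may be closer to the intended behaviour than `^` or `\.` alone.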

essandess commented 6 years ago

The first example is correctly translated into a Privoxy action. See https://www.privoxy.org/user-manual/actions-file.html.

The other issues are fixed in #10.