essandess / adblock2privoxy

Convert adblock config files to privoxy format
https://hackage.haskell.org/package/adblock2privoxy
GNU General Public License v3.0

converter does not recognise rules with leading TAB #9

Closed: wmyrda closed this issue 6 years ago

wmyrda commented 6 years ago

It turns out there are filter lists, such as AlleBlock, whose rules start with a leading tab for readability. adblock2privoxy creates many empty rules for those lines, which leads Privoxy to block almost everything and renders the Internet unusable.

As a workaround, I used the commands below, which download the ruleset locally and strip the leading whitespace before the converter processes it.

wget https://raw.githubusercontent.com/maciejtarmas/AlleBlock/master/alleblock.txt
sed -i -e 's/^[ \t]*//' alleblock.txt
adblock2privoxy http://local.website.com/alleblock.txt

I am not sure how this should be implemented in the converter, but looking at the code I found that InputParser.hs has the following:

lineSpaces :: Parser ()
lineSpaces = skipMany (satisfy isLineSpace) <?> "white space"
    where isLineSpace c = c == ' ' || c == '\t'

Maybe some changes to include tabs there could take care of this?

essandess commented 6 years ago

Please report issues like broken rules upstream, in the repos that maintain those rules.

wmyrda commented 6 years ago

How are those rules broken? They work just fine in adblock/ublock

essandess commented 6 years ago

I must have missed the issue. Please post an excerpt that illustrates it.

wmyrda commented 6 years ago

The filter file has a leading tab before each rule: https://github.com/maciejtarmas/AlleBlock/blob/master/alleblock.txt

! Strona główna

    allegro.pl##div[data-box-name="Showcase main"]
    allegro.pl##div[data-box-name="Showcase brand and marketing"]
    allegro.pl##div[data-box-name="reklamy APE"]

In such a case a2p creates empty rules instead of the appropriate element-hiding rules. a2p expects each rule to start at the beginning of the line, so when it finds a TAB there it treats the TAB as the rule and disregards the rest of the line. One would expect the TAB to be skipped in such a case.
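
For reference, the preprocessing workaround from earlier in this thread can be sketched in a more portable form. This is only a sketch of the workaround, not a fix to the converter: `strip_leading_blanks` is a hypothetical helper name, and the sample input is modelled on the excerpt above. It uses the POSIX class `[[:blank:]]` (space and tab), because treating `\t` inside a bracket expression as a tab is a GNU sed extension that other sed implementations may not share.

```shell
#!/bin/sh
# Hypothetical helper (not part of adblock2privoxy): strip leading spaces
# and tabs from every line of a filter list read on stdin.
strip_leading_blanks() {
    # [[:blank:]] is the POSIX character class for space and tab, so this
    # does not depend on sed interpreting "\t" inside a bracket expression.
    sed -e 's/^[[:blank:]]*//'
}

# Demo input modelled on the excerpt above: one comment line followed by
# two tab-indented element-hiding rules.
printf '! Strona główna\n\tallegro.pl##div[data-box-name="Showcase main"]\n\tallegro.pl##div[data-box-name="reklamy APE"]\n' \
    | strip_leading_blanks
```

After this step every rule starts in the first column, which is the form the converter is described as expecting. Comment lines that already start in the first column pass through unchanged.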