gaenserich / hostsblock

an ad- and malware-blocking script for Linux
https://github.com/gaenserich/hostsblock

Default configuration: yandex.ru (site #1 in Russia) is blocked #50

Closed petRUShka closed 8 years ago

petRUShka commented 8 years ago

After installing hostsblock on my system and running it with the default config, I found that yandex.ru is blocked.

For users in the former USSR (especially in Russia), this is like blocking google.com by default: yandex.ru is the main search engine, among other things.

I don't know yet which blacklist contains that entry, but it definitely should be turned off by default.

Sadi58 commented 8 years ago

This is what the whitelist is for. If you enter any such URL in that file, you won't even need to worry about which blacklist server blocks it.
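
As a rough illustration of what whitelisting amounts to (not hostsblock's actual implementation), here is a minimal sketch that filters a hosts-format blocklist against a file of literal patterns. The function name `apply_whitelist` and the file names are hypothetical, chosen only for the demo.

```shell
#!/bin/bash
# Hypothetical helper (not part of hostsblock): print the entries of a
# hosts-format blocklist that do NOT contain any string from a whitelist
# file (one literal string per line).
apply_whitelist() {
    local blocklist="$1" whitelist="$2"
    # -v: invert the match, -F: literal strings, -f: read patterns from a file
    grep -v -F -f "$whitelist" "$blocklist"
}

# Demo on throwaway files with made-up entries
tmpdir=$(mktemp -d)
printf '0.0.0.0 ads.example.com\n0.0.0.0 yandex.ru\n' > "$tmpdir/hosts.block"
printf ' yandex.ru\n' > "$tmpdir/white.list"
apply_whitelist "$tmpdir/hosts.block" "$tmpdir/white.list"
rm -r "$tmpdir"
```

The leading space in the whitelist entry keeps it from matching longer names such as "mail.yandex.ru". An empty line in the whitelist would match every entry, so a real file should not contain one.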

Sadi58 commented 8 years ago

This gave me an idea: querying the cached blocklists for an entry. I've come up with the zenity-powered bash script below. It could perhaps be improved to read the cache directory and blocklist server addresses from hostsblock.conf automatically. Note that the search term should start with "\s" (e.g. "\sdomain.com") if you don't want other variations (e.g. "other.domain.com") to match as well.

#!/bin/bash
# Search the cached hostsblock blocklists for an entry and show the matches.
cache="/var/cache/hostsblock"
work="/tmp/hostsblock-lists"

mkdir -p "$work"
cd "$work" || exit 1

# Plain-text lists: copy them from the cache if not already present.
for list in \
    "hosts-file.net.ad_servers.txt" \
    "pgl.yoyo.org.as.serverlist.php.hostformat.hosts.mimetype.plaintext" \
    "someonewhocares.org.hosts.hosts" \
    "sysctl.org.cameleon.hosts"
do
    [ -f "$work/$list" ] || cp "$cache/$list" "$work/"
done

# Zipped lists: extract only the hosts file from each archive.
[ -f "$work/Hosts" ] || unzip -qq "$cache/hostsfile.mine.nu.Hosts.zip" Hosts
[ -f "$work/BadHosts.unx/hosts.lnx" ] || unzip -qq "$cache/hostsfile.org.Downloads.BadHosts.unx.zip" BadHosts.unx/hosts.lnx
[ -f "$work/HOSTS" ] || unzip -qq "$cache/winhelp2002.mvps.org.hosts.zip" HOSTS

# Ask for the search term; stop if the dialog is cancelled or left empty,
# because an empty pattern would match every line of every list.
look_for=$(zenity --entry --title="Search URL" --text="Check blocklists for this entry:")
if [ -z "$look_for" ]
then
    rm -r "$work"
    exit 0
fi

# Print the matching lines of one list on a single line, with comment
# lines, the 0.0.0.0/127.0.0.1 prefix, and tabs stripped.
check() {
    grep -- "$look_for" "$work/$1" | sed -e "s/^#..*$//g" -e "s/^0\.0\.0\.0//g" -e "s/^127\.0\.0\.1//g" -e "s/\t//g" | tr '\n' ' ' | tr '\r' ' '
}

check1=$(check "hosts-file.net.ad_servers.txt")
check2=$(check "pgl.yoyo.org.as.serverlist.php.hostformat.hosts.mimetype.plaintext")
check3=$(check "someonewhocares.org.hosts.hosts")
check4=$(check "sysctl.org.cameleon.hosts")
check5=$(check "Hosts")
check6=$(check "BadHosts.unx/hosts.lnx")
check7=$(check "HOSTS")
zenity --info --title="Search results for \"$look_for\"" --text="<b>hosts-file.net.ad_servers:</b> $check1\n<b>pgl.yoyo.org.as.serverlist:</b> $check2\n<b>someonewhocares.org.hosts:</b> $check3\n<b>sysctl.org.cameleon.hosts:</b> $check4\n<b>hostsfile.mine.nu.Hosts:</b> $check5\n<b>hostsfile.org.hosts.lnx:</b> $check6\n<b>winhelp2002.mvps.org.HOSTS:</b> $check7" --height 300 --width 500
rm -r "$work"
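
The effect of the "\s" prefix mentioned above can be seen in isolation with a small sketch. It assumes GNU grep, where "\s" matches whitespace; the helper name `search_hosts` and the sample entries are made up for the demo.

```shell
#!/bin/bash
# Hypothetical helper: list the host names in a hosts-format file whose
# line matches the given pattern (GNU grep: "\s" means whitespace).
search_hosts() {
    local pattern="$1" file="$2"
    # Drop comment lines, keep lines matching the pattern, then print
    # only the second field (the host name).
    grep -v '^#' "$file" | grep -- "$pattern" | awk '{print $2}'
}

sample=$(mktemp)
printf '0.0.0.0 domain.com\n0.0.0.0 other.domain.com\n' > "$sample"

search_hosts 'domain.com' "$sample"     # matches both entries
search_hosts '\sdomain.com' "$sample"   # matches only the bare domain
rm "$sample"
```

With the prefix, the pattern must begin right after the whitespace that separates the IP address from the host name, so subdomains no longer match.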

gaenserich commented 8 years ago

$ hostsblock-urlcheck "yandex.ru"
Checking to see if url is blocked or unblocked...

'yandex.ru' BLOCKED by blocklist(s) /etc/hosts.block.old
        1) Unblock/unredirect just yandex.ru
        2) Unblock/unredirect all sites containing url yandex.ru
        3) Keep blocked/redirected
1-3 (default: 3): 1
Unblocking just yandex.ru
Page domain verified. Scan the whole page for other domains for (un)blocking? [y/N] y
Whole-page scan completed.

The (not so) new database feature lets you find out which blocklist is the culprit. Here "/etc/hosts.block.old" is responsible. That means one of the other blocklists once listed yandex.ru for blocking but later removed it, and you have "recycle_old=1" enabled in /etc/hostsblock/hostsblock.conf, which reuses entries from your previous hosts.block file in the new one. To rectify this, either change "recycle_old=1" to "recycle_old=0" or run hostsblock-urlcheck on the URL (as done above).
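
For the configuration route, a minimal sketch of flipping the setting could look like this. The function name `disable_recycle_old` is hypothetical, it assumes the option appears on its own uncommented line exactly as "recycle_old=1", and the demo runs on a temporary file rather than the real /etc/hostsblock/hostsblock.conf.

```shell
#!/bin/bash
# Hypothetical helper: set recycle_old=0 in the given config file.
disable_recycle_old() {
    local conf="$1"
    # Rewrite only an uncommented recycle_old=1 line; -i.bak edits the
    # file in place and keeps a backup copy next to it.
    sed -i.bak 's/^recycle_old=1/recycle_old=0/' "$conf"
}

# Demo on a throwaway copy (assumed config layout)
demo_conf=$(mktemp)
printf 'recycle_old=1\n' > "$demo_conf"
disable_recycle_old "$demo_conf"
cat "$demo_conf"
rm "$demo_conf" "$demo_conf.bak"
```

Applied to the real file this would need root privileges, and the change would presumably only show up the next time hostsblock rebuilds the hosts file.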

pickfire commented 8 years ago

@gaenserich I think we should probably give a summary and remove the useless redirect wording, for example:

$ hostsblock-urlcheck "yandex.ru"
Checking to see if url is blocked or unblocked...

'yandex.ru' BLOCKED by blocklist(s) /etc/hosts.block.old
        1) Unblock/unredirect just yandex.ru
        2) Unblock/unredirect all sites containing url yandex.ru
        3) Keep blocked/redirected
1-3 (default: 3): 1
Unblocking just yandex.ru
Page domain verified. Scan the whole page for other domains for (un)blocking? [y/N] y
Whole-page scan completed.

$ hostsblock-urlcheck "yandex.ru"
'yandex.ru' BLOCKED by blocklist(s) /etc/hosts.block.old
        1) Unblock just yandex.ru
        2) Unblock all sites containing yandex.ru (8)
        3) Cancel
1-3 (default: 3): 1
Unblocking just yandex.ru
Page domain verified. Scan the whole page for other domains for (un)blocking? [y/N] y
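
The count shown next to option 2 in the proposed output could be produced along these lines. This is only a sketch, not code from hostsblock-urlcheck; the function name `count_matches` and the sample file are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count the lines of a hosts-format file that
# contain the given URL as a literal string.
count_matches() {
    local url="$1" file="$2"
    # -c: print the number of matching lines, -F: treat the URL literally
    grep -c -F -- "$url" "$file"
}

# Demo on a throwaway file with made-up entries
sample=$(mktemp)
printf '0.0.0.0 yandex.ru\n0.0.0.0 mail.yandex.ru\n0.0.0.0 example.com\n' > "$sample"
echo "2) Unblock all sites containing yandex.ru ($(count_matches 'yandex.ru' "$sample"))"
rm "$sample"
```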