gaenserich / hostsblock

an ad- and malware-blocking script for Linux
https://github.com/gaenserich/hostsblock

Hostsblock

An ad- and malware-blocking utility for POSIX systems

Contents

  1. Description: Features
  2. Installation: Dependencies, Arch Linux, Other POSIX
  3. Configuration: Edit hostsblock.conf, Enable Timer, Enable Postprocessing
  4. Usage: Configuring sudo, Manual Usage, UrlCheck Usage (examples)
  5. FAQ
  6. News & Bugs: Upgrading to 0.999.8
  7. License

Description

Hostsblock is a POSIX-compatible script designed to take advantage of the /etc/hosts file to provide system-wide blocking of internet advertisements, malicious domains, trackers, and other undesirable content.

To do so, it downloads a configurable set of blocklists and processes their entries into a single HOSTS file.
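The merging step can be pictured with a short shell sketch. This is not hostsblock's actual code: `merge_hosts` is a hypothetical helper, and it assumes every input is already a plain-text hosts file whose blocking entries point at 0.0.0.0 or 127.0.0.1.

```shell
#!/bin/sh
# Illustrative sketch only; merge_hosts is a hypothetical helper and not
# part of hostsblock.
# Usage: merge_hosts REDIRECT_IP FILE...
merge_hosts() {
    redirect="$1"
    shift
    # 1. Remove carriage returns and comments.
    # 2. Keep entries pointing at 0.0.0.0 or 127.0.0.1, rewrite them to the
    #    chosen redirect address, and skip the loopback name itself.
    # 3. Sort and drop duplicates so each host appears once.
    cat "$@" |
        tr -d '\r' |
        sed 's/#.*$//' |
        awk -v ip="$redirect" '
            $1 == "0.0.0.0" || $1 == "127.0.0.1" {
                for (i = 2; i <= NF; i++)
                    if ($i != "localhost") print ip, tolower($i)
            }' |
        sort -u
}
```

With two downloaded lists saved under the hypothetical names list-a.txt and list-b.txt, it could be called like this:

$ merge_hosts 0.0.0.0 list-a.txt list-b.txt > hosts.block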

Hostsblock also provides a command-line utility that allows you to configure how individual websites and any other domains contained in that website are handled.

Features

Installation

Dependencies

Optional dependencies for additional features

Unarchivers to use archive blocklists instead of plain text:

A DNS caching daemon to help speed up DNS resolutions:

If you use 127.0.0.1 as your blocking redirect address (redirecturl in hostsblock.conf), a pseudo-server that serves blank pages, which removes the boilerplate error page and speeds up page loading on blocked domains:

Note that the default configuration (redirecturl="0.0.0.0") gets no benefit from a pseudo-server.

Arch Linux

If you have yaourt installed: yaourt -S hostsblock or yaourt -S hostsblock-git

Or use one of the AUR packages: hostsblock, hostsblock-git

Don't forget to enable and start the systemd timer by running this:

$ sudo systemctl enable --now hostsblock.timer

For Other POSIX Flavors and Distros

The Best and Easiest Way

Please check with your distribution to see if a package is available. If there is not, ask for it or contribute your own!

If you are a package maintainer, let me know so that I can post the instructions here.

The Easy Way

First download the archive from GitHub, e.g. with curl (-L follows GitHub's redirect and -o sets the name of the saved file): curl -L -o hostsblock-master.zip "https://github.com/gaenserich/hostsblock/archive/master.zip"

Unzip the archive, e.g. unzip hostsblock-master.zip

Execute the install.sh script as root, which will guide you through installation.

Configuration

By default, the configuration files are included in the /var/lib/hostsblock/config.examples/ directory. Copy them over to /var/lib/hostsblock/ to customize your setup.
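A cautious way to do the copy is to skip any file that already exists in the target directory, so that an earlier customization is never overwritten. The helper below is a sketch under that assumption; `copy_config_examples` is a hypothetical name, and only the two directory paths come from this README.

```shell
#!/bin/sh
# Illustrative sketch; copy_config_examples is a hypothetical helper.
# Usage: copy_config_examples [SOURCE_DIR] [TARGET_DIR]
copy_config_examples() {
    src="${1:-/var/lib/hostsblock/config.examples}"
    dest="${2:-/var/lib/hostsblock}"
    for f in "$src"/*; do
        # Only regular files are copied; subdirectories are ignored.
        [ -f "$f" ] || continue
        target="$dest/$(basename "$f")"
        if [ -e "$target" ]; then
            # Never clobber a file the user may already have edited.
            echo "skipping existing $target" >&2
        else
            cp "$f" "$target"
        fi
    done
}
```

Run it as a user with write access to /var/lib/hostsblock. Depending on how the files are created, you may also need to make them readable by the hostsblock user afterwards.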

Editing hostsblock.conf

Most of the hostsblock configuration is done in hostsblock.conf. The file is thoroughly commented, so please read through it before first use:

# CACHE DIRECTORY. Directory where blocklists will be downloaded and stored.

#cachedir="$HOME/cache" # DEFAULT

# WORK DIRECTORY. Temporary directory where interim files will be unzipped and
# processed. This directory will be deleted after hostsblock completes.

#tmpdir="/tmp/hostsblock" # DEFAULT

# FINAL HOSTSFILE. Final hosts file that combines together all downloaded blocklists.

#hostsfile="$HOME/hosts.block" # DEFAULT

# REDIRECT URL. IP address to which blocked hosts will be redirected, either 0.0.0.0 or
# 127.0.0.1. This replaces any entries to 0.0.0.0 and 127.0.0.1. If you run a
# pixelserver such as pixelserv or kwakd, it is advisable to use 127.0.0.1.

#redirecturl="0.0.0.0" # DEFAULT

# HEAD FILE. File containing hosts file entries which you want at the beginning
# of the resultant hosts file, e.g. for loopback devices and IPv6 entries. Use
# your original /etc/hosts file here if you are writing your final blocklist to
# /etc/hosts so as to preserve your loopback devices. Give hostshead="0" to
# disable this feature. For those targeting /etc/hosts, it is advisable to copy
# their old /etc/hosts file to this file so as to preserve existing entries.

#hostshead="0" # DEFAULT

# DENYLISTED SUBDOMAINS. File containing specific subdomains to denylist which
# may not be in the downloaded denylists. Be sure to provide not just the
# domain, e.g. "google.com", but also the specific subdomain a la
# "adwords.google.com" without quotations.

#denylist="$HOME/deny.list" # DEFAULT

# ALLOWLIST. File containing the specific subdomains to allow through that may
# be blocked by the downloaded blocklists. In this file, put a space in front of
# a string in order to let through that specific site (without quotations), e.g.
# " www.example.com" will unblock "http://www.example.com" but not
# "http://subdomain.example.com". Leave no space in front of the entry to
# unblock all subdomains that contain that string, e.g. ".dropbox.com" will let
# through "www.dropbox.com", "dl.www.dropbox.com", "foo.dropbox.com",
# "bar.dropbox.com", etc.

#allowlist="$HOME/allow.list"

# CONNECT_TIMEOUT. Parameter passed to curl. Determines how long to try to
# connect to each blocklist url before giving up.

#connect_timeout=60 # DEFAULT

# RETRY. Parameter passed to curl. Number of times to retry connecting to
# each blocklist url before giving up.

#retry=0 # DEFAULT

# MAX SIMULTANEOUS DOWNLOADS. Hostsblock can check and download files in parallel.
# By default, it will attempt to check and download four files at a time.

#max_simultaneous_downloads=4 # DEFAULT

# BLOCKLISTS FILE. File containing URLs of blocklists to be downloaded,
# each on a separate line. Downloaded files may be either
# plaintext, zip, or 7z files. Hostsblock will automatically
# identify the file type.

#blocklists="$HOME/block.urls"

# REDIRECTLISTS FILE. File containing URLs of redirectlists to be downloaded,
# each on a separate line. Downloaded files may be either
# plaintext, zip, or 7z files. Hostsblock will automatically
# identify the file type.

#redirectlists="" # DEFAULT, otherwise "$HOME/redirect.urls"

# If you have any additional lists, please post a bug report to
# https://github.com/gaenserich/hostsblock/issues 
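The allowlist rules described in the comments above (a leading space matches one exact host, no leading space matches any entry containing the string) behave like plain substring matching against lines of the form "IP hostname". The sketch below illustrates that reading only. It is an assumption about the semantics, not hostsblock's implementation, and `apply_allowlist` is a hypothetical helper.

```shell
#!/bin/sh
# Illustrative sketch; apply_allowlist is a hypothetical helper.
# Usage: apply_allowlist HOSTS_FILE ALLOW_LIST
# Prints every hosts entry that does NOT contain an allowlist string.
apply_allowlist() {
    hosts="$1"
    allow="$2"
    patterns=$(mktemp) || return 1
    # Drop blank lines: an empty pattern would match every entry.
    grep -v '^[[:space:]]*$' "$allow" > "$patterns" || true
    if [ -s "$patterns" ]; then
        # -F treats each allowlist line as a fixed string, so an entry such
        # as " www.example.com" only matches where a space precedes it.
        grep -v -F -f "$patterns" "$hosts"
    else
        cat "$hosts"
    fi
    rm -f "$patterns"
}
```

With an allow.list holding " www.example.com" and ".dropbox.com", the entries for www.example.com and every dropbox.com subdomain are dropped, while subdomain.example.com stays blocked.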

Enable the systemd timer

Don't forget to enable and start the systemd timer with:

$ sudo systemctl enable --now hostsblock.timer

Configure Postprocessing

Hostsblock no longer writes to /etc/hosts or manipulates any DNS caching daemons. Instead, it only compiles a hosts-formatted file to /var/lib/hostsblock/hosts.block. To put this file to use, choose one of two options:

OPTION 1: Using a DNS Caching Daemon (Here: dnsmasq)

Using a DNS caching daemon like dnsmasq offers better performance.

To use hostsblock together with dnsmasq, first configure dnsmasq as a DNS caching daemon. Please refer to your distribution's manual; for Arch Linux, see the relevant Arch Wiki section.

After that, add the following line to dnsmasq.conf (usually under /etc/dnsmasq.conf) so that dnsmasq will reference the file:

addn-hosts=/var/lib/hostsblock/hosts.block
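If you script this step, it helps to make it idempotent so the line is not appended twice on repeated runs. The following is a minimal sketch; `ensure_addn_hosts` is a hypothetical helper, and the default paths are the ones used in this README.

```shell
#!/bin/sh
# Illustrative sketch; ensure_addn_hosts is a hypothetical helper.
# Usage: ensure_addn_hosts [DNSMASQ_CONF] [HOSTS_FILE]
ensure_addn_hosts() {
    conf="${1:-/etc/dnsmasq.conf}"
    line="addn-hosts=${2:-/var/lib/hostsblock/hosts.block}"
    # -x matches whole lines only, -F disables regex interpretation.
    if grep -q -x -F "$line" "$conf" 2>/dev/null; then
        echo "already present in $conf"
    else
        printf '%s\n' "$line" >> "$conf"
        echo "added to $conf"
    fi
}
```

Writing to /etc/dnsmasq.conf normally requires root, so the function would have to run in a root shell.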

Enable and start hostsblock-dnsmasq-restart.path:

$ sudo systemctl enable --now hostsblock-dnsmasq-restart.path

This has systemd watch the target file /var/lib/hostsblock/hosts.block for changes and then restart dnsmasq whenever they are found.

OPTION 2: Copy /var/lib/hostsblock/hosts.block to /etc/hosts

It is possible to have systemd overwrite /etc/hosts with the generated file.

Configure hostshead= in hostsblock.conf to make sure you don't remove the default system loopback address(es), e.g.:

hostshead="/var/lib/hostsblock/hosts.head"

Then put your necessary loopback entries in /var/lib/hostsblock/hosts.head. For example, you can copy over your existing /etc/hosts to this file:

$ sudo cp /etc/hosts /var/lib/hostsblock/hosts.head
$ sudo chown hostsblock:hostsblock /var/lib/hostsblock/hosts.head
$ sudo chmod 600 /var/lib/hostsblock/hosts.head
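Because the generated file will replace /etc/hosts, it is worth confirming that the head file really contains a loopback entry before enabling the path unit. A small check could look like this; `has_loopback` is a hypothetical helper and only looks for an IPv4 "127.0.0.1 localhost" mapping.

```shell
#!/bin/sh
# Illustrative sketch; has_loopback is a hypothetical helper.
# Usage: has_loopback FILE
# Succeeds if FILE maps 127.0.0.1 to the name "localhost".
has_loopback() {
    awk '
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break        # rest of the line is a comment
                if ($i == "localhost") found = 1
            }
        }
        END { exit (found ? 0 : 1) }' "$1"
}
```

For example, from a shell that can read the file:

$ has_loopback /var/lib/hostsblock/hosts.head && echo "loopback entry present"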

Enable and start hostsblock-hosts-clobber.path:

$ sudo systemctl enable --now hostsblock-hosts-clobber.path

This has systemd watch the target file /var/lib/hostsblock/hosts.block for changes and then copy /var/lib/hostsblock/hosts.block to /etc/hosts.

Usage

In its normal systemd-job configuration, hostsblock requires no interaction from the user aside from the steps above. If, however, you want to manually run the process, or to use the UrlCheck tool (hostsblock -c URL), you need to configure sudo:

Configuring sudo

Because hostsblock executes as a heavily sandboxed unprivileged user (instead of root), you must configure sudo to allow other users to manually execute it.

To do so, edit sudoers by typing sudo visudo and add the following line to the end:

%hostsblock ALL =   (hostsblock)    NOPASSWD:   /usr/lib/hostsblock.sh

Add any users you want to be able to manually execute or use the urlcheck mode to the hostsblock group:

$ sudo gpasswd -a [MY USER NAME] hostsblock
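New group memberships usually apply only to sessions started after the change, so a quick check can save some confusion. The sketch below uses the standard `id -Gn` call; `in_group` is a hypothetical helper.

```shell
#!/bin/sh
# Illustrative sketch; in_group is a hypothetical helper.
# Usage: in_group GROUP [USER]
# Without USER, the groups of the current session are checked.
in_group() {
    group="$1"
    if [ -n "${2:-}" ]; then
        # Looks the user up in the account database.
        groups=$(id -Gn "$2" 2>/dev/null) || return 2
    else
        # Reflects the groups the running shell actually has.
        groups=$(id -Gn 2>/dev/null) || true
    fi
    for g in $groups; do
        [ "$g" = "$group" ] && return 0
    done
    return 1
}
```

If `in_group hostsblock` fails in your current shell although the gpasswd call succeeded, logging out and back in is normally enough.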

The wrapper script installed in your PATH will automatically use sudo to execute the main script as the user hostsblock.

hostsblock [OPTION...] - download and combine HOSTS files

Without the -c URL option, hostsblock will check to see if its monitored blocklists have changed. If it detects changes (or if forced by the -u flag), it will download the changed blocklist(s) and recompile the target HOSTS file.

Help Options:
  -h                            Show help options

Options:
  -f CONFIGFILE         Specify an alternative configuration file
  -q                    Show only fatal errors
  -v                    Be verbose
  -d                    Be very verbose/debug
  -u                    Force hostsblock to update its target file

hostsblock [OPTION...] -c URL [COMMANDS...] - Manage how URL is handled

With the -c URL flag option, hostsblock can check and manipulate how it handles specific domains.

Note: The hostsblock-urlcheck symlink is now officially deprecated. Use hostsblock -c instead.

In addition to the above options, the following commands and subcommands can be used with hostsblock -c URL:

hostsblock -c URL (urlCheck) Commands:
  -s [-r -k]            State how hostsblock modifies URL
  -b [-o -r]            Temporarily (un)block URL
  -e [-o -r -b]         Add/remove URL to/from denylist
  -a [-o -r -b]         Add/remove URL to/from allowlist
  -i [-o -r -k]         Interactively inspect URL

hostsblock -c URL Command Subcommands:
  -r                    COMMAND recurses to all domains on URL's page
  -k                    COMMAND recurses for all BLOCKED domains on page
  -o                    Perform opposite of COMMAND (e.g. UNblock)
  -b                    With "-e", immediately block URL
                        With "-a", immediately unblock URL

Note that the -o subcommand turns a command into its opposite, e.g. -b -o unblocks URL, -e -o removes URL from the denylist, and -a -o removes URL from the allowlist.

Examples:

Once you have configured sudo, you can execute the following as any user in the hostsblock group:

See if "http://github.com/gaenserich/hostsblock" is blocked, denylisted, allowlisted, or redirected by hostsblock:
$ hostsblock -c "http://github.com/gaenserich/hostsblock" -s
Do the same thing for any of the sites referenced on this page:
$ hostsblock -c "http://github.com/gaenserich/hostsblock" -s -r
Do the same thing for any of the sites referenced on this page that are presently blocked:
$ hostsblock -c "http://github.com/gaenserich/hostsblock" -s -k
Block the domain containing "http://github.com/gaenserich/hostsblock" (that is, "github.com"):
$ hostsblock -c "http://github.com/gaenserich/hostsblock" -b

Note that "blocking" (and "unblocking", i.e. -b -o) a domain only lasts until the next time hostsblock refreshes /var/lib/hostsblock/hosts.block, unless one of your blocklists also includes it. To permanently block this domain, use the denylist (-e) command.

Permanently block (denylist) the domain containing "http://github.com/gaenserich/hostsblock" (that is, "github.com"):
$ hostsblock -c "http://github.com/gaenserich/hostsblock" -e

Note that "denylisting" on its own will not block the target domain until hostsblock refreshes. You can combine both "blocking" and "denylisting" in one command, however:

Permanently and immediately block the domain containing "http://github.com/gaenserich/hostsblock" (that is, "github.com"):
$ hostsblock -c "http://github.com/gaenserich/hostsblock" -e -b
Temporarily unblock all blocked domains on "http://github.com/gaenserich/hostsblock" (helpful if the page isn't working quite right):
$ hostsblock -c "http://github.com/gaenserich/hostsblock" -b -o -k
Interactively scan through "http://github.com/gaenserich/hostsblock", prompting you whether the domains referenced therein should be blocked, denylisted, or allowlisted:
$ hostsblock -c "http://github.com/gaenserich/hostsblock" -i -r
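To check several pages in one go, the status command can be wrapped in a loop. This is a sketch only: `check_urls` is a hypothetical helper, and it relies on nothing beyond the `hostsblock -c URL -s` form shown above.

```shell
#!/bin/sh
# Illustrative sketch; check_urls is a hypothetical helper.
# Usage: check_urls FILE   (FILE holds one URL per line)
check_urls() {
    while IFS= read -r url; do
        [ -n "$url" ] || continue          # skip empty lines
        echo "== $url"
        # Redirect stdin so the command cannot swallow the rest of the list.
        hostsblock -c "$url" -s < /dev/null
    done < "$1"
}
```

Each URL is printed as a header line, followed by whatever hostsblock reports for it.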

FAQ

News & Bugs

Upgrading to 0.999.8

For existing hostsblock users, please note the following changes in version 0.999.8:

Changes in hostsblock.conf

Due to the shift to POSIX-shell compatibility, the list of blocklists to be downloaded can no longer be held in hostsblock.conf via the blocklists= parameter. Instead, this parameter now contains the path to a file that holds the list of URLs, e.g. /var/lib/hostsblock/block.urls.

The new block.urls file is simply a newline-separated list of URLs without quotations. Whitespace and text after # are ignored. An example block.urls file could look like this:

http://hosts-file.net/download/hosts.zip # General blocking meta-list
http://winhelp2002.mvps.org/hosts.zip

http://hostsfile.mine.nu/Hosts.zip

See the example block.urls in the /var/lib/hostsblock/config.examples directory for details.
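The parsing rules for block.urls (one URL per line, whitespace and anything after # ignored) can be reproduced with sed and grep. The sketch below only illustrates the file format; `list_block_urls` is a hypothetical helper, and hostsblock's own parser may differ.

```shell
#!/bin/sh
# Illustrative sketch; list_block_urls is a hypothetical helper.
# Usage: list_block_urls FILE
# Prints the URLs that remain after comments and whitespace are removed.
list_block_urls() {
    sed -e 's/#.*$//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' "$1" |
        grep -v '^$'
}
```

Running it against your own block.urls shows the list of URLs that the file describes, one per line.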

No more postprocessing within script

Due to enhanced security and sandboxing, hostsblock no longer handles postprocessing on its own. Instead, users should use other systemd capabilities to replace the postprocess() {} functionality.

Hostsblock comes with systemd service files that replicate the most common scenarios. See the directions above for instructions on how to enable them.

Changes with sudo

Hostsblock relies on sudo far less than before. The main systemd service no longer requires it; you only need it if you want to run hostsblock manually or use the hostsblock -c URL (urlcheck) utility. See the directions above for details.

Other Caveats

Other Changes from 0.999.7 to 0.999.8

Systemd Job Improvements
POSIX-Compatibility Improvements
UrlCheck Mode Improvements

License

Hostsblock is licensed under the GNU GPL.