henrikwidlund / hostsparser

Tool that converts hosts-based files into an AdBlock-formatted file, optimized for AdGuard Home.
GNU General Public License v3.0

HostsParser

Tool for producing an AdBlock-formatted file from different sources. Both hosts and AdBlock formats are supported as sources, and for each source you can specify whether its contents should be included in or excluded from the result. The tool also removes duplicates, comments, and hosts that would otherwise be blocked by a more general entry.

By default StevenBlack/hosts with fakenews, gambling and porn extensions is processed to exclude entries already covered by the AdGuard DNS Filter file.

Note: The file the program produces can't be used as a regular hosts file; it must be used with a system that supports the AdBlock format.
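To illustrate why the output is not a regular hosts file, here is a minimal Python sketch of converting a single hosts line into an AdBlock-style rule. It assumes the `||domain^` rule syntax that AdGuard Home understands. The function name and the default prefix are hypothetical, and this is not the tool's actual implementation.

```python
from typing import Optional


def hosts_line_to_adblock(line: str, prefix: str = "0.0.0.0") -> Optional[str]:
    """Convert one hosts-format line to an AdBlock-style rule.

    Returns None for comments, blank lines, and lines that do not
    start with the expected prefix.
    """
    # Drop anything after a '#' (hosts-file comment) and trim whitespace.
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    parts = line.split()
    # A hosts entry is "<address> <hostname>"; skip anything else.
    if len(parts) < 2 or parts[0] != prefix:
        return None
    # Assumed AdBlock-style syntax: block the domain and its subdomains.
    return f"||{parts[1]}^"


print(hosts_line_to_adblock("0.0.0.0 ads.example.com"))
```

A hosts file maps a host name to an address, while the AdBlock-style rule only names the domain to block, which is why the two formats are not interchangeable.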

How to use with AdGuard Home

Pre-built filters

The filter files are generated every six hours and are available for download in the table below. You are welcome to create a feature request should you want more pre-built filters.

| Filter | Link |
|---|---|
| Unified hosts = adware + malware | link |
| Unified hosts + fakenews | link |
| Unified hosts + fakenews + gambling | link |
| Unified hosts + fakenews + gambling + porn | link |
| Unified hosts + fakenews + gambling + porn + social | link |
| Unified hosts + fakenews + gambling + social | link |
| Unified hosts + fakenews + porn | link |
| Unified hosts + fakenews + porn + social | link |
| Unified hosts + fakenews + social | link |
| Unified hosts + gambling | link |
| Unified hosts + gambling + porn | link |
| Unified hosts + gambling + porn + social | link |
| Unified hosts + gambling + social | link |
| Unified hosts + porn | link |
| Unified hosts + porn + social | link |
| Unified hosts + social | link |

Adding the filters via UI

Adding the filter

  1. Make sure that AdGuard DNS filter (or the custom AdBlock formatted file referenced when running the program) is enabled in DNS blocklists for your AdGuard Home instance.
    • If the filter isn't added, scroll down to the bottom of the page and click on Add blocklist.
    • Select Choose from the list.
    • Finally select AdGuard DNS filter and click Save.
  2. Copy the link to the Pre-built filter and add it to the DNS blocklists of your AdGuard Home instance as a custom list. Repeat the instructions in step 1, but choose Add a custom list instead of Choose from the list. In the dialog that appears, enter a name of your choosing and the URL of the filter, then click Save.

Adding the filters via YAML

Open the AdGuardHome.yaml file for editing and scroll down to the filters section.

  1. Make sure that the AdGuard DNS filter (or the custom AdBlock formatted file referenced when running the program) is enabled:
    filters:
    - enabled: true
      url: https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt
      name: AdGuard DNS filter
      id: 1
  2. Add the Pre-built filter, replacing the id value with the current Unix time:
    filters:
    - enabled: true
      url: https://henrikwidlund.github.io/hostsparser/filter.txt
      name: HostsParser
      id: 1621690654
  3. Restart the service.

Please refer to the AdGuard Home Wiki for further details on DNS blocklists.

Note: If you've generated your own file, replace the Pre-built filter link with the address where you host your generated file.
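Step 2 of the YAML instructions asks for a Unix time value as the id. As a rough sketch, the Python snippet below renders a filter entry with the current Unix time filled in. The function name and its `now` parameter are hypothetical helpers for illustration. The output mirrors the YAML fragment shown above and should be checked before pasting it into AdGuardHome.yaml.

```python
import time
from typing import Optional


def make_filter_entry(url: str, name: str, now: Optional[float] = None) -> str:
    """Render one entry for the filters list, using Unix time as the id."""
    # Use the supplied timestamp if given (handy for testing), else the current time.
    filter_id = int(time.time() if now is None else now)
    return "\n".join([
        "- enabled: true",
        f"  url: {url}",
        f"  name: {name}",
        f"  id: {filter_id}",
    ])


print(make_filter_entry(
    "https://henrikwidlund.github.io/hostsparser/filter.txt",
    "HostsParser",
))
```

If you host your own generated file, pass its address as the `url` argument instead.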

Building from source code

Prerequisites

dotnet 9 SDK.

Run the following from the directory you cloned the repository to:

Linux/macOS

./build.sh

Windows

build.cmd

The built files will be put in the artifacts directory, in the root of the repository.

Running

Prerequisites

  1. dotnet 9 runtime.
  2. Downloaded binaries or binaries built from source code.

Run the following from the directory containing the binaries (if you built from source code, this is the artifacts directory in the root of the repository):

dotnet HostsParser.dll

The program creates the filter.txt file in the same directory.

Docker

You can build and run the program with Docker.

Build

docker build ./src/HostsParser

Run

Docker Hub

Images are available on Docker Hub.

docker pull henrikwidlund/hostsparser \
    && docker create --name hostsparser henrikwidlund/hostsparser \
    && docker start hostsparser \
    && docker wait hostsparser \
    && docker cp hostsparser:/app/filter.txt . \
    && docker rm -f hostsparser

The filter.txt file will be put into the current directory.

Run from source code

If you'd rather build and run from source code, execute the following from the repository root:

IMAGE_ID=$(docker build ./src/HostsParser -q -t 'hostsparser') \
    && docker create --name hostsparser $IMAGE_ID \
    && docker start hostsparser \
    && docker wait hostsparser \
    && docker cp hostsparser:/app/filter.txt . \
    && docker rm -f hostsparser

The filter.txt file will be put into the repository root.

Configuration

You may adjust the configuration of the application by modifying the appsettings.json file.

| Property | Type | Required | Description |
|---|---|---|---|
| Filters | object | true | Settings used for processing hosts formatted sources. |
| ExtraFiltering | bool | true | Indicates whether extra filtering should be performed. If true, the program checks every entry in the result against the others and removes any entry that would be blocked by a more general entry. |
| MultiPassFilter | bool | true | If set to true, the final results will be scanned multiple times until no duplicates are found. Default behaviour assumes duplicates are removed after one iteration. |
| HeaderLines | string[] | true | Defines a set of lines that will be inserted at the top of the generated file, for example copyright. |
| KnownBadHosts | string[] | true | Array of unwanted hosts. These entries will be added to the result if they're not covered by the AdBlockBased entries. You can also add generalized hosts to reduce the number of entries in the final results. For example: HostsBased results might contain a.baddomain.com and b.baddomain.com; adding baddomain.com will remove the subdomain entries and block baddomain.com and all of its subdomains. |
| OutputFileName | string | false | Defines the name of the file produced by the program. Defaults to filter.txt. |
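The ExtraFiltering and KnownBadHosts descriptions both rely on the idea that a general entry covers its subdomains. The Python sketch below shows one way such a pass could work. It is an illustration under that assumption, not the program's actual algorithm, and the function name is hypothetical.

```python
from typing import Iterable, List


def remove_covered_hosts(hosts: Iterable[str]) -> List[str]:
    """Drop duplicates and any host that is a subdomain of another listed host."""
    unique = set(hosts)  # removes exact duplicates
    kept = []
    for host in unique:
        labels = host.split(".")
        # Every parent domain of the host, e.g. a.b.com -> b.com, com
        parents = (".".join(labels[i:]) for i in range(1, len(labels)))
        # Keep the host only if no more general entry already covers it.
        if not any(parent in unique for parent in parents):
            kept.append(host)
    return sorted(kept)


hosts = ["a.baddomain.com", "b.baddomain.com", "other.org"]
known_bad_hosts = ["baddomain.com"]
print(remove_covered_hosts(hosts + known_bad_hosts))
```

With the example from the table, adding baddomain.com causes a.baddomain.com and b.baddomain.com to be dropped, leaving baddomain.com and other.org.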

Filters

| Property | Type | Required | Description |
|---|---|---|---|
| Sources | object[] | true | Array of SourceItem used for fetching and processing filters. |
| SkipLines | string[] | true | Array of strings that will be filtered out if they are present in the result from Sources. |
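As a small illustration of SkipLines, the Python sketch below drops source lines that match one of the configured strings. It assumes whole-line matching after trimming whitespace, which the description does not confirm. The function name and the sample lines are hypothetical.

```python
from typing import Iterable, List


def drop_skip_lines(lines: Iterable[str], skip_lines: Iterable[str]) -> List[str]:
    """Return the lines that do not match any SkipLines entry.

    Assumption: an entry matches only when it equals the whole line
    (ignoring leading and trailing whitespace).
    """
    skip = {entry.strip() for entry in skip_lines}
    return [line for line in lines if line.strip() not in skip]


source_lines = ["0.0.0.0 0.0.0.0", "0.0.0.0 ads.example.com"]
print(drop_skip_lines(source_lines, ["0.0.0.0 0.0.0.0"]))
```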

SourceItem

| Property | Type | Required | Description |
|---|---|---|---|
| Uri | Uri | true | The Uri to fetch data from. |
| Prefix | string | false | Prefix used in the source, for example 127.0.0.1 or 0.0.0.0. |
| Format | enum | true | The format of the source. Possible values: Hosts, AdBlock. |
| SourceAction | enum | true | Defines if the data from the source should be combined or excluded. Possible values: Combine, ExternalCoverage. |
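The two SourceAction values can be read as "add these hosts" and "leave out hosts these sources already cover". The Python sketch below illustrates that reading. It assumes the sources have already been fetched and parsed into bare host names, and that an ExternalCoverage entry also covers its subdomains. The function names are hypothetical and the program's actual processing may differ.

```python
from typing import Iterable, List, Set, Tuple


def is_covered(host: str, covering: Set[str]) -> bool:
    """True if the host itself or any of its parent domains is in the covering set."""
    labels = host.split(".")
    return any(".".join(labels[i:]) in covering for i in range(len(labels)))


def apply_source_actions(parsed_sources: Iterable[Tuple[str, Iterable[str]]]) -> List[str]:
    """Merge Combine sources, then drop hosts covered by ExternalCoverage sources."""
    combined: Set[str] = set()
    external: Set[str] = set()
    for action, hosts in parsed_sources:
        if action == "Combine":
            combined.update(hosts)
        elif action == "ExternalCoverage":
            external.update(hosts)
        else:
            raise ValueError(f"Unknown SourceAction: {action}")
    return sorted(host for host in combined if not is_covered(host, external))


print(apply_source_actions([
    ("Combine", ["ads.example.com", "tracker.test"]),
    ("ExternalCoverage", ["example.com"]),
]))
```

This mirrors the default setup described earlier, where the StevenBlack/hosts entries are combined and anything already covered by the AdGuard DNS Filter is excluded.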

Licenses