jacklul / pihole-updatelists

Update Pi-hole's lists from remote sources easily
MIT License

Spaces not working #21

Closed biship closed 4 years ago

biship commented 4 years ago

Debug does not add any more information.

/etc/pihole-updatelists.conf:

; Remote list URL containing adlists
ADLISTS_URL="https://v.firebog.net/hosts/lists.php?type=tick https://dbl.oisd.nl/ https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts https://mirror1.malwaredomains.com/files/justdomains https://sysctl.org/cameleon/hosts https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt https://hosts-file.net/ad_servers.txt"

; Remote list URL containing exact domains to whitelist
WHITELIST_URL="https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/whitelist.txt https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/referral-sites.txt"

; Remote list URL containing regex rules for whitelisting
REGEX_WHITELIST_URL=""

; Remote list URL containing exact domains to blacklist
BLACKLIST_URL=""

; Remote list URL containing regex rules for blacklisting
REGEX_BLACKLIST_URL="https://raw.githubusercontent.com/mmotti/pihole-regex/master/regex.list"

Fetching ADLISTS from 'https://v.firebog.net/hosts/lists.php?type=tick https://dbl.oisd.nl/ https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts https://mirror1.malwaredomains.com/files/justdomains https://sysctl.org/cameleon/hosts https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt https://hosts-file.net/ad_servers.txt'... failed to open stream: HTTP request failed! HTTP/1.1 400 Bad Request

Fetching WHITELIST from 'https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/whitelist.txt https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/referral-sites.txt'... failed to open stream: HTTP request failed! HTTP/1.1 400 Bad Request

Fetching REGEX_BLACKLIST from 'https://raw.githubusercontent.com/mmotti/pihole-regex/master/regex.list'... done (15 entries)
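
The log above suggests that the whole space-separated value was requested as one URL, which the server rejected with 400 Bad Request, while the single-URL setting worked. Below is a minimal Python sketch of splitting such a value on any whitespace before fetching. It is not the script's actual code (the script runs on PHP), and `split_urls` is a hypothetical helper name used only for illustration.

```python
def split_urls(value: str) -> list[str]:
    """Split a setting value into individual URLs on any whitespace."""
    # str.split() with no arguments splits on runs of spaces, tabs and
    # newlines and drops empty strings, so both separators behave the same.
    return value.split()


space_separated = (
    "https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/whitelist.txt "
    "https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/referral-sites.txt"
)
newline_separated = space_separated.replace(" ", "\n")

# Each URL is handled on its own instead of being sent as one long request.
for url in split_urls(space_separated):
    print("Fetching WHITELIST from", repr(url))
```

With this approach a value written with spaces and the same value written with one URL per line yield the same list of URLs.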

biship commented 4 years ago

Oh, I might have mixed up adlists and whitelists. Will verify.

biship commented 4 years ago

Still having issues. Does it support URLs with 0.0.0.0?

/etc/pihole-updatelists.conf:

; Remote list URL containing adlists
ADLISTS_URL="https://v.firebog.net/hosts/lists.php?type=tick"

; Remote list URL containing exact domains to whitelist
WHITELIST_URL="https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/whitelist.txt https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/referral-sites.txt https://dbl.oisd.nl/ https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts https://mirror1.malwaredomains.com/files/justdomains https://sysctl.org/cameleon/hosts https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt https://raw.githubusercontent.com/llacb47/mischosts/master/social/tiktok-block"

; Remote list URL containing regex rules for whitelisting
REGEX_WHITELIST_URL=""

; Remote list URL containing exact domains to blacklist
BLACKLIST_URL=""

; Remote list URL containing regex rules for blacklisting
REGEX_BLACKLIST_URL="https://raw.githubusercontent.com/mmotti/pihole-regex/master/regex.list"

Fetching WHITELIST from 'https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/whitelist.txt https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/referral-sites.txt https://dbl.oisd.nl/ https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts https://mirror1.malwaredomains.com/files/justdomains https://sysctl.org/cameleon/hosts https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt https://raw.githubusercontent.com/llacb47/mischosts/master/social/tiktok-block'... failed to open stream: HTTP request failed! HTTP/1.1 400 Bad Request

jacklul commented 4 years ago

> Debug does not add any more information.

You might be running an old version of the script. Have you tried updating via the install script?

Also, why are you putting adlists into WHITELIST_URL (I see the OISD list there as well)? You're doing it wrong. WHITELIST_URL is for lists that contain domains to be whitelisted.

> Still having issues. Does it support URLs with 0.0.0.0?

For the adlist list: one adlist URL per line. For whitelist/blacklist: one domain per line.

biship commented 4 years ago

Ok, I fixed my conf file, same error. Even my 2 whitelists are not working.

; For more information and help check:
; github.com/jacklul/pihole-updatelists
; ----------------------------------------

; Remote list URL containing adlists
ADLISTS_URL="https://v.firebog.net/hosts/lists.php?type=tick"

; Remote list URL containing exact domains to whitelist
WHITELIST_URL="https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/whitelist.txt https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/referral-sites.txt"

; Remote list URL containing regex rules for whitelisting
REGEX_WHITELIST_URL=""

; Remote list URL containing exact domains to blacklist
BLACKLIST_URL="https://dbl.oisd.nl/ https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts https://mirror1.malwaredomains.com/files/justdomains https://sysctl.org/cameleon/hosts https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt https://raw.githubusercontent.com/llacb47/mischosts/master/social/tiktok-block https://raw.githubusercontent.com/anudeepND/blacklist/master/adservers.txt https://www.github.developerdan.com/hosts/lists/facebook-extended.txt https://zerodot1.gitlab.io/CoinBlockerLists/hosts"

; Remote list URL containing regex rules for blacklisting
REGEX_BLACKLIST_URL="https://raw.githubusercontent.com/mmotti/pihole-regex/master/regex.list"

; OPTIONAL PARAMETERS (and their default values)
; To change them you have to uncomment them first (remove prefixing ';')

; Comment string used to know which entries were created by the script
COMMENT="Managed by pihole-updatelists"

; All inserted adlists and domains will have this additional group ID assigned
;  0 is the default group to which all entries are added anyway
;GROUP_ID=0

; Prevent touching entries not created by this script by comparing comment field
;REQUIRE_COMMENT=true

; Update gravity after lists are updated?
;  Runs `pihole updateGravity`, when disabled will invoke simple lists reload instead
UPDATE_GRAVITY=false

; Vacuum database at the end?
;  Runs `VACUUM` SQLite command
VACUUM_DATABASE=false

; Print more information while script is running?
VERBOSE=true

; Print even more information for debugging purposes
DEBUG=true

; Maximum time in seconds one list download can take before giving up
;  You should increase this when downloads fail
;DOWNLOAD_TIMEOUT=60

; Location of gravity.db file in case you need to change it
;GRAVITY_DB="/etc/pihole/gravity.db"

; Process lockfile to prevent multiple instances of the script
;  You shouldn't change it - unless `/var/lock` is unavailable
;LOCK_FILE="/var/lock/pihole-updatelists.lock"

; Log console output to file
;  Put `-` before path to overwrite file instead of appending to it
;LOG_FILE=""

root@raspberrypi:~# /usr/local/sbin/pihole-updatelists
Acquired process lock through file: /var/lock/pihole-updatelists.lock

      Pi-hole's Lists Updater by Jack'lul
 https://github.com/jacklul/pihole-updatelists

Checksum: ff2ab3fae4bf5b4ae01a32ed999b7ec2
OS: Linux raspberrypi 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l
PHP: 7.3.14-1~deb10u1 NTS
SQLite: 3.27.2
Pi-hole: v5.0-0-g4d25f695 (master)
Web: v5.0-0-gb86e4a31 (master)
FTL: v5.0 (master)
Configuration: array(17) {
  ["CONFIG_FILE"] => string(28) "/etc/pihole-updatelists.conf"
  ["GRAVITY_DB"] => string(22) "/etc/pihole/gravity.db"
  ["LOCK_FILE"] => string(33) "/var/lock/pihole-updatelists.lock"
  ["LOG_FILE"] => string(0) ""
  ["ADLISTS_URL"] => string(47) "https://v.firebog.net/hosts/lists.php?type=tick"
  ["WHITELIST_URL"] => string(170) "https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/whitelist.txt https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/referral-sites.txt"
  ["REGEX_WHITELIST_URL"] => string(0) ""
  ["BLACKLIST_URL"] => string(570) "https://dbl.oisd.nl/ https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts https://mirror1.malwaredomains.com/files/justdomains https://sysctl.org/cameleon/hosts https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt https://raw.githubusercontent.com/llacb47/mischosts/master/social/tiktok-block https://raw.githubusercontent.com/anudeepND/blacklist/master/adservers.txt https://www.github.developerdan.com/hosts/lists/facebook-extended.txt https://zerodot1.gitlab.io/CoinBlockerLists/hosts"
  ["REGEX_BLACKLIST_URL"] => string(71) "https://raw.githubusercontent.com/mmotti/pihole-regex/master/regex.list"
  ["COMMENT"] => string(29) "Managed by pihole-updatelists"
  ["GROUP_ID"] => int(0)
  ["REQUIRE_COMMENT"] => bool(true)
  ["UPDATE_GRAVITY"] => bool(false)
  ["VACUUM_DATABASE"] => bool(false)
  ["VERBOSE"] => bool(true)
  ["DEBUG"] => bool(true)
  ["DOWNLOAD_TIMEOUT"] => int(60)
}
Options: array(0) {
}

Opened gravity database: /etc/pihole/gravity.db (32.38 MB)

Fetching ADLISTS from 'https://v.firebog.net/hosts/lists.php?type=tick'... done (34 entries)
Processing...
SQL Query: SELECT * FROM `adlist`
SQL Query: SELECT * FROM `adlist` WHERE `enabled` = 1 AND `comment` LIKE "%Managed by pihole-updatelists%"
Ignored: https://raw.githubusercontent.com/PolishFiltersTeam/KADhosts/master/KADhosts_without_controversies.txt
Ignored: https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Spam/hosts
Ignored: https://v.firebog.net/hosts/static/w3kbl.txt
Ignored: https://adaway.org/hosts.txt
Ignored: https://v.firebog.net/hosts/AdguardDNS.txt
Exists: https://v.firebog.net/hosts/Admiral.txt
Ignored: https://raw.githubusercontent.com/anudeepND/blacklist/master/adservers.txt
Ignored: https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt
Ignored: https://v.firebog.net/hosts/Easylist.txt
Ignored: https://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&showintro=0&mimetype=plaintext
Ignored: https://raw.githubusercontent.com/FadeMind/hosts.extras/master/UncheckyAds/hosts
Ignored: https://raw.githubusercontent.com/bigdargon/hostsVN/master/hosts
Ignored: https://v.firebog.net/hosts/Easyprivacy.txt
Ignored: https://v.firebog.net/hosts/Prigent-Ads.txt
Ignored: https://gitlab.com/quidsup/notrack-blocklists/raw/master/notrack-blocklist.txt
Ignored: https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.2o7Net/hosts
Ignored: https://raw.githubusercontent.com/crazy-max/WindowsSpyBlocker/master/data/hosts/spy.txt
Ignored: https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt
Ignored: https://raw.githubusercontent.com/DandelionSprout/adfilt/master/Alternate%20versions%20Anti-Malware%20List/AntiMalwareHosts.txt
Ignored: https://osint.digitalside.it/Threat-Intel/lists/latestdomains.txt
Ignored: https://s3.amazonaws.com/lists.disconnect.me/simple_malvertising.txt
Ignored: https://mirror1.malwaredomains.com/files/justdomains
Exists: https://v.firebog.net/hosts/Prigent-Crypto.txt
Ignored: https://v.firebog.net/hosts/Prigent-Malware.txt
Ignored: https://mirror.cedia.org.ec/malwaredomains/immortal_domains.txt
Ignored: https://www.malwaredomainlist.com/hostslist/hosts.txt
Ignored: https://bitbucket.org/ethanr/dns-blacklists/raw/8575c9f96e5b4a1308f2f12394abd86d0927a4a0/bad_lists/Mandiant_APT1_Report_Appendix_D.txt
Ignored: https://phishing.army/download/phishing_army_blocklist_extended.txt
Ignored: https://gitlab.com/quidsup/notrack-blocklists/raw/master/notrack-malware.txt
Ignored: https://v.firebog.net/hosts/Shalla-mal.txt
Ignored: https://raw.githubusercontent.com/Spam404/lists/master/main-blacklist.txt
Ignored: https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Risk/hosts
Ignored: https://urlhaus.abuse.ch/downloads/hostfile/
Ignored: https://zerodot1.gitlab.io/CoinBlockerLists/hosts_browser

SQL Query: SELECT * FROM `domainlist`
Fetching WHITELIST from 'https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/whitelist.txt https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/referral-sites.txt'... failed to open stream: HTTP request failed! HTTP/1.1 400 Bad Request

SQL Query: SELECT id FROM `domainlist` WHERE `comment` LIKE "%Managed by pihole-updatelists%" AND `enabled` = 1 AND `type` = 2 LIMIT 1
Fetching BLACKLIST from 'https://dbl.oisd.nl/ https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts https://mirror1.malwaredomains.com/files/justdomains https://sysctl.org/cameleon/hosts https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt https://raw.githubusercontent.com/llacb47/mischosts/master/social/tiktok-block https://raw.githubusercontent.com/anudeepND/blacklist/master/adservers.txt https://www.github.developerdan.com/hosts/lists/facebook-extended.txt https://zerodot1.gitlab.io/CoinBlockerLists/hosts'... failed to open stream: HTTP request failed! HTTP/1.1 400 Bad Request

Fetching REGEX_BLACKLIST from 'https://raw.githubusercontent.com/mmotti/pihole-regex/master/regex.list'... done (15 entries)
Processing...
SQL Query: SELECT * FROM `domainlist` WHERE `enabled` = 1 AND `type` = 3 AND `comment` LIKE "%Managed by pihole-updatelists%"
Ignored: ^(.+[_.-])?ad([sxv]?[0-9]*|system)[_.-]
Ignored: ^(.+[_.-])?adse?rv(er?|ice)?s?[0-9]*[_.-]
Ignored: ^(.+[_.-])?telemetry[_.-]
Ignored: ^adim(age|g)s?[0-9]*[_.-]
Ignored: ^adtrack(er|ing)?[0-9]*[_.-]
Ignored: ^advert(s|is(ing|ements?))?[0-9]*[_.-]
Ignored: ^aff(iliat(es?|ion))?[_.-]
Ignored: ^analytics?[_.-]
Ignored: ^banners?[_.-]
Ignored: ^beacons?[0-9]*[_.-]
Ignored: ^count(ers?)?[0-9]*[_.-]
Ignored: ^mads\.
Ignored: ^pixels?[-.]
Ignored: ^stat(s|istics)?[0-9]*[_.-]
Ignored: ^track(ing)?[0-9]*[_.-]

Sending reload signal to Pi-hole's DNS server... done

Memory reached peak usage of 743.24 KB
Finished with 2 error(s) in 0.96 seconds.
Releasing lock and removing lockfile: /var/lock/pihole-updatelists.lock

jacklul commented 4 years ago

Found the bug; it's fixed now. Update and it should work.

biship commented 4 years ago

New lines worked. However, it won't import https://dbl.oisd.nl/ due to the leading 0.0.0.0:

; Remote list URL containing exact domains to blacklist
BLACKLIST_URL="https://dbl.oisd.nl/
https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
https://mirror1.malwaredomains.com/files/justdomains
https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt
https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt
https://raw.githubusercontent.com/llacb47/mischosts/master/social/tiktok-block
https://raw.githubusercontent.com/anudeepND/blacklist/master/adservers.txt
https://www.github.developerdan.com/hosts/lists/facebook-extended.txt
https://zerodot1.gitlab.io/CoinBlockerLists/hosts"

SQL Query: SELECT id FROM `domainlist` WHERE `comment` LIKE "%Managed by pihole-updatelists%" AND `enabled` = 1 AND `type` = 2 LIMIT 1
Fetching BLACKLIST from 'https://dbl.oisd.nl/'... done
Fetching BLACKLIST from 'https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts'... done
Fetching BLACKLIST from 'https://mirror1.malwaredomains.com/files/justdomains'... done
Fetching BLACKLIST from 'https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt'... done
Fetching BLACKLIST from 'https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt'... done
Fetching BLACKLIST from 'https://raw.githubusercontent.com/llacb47/mischosts/master/social/tiktok-block'... done
Fetching BLACKLIST from 'https://raw.githubusercontent.com/anudeepND/blacklist/master/adservers.txt'... done
Fetching BLACKLIST from 'https://www.github.developerdan.com/hosts/lists/facebook-extended.txt'... done
Fetching BLACKLIST from 'https://zerodot1.gitlab.io/CoinBlockerLists/hosts'... done
Merging multiple lists... done (1138488 entries)
Processing...
SQL Query: SELECT * FROM `domainlist` WHERE `enabled` = 1 AND `type` = 1 AND `comment` LIKE "%Managed by pihole-updatelists%"
Invalid: 0.0.0.0 0--e.info
Invalid: 0.0.0.0 0-0-----------------------------------------------------------0.com
Invalid: 0.0.0.0 0-0-0-0-0-0-0-0-0-0-0-0-0-18-0-0-0-0-0-0-0-0-0-0-0-0-0.info
Invalid: 0.0.0.0 0-0-0-0-0-0-0-0-0-0-0-0-0-33-0-0-0-0-0-0-0-0-0-0-0-0-0.info
Invalid: 0.0.0.0 0-0-0-0go.sxn.us
Invalid: 0.0.0.0 0-168.com
Invalid: 0.0.0.0 0-800-email.com

It imports from the GUI fine:

  [i] Target: https://dbl.oisd.nl/
  [✓] Status: Retrieval successful
  [i] Received 938774 domains

jacklul commented 4 years ago

Because those are adlists you're pasting into BLACKLIST_URL. BLACKLIST_URL is for the user blacklist, one domain per line.

You're misconfiguring the script big time and messing up your Pi-hole.

ADLISTS_URL should contain a list of remote files that each contain a list of adlists; you do not insert individual adlists here (insert them directly into Pi-hole's adlists instead).
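
To illustrate why the `Invalid:` lines above appear: a hosts-format line such as `0.0.0.0 0--e.info` is an address followed by a domain, not a bare domain, so a one-domain-per-line check rejects it. The Python sketch below uses a hypothetical validity pattern (`DOMAIN_RE`) and helper (`is_domain`); the script's real validation rules may differ.

```python
import re

# Hypothetical rule: two or more dot-separated labels made of letters,
# digits and inner hyphens. The real script may validate differently.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
DOMAIN_RE = re.compile(rf"{LABEL}(?:\.{LABEL})+")


def is_domain(line: str) -> bool:
    """Return True when the whole line is a single bare domain."""
    return DOMAIN_RE.fullmatch(line.strip()) is not None


for line in ["0.0.0.0 0--e.info", "0--e.info"]:
    status = "Valid" if is_domain(line) else "Invalid"
    print(f"{status}: {line}")
```

The space between the address and the domain is enough to fail the check, which matches the pattern of the rejected entries in the log.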

biship commented 4 years ago

https://hosts.oisd.nl/ contains a list of domains, one per line. So by your definition it's a blacklist, and belongs in "BLACKLIST_URL".

In the Pi-hole GUI, it lives in "Adlist group management" and works fine. It retrieves all 938774 domains.

It doesn't work when placed in "ADLISTS_URL" because it's not a list of lists (like firebog.net is). It errors when placed in "BLACKLIST_URL", as shown above.

You can even see that your script successfully downloads all 9 adlists without issue and merges them: "Merging multiple lists... done (1138488 entries)". It just can't insert them into gravity.

jacklul commented 4 years ago

> https://hosts.oisd.nl/ contains a list of domains, one per line.

But it is actually an adlist; Pi-hole accepts formats with one domain per line as well. BLACKLIST_URL is mostly for user entries or for syncing multiple Pis.

It is of course up to you how you want to handle it, but it will greatly impact the performance of your Pi-hole if you put it in as a blacklist (it will insert over 900,000 entries as blacklist entries without creating the optimized indexed tree).

> It just can't insert them into gravity.

My script doesn't touch gravity; it's generated by Pi-hole only from adlists. The blacklist and whitelist don't affect gravity.

biship commented 4 years ago

OK, I apologize for my misunderstanding, but clearly I do not understand how to use Pi-hole and/or your script correctly. I have the lists shown above in my /etc/pihole-updatelists.conf.

Assuming all those lists contain what I want to block, how do I put them into /etc/pihole-updatelists.conf so that they are used correctly and optimally?

I plan to re-enable the gravity update in your script as the final step after all lists are downloaded & merged.

jacklul commented 4 years ago

ADLISTS_URL - must contain addresses to files containing LIST OF ADLISTS. Example: https://v.firebog.net/hosts/lists.php?type=tick

WHITELIST_URL - must contain addresses to files containing domains to whitelist (one domain per line). Example: https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/whitelist.txt

REGEX_BLACKLIST_URL - must contain addresses to files containing REGEX rules to blacklist (one rule per line). Example: https://raw.githubusercontent.com/mmotti/pihole-regex/master/regex.list

REGEX_WHITELIST_URL - like REGEX_BLACKLIST_URL but for whitelist

BLACKLIST_URL - like WHITELIST_URL but for blacklist (not adlist!)

WHITELIST and BLACKLIST are for your custom entries which you want to whitelist/blacklist.

If you need to add adlists by hand, do it in Pi-hole's web GUI.
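
The per-setting expectations described above can be sketched as a small sanity check to run on a downloaded file before pointing a setting at it. This is only an illustration in Python, not part of the script: `EXPECTED`, `unexpected_lines` and the individual checks are hypothetical, the domain pattern is an assumed rule, and Python's `re` only stands in for Pi-hole's own regex engine, whose syntax may differ. A format check also cannot tell a large adlist with one domain per line (such as the OISD list discussed in this thread) from a custom blacklist; that remains a judgment call.

```python
import re
from urllib.parse import urlparse

# Assumed domain rule: dot-separated labels of letters, digits, inner hyphens.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
DOMAIN_RE = re.compile(rf"{LABEL}(?:\.{LABEL})+")


def is_url(line: str) -> bool:
    parts = urlparse(line)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def is_domain(line: str) -> bool:
    return DOMAIN_RE.fullmatch(line) is not None


def is_regex(line: str) -> bool:
    # Assumption: Python's regex syntax approximates the rules Pi-hole accepts.
    try:
        re.compile(line)
        return True
    except re.error:
        return False


# What each setting's remote file should contain, one item per line.
EXPECTED = {
    "ADLISTS_URL": is_url,            # list of adlist URLs
    "WHITELIST_URL": is_domain,       # exact domains
    "BLACKLIST_URL": is_domain,       # exact domains (not an adlist)
    "REGEX_WHITELIST_URL": is_regex,  # regex rules
    "REGEX_BLACKLIST_URL": is_regex,  # regex rules
}


def unexpected_lines(setting: str, lines: list[str]) -> list[str]:
    """Return the non-blank lines that do not fit the setting's format."""
    check = EXPECTED[setting]
    return [ln for ln in lines if ln.strip() and not check(ln.strip())]


print(unexpected_lines("BLACKLIST_URL", ["0.0.0.0 0--e.info", "0-168.com"]))
```

An empty result means every line has the expected shape for that setting; any returned lines are the ones worth inspecting first.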