0xERR0R / blocky

Fast and lightweight DNS proxy as ad-blocker for local network with many features
Apache License 2.0

Parsing a blocking list that contains errors seems to block the process #1130

Closed sylvek closed 1 year ago

sylvek commented 1 year ago

Periodically, some websites that should be blocked are unexpectedly reachable. I tried to find the cause of this issue and ran some tests.

My configuration looks correct, and I use the list below to filter what I allow at home: https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling-porn-social/hosts

Basically, pornhub.com and linkedin.com should come back blocked when I "dig" them, and most of the time everything works fine.

Sadly, sometimes I can reach them from my computer. When this happens, I have verified that I am really querying my own Blocky instance (I wondered whether I was somehow querying another one). So I set the log level to debug, restarted my Blocky instance, and saw some errors:

[2023-09-07 10:25:42]  WARN config option "upstream" is deprecated, please use "upstreams.groups" instead
[2023-09-07 10:25:42] ERROR configuration uses deprecated options, see warning logs for details
[2023-09-07 10:25:42]  INFO _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
[2023-09-07 10:25:42]  INFO _/                                                              _/
[2023-09-07 10:25:42]  INFO _/                                                              _/
[2023-09-07 10:25:42]  INFO _/       _/        _/                      _/                   _/
[2023-09-07 10:25:42]  INFO _/      _/_/_/    _/    _/_/      _/_/_/  _/  _/    _/    _/    _/
[2023-09-07 10:25:42]  INFO _/     _/    _/  _/  _/    _/  _/        _/_/      _/    _/     _/
[2023-09-07 10:25:42]  INFO _/    _/    _/  _/  _/    _/  _/        _/  _/    _/    _/      _/
[2023-09-07 10:25:42]  INFO _/   _/_/_/    _/    _/_/      _/_/_/  _/    _/    _/_/_/       _/
[2023-09-07 10:25:42]  INFO _/                                                    _/        _/
[2023-09-07 10:25:42]  INFO _/                                               _/_/           _/
[2023-09-07 10:25:42]  INFO _/                                                              _/
[2023-09-07 10:25:42]  INFO _/                                                              _/
[2023-09-07 10:25:42]  INFO _/  Version: v0.22              Build time: 20230830-193615     _/
[2023-09-07 10:25:42]  INFO _/                                                              _/
[2023-09-07 10:25:42]  INFO _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
[2023-09-07 10:25:42]  WARN config option "upstream" is deprecated, please use "upstreams.groups" instead
[2023-09-07 10:25:42] ERROR configuration uses deprecated options, see warning logs for details
[2023-09-07 10:25:42]  INFO bootstrap: bootstrapDns is not configured, will use system resolver
[2023-09-07 10:25:42] DEBUG list_cache: starting processing of source count=0 source=https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling-porn-social/hosts
[2023-09-07 10:25:51]  WARN list_cache: parse error: line 268509: 2 errors occurred:
    * unexpected second column: archive.ph/gKMcR
    * invalid domain name: archive.ph/gKMcR

, trying to continue count=261732 source=https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling-porn-social/hosts
[2023-09-07 10:25:51]  WARN list_cache: parse error: line 268510: 2 errors occurred:
    * unexpected second column: archive.ph/goUZE
    * invalid domain name: archive.ph/goUZE

, trying to continue count=261732 source=https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling-porn-social/hosts
[2023-09-07 10:25:51]  WARN list_cache: parse error: line 268511: 2 errors occurred:
    * unexpected second column: archive.ph/nWwIO
    * invalid domain name: archive.ph/nWwIO

, trying to continue count=261732 source=https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling-porn-social/hosts
[2023-09-07 10:25:51]  WARN list_cache: parse error: line 268512: 2 errors occurred:
    * unexpected second column: archive.ph/RHJht
    * invalid domain name: archive.ph/RHJht

, trying to continue count=261732 source=https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling-porn-social/hosts
[2023-09-07 10:25:51]  WARN list_cache: parse error: line 268513: 2 errors occurred:
    * unexpected second column: archive.ph/tYeYc
    * invalid domain name: archive.ph/tYeYc

, trying to continue count=261732 source=https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling-porn-social/hosts
[2023-09-07 10:25:51]  WARN list_cache: parse error: line 268514: 2 errors occurred:
    * unexpected second column: archive.vn/6hOf5
    * invalid domain name: archive.vn/6hOf5

, trying to continue count=261732 source=https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling-porn-social/hosts
[2023-09-07 10:25:51] ERROR list_cache: parse error: line 268514: too many parse errors count=261732 source=https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling-porn-social/hosts
[2023-09-07 10:25:51]  INFO list_cache: group import finished group=forbidden total_count=261730
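
The log above suggests that parsing stops at the sixth error (line 268514) when `maxErrorsPerSource` is 5, so every entry after that line is never loaded. Below is a minimal Python sketch of that behavior as I read it from the log. It is not Blocky's actual code: `parse_hosts`, `DOMAIN_RE`, and the rule that a negative limit disables the budget are assumptions made for illustration.

```python
import re

# Hypothetical, simplified domain check: dot-separated labels, no "/" allowed.
DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)([A-Za-z0-9_-]{1,63}\.)*[A-Za-z0-9_-]{1,63}$"
)


def parse_hosts(lines, max_errors=5):
    """Parse hosts-style lines with an error budget.

    Returns (domains, aborted_line). aborted_line is the 1-based line
    number at which parsing gave up, or None if it reached the end.
    """
    domains = []
    errors = 0
    for lineno, raw in enumerate(lines, start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        domain = fields[-1]
        if len(fields) > 2 or not DOMAIN_RE.match(domain):
            errors += 1
            # Assumption: a negative limit means "never abort".
            if max_errors >= 0 and errors > max_errors:
                return domains, lineno  # everything after this line is lost
            continue
        domains.append(domain)
    return domains, None
```

With six bad lines in a row and `max_errors=5`, the function returns at the sixth bad line and any valid entry after it is dropped, which matches the symptom described in this issue.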

When I run dig, I get an answer for www.twitter.com:

▶ dig www.twitter.com

; <<>> DiG 9.10.6 <<>> www.twitter.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3471
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

; EDNS: version: 0, flags:; udp: 1472
;www.twitter.com.       IN  A

www.twitter.com.    574 IN  CNAME   twitter.com.
twitter.com.        1794    IN  A

;; Query time: 38 msec
;; WHEN: Thu Sep 07 12:28:10 CEST 2023
;; MSG SIZE  rcvd: 74

But another website on the downloaded list is blocked as expected:

▶ dig pornhub.com

; <<>> DiG 9.10.6 <<>> pornhub.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2199
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;pornhub.com.           IN  A

pornhub.com.        21600   IN  A

;; Query time: 40 msec
;; WHEN: Thu Sep 07 12:35:05 CEST 2023
;; MSG SIZE  rcvd: 45

So I wonder whether, when Blocky downloads a new version of the file (every 4 hours), the file is sometimes broken and, instead of skipping to the next domain name, Blocky stops parsing. For instance, I downloaded the file myself and counted more lines than the number of entries Blocky reported (TOTAL: 261730 entries).

▶ wget https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling-porn-social/hosts
▶ wc -l hosts
  289642 hosts
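
Note that `wc -l` also counts comment and blank lines, so its result is an upper bound rather than the number of entries. A minimal Python sketch for counting only the lines that carry an entry follows; `count_entries` is a hypothetical helper, and it assumes that `#` starts a comment in the hosts file.

```python
def count_entries(lines):
    """Count lines that still contain text after stripping '#' comments."""
    count = 0
    for raw in lines:
        if raw.split("#", 1)[0].strip():
            count += 1
    return count
```

It can be run on the downloaded file with `count_entries(open("hosts", encoding="utf-8"))`, and the result compared with the total that Blocky logs.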

Here is the rest of the log:

[2023-09-07 10:25:51]  INFO server: current configuration:
[2023-09-07 10:25:51] DEBUG server: -> filtering: disabled
[2023-09-07 10:25:51] DEBUG server: -> fqdn_only: disabled
[2023-09-07 10:25:51] DEBUG server: -> client_names: disabled
[2023-09-07 10:25:51] DEBUG server: -> extended_error_code: disabled
[2023-09-07 10:25:51]  INFO server: -> query_logging:
[2023-09-07 10:25:51]  INFO server:      type: console
[2023-09-07 10:25:51]  INFO server:      logRetentionDays: 0
[2023-09-07 10:25:51] DEBUG server:      creationAttempts: 3
[2023-09-07 10:25:51] DEBUG server:      creationCooldown: 2 seconds
[2023-09-07 10:25:51]  INFO server:      fields: [clientIP clientName responseReason responseAnswer question duration]
[2023-09-07 10:25:51] DEBUG server: -> metrics: disabled
[2023-09-07 10:25:51] DEBUG server: -> custom_dns: disabled
[2023-09-07 10:25:51] DEBUG server: -> hosts_file: disabled
[2023-09-07 10:25:51]  INFO server: -> blocking:
[2023-09-07 10:25:51]  INFO server:      clientGroupsBlock:
[2023-09-07 10:25:51]  INFO server:        default = [forbidden]
[2023-09-07 10:25:51]  INFO server:      blockType = ZEROIP
[2023-09-07 10:25:51]  INFO server:      blockTTL = 6 hours
[2023-09-07 10:25:51]  INFO server:      loading:
[2023-09-07 10:25:51]  INFO server:        concurrency = 4
[2023-09-07 10:25:51] DEBUG server:        maxErrorsPerSource = 5
[2023-09-07 10:25:51] DEBUG server:        strategy = blocking
[2023-09-07 10:25:51]  INFO server:        refresh = every 4 hours
[2023-09-07 10:25:51]  INFO server:        downloads:
[2023-09-07 10:25:51]  INFO server:          timeout = 5 seconds
[2023-09-07 10:25:51]  INFO server:          attempts = 3
[2023-09-07 10:25:51] DEBUG server:          cooldown = 500 milliseconds
[2023-09-07 10:25:51]  INFO server:      blacklist:
[2023-09-07 10:25:51]  INFO server:        forbidden:
[2023-09-07 10:25:51]  INFO server:           - https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling-porn-social/hosts
[2023-09-07 10:25:51]  INFO server:      whitelist:
[2023-09-07 10:25:51]  INFO server:      blacklist cache entries:
[2023-09-07 10:25:51]  INFO server:        forbidden: 261730 entries
[2023-09-07 10:25:51]  INFO server:        TOTAL: 261730 entries
[2023-09-07 10:25:51]  INFO server:      whitelist cache entries:
[2023-09-07 10:25:51]  INFO server:        TOTAL: 0 entries
[2023-09-07 10:25:51] DEBUG server: -> caching: disabled
[2023-09-07 10:25:51] DEBUG server: -> conditional_upstream: disabled
[2023-09-07 10:25:51]  INFO server: -> special_use_domains:
[2023-09-07 10:25:51] DEBUG server:      rfc6762-appendixG = true
[2023-09-07 10:25:51]  INFO server: -> parallel_best:
[2023-09-07 10:25:51]  INFO server:      timeout: 0 seconds
[2023-09-07 10:25:51]  INFO server:      strategy: parallel_best
[2023-09-07 10:25:51]  INFO server:      groups:
[2023-09-07 10:25:51]  INFO server:        default:
[2023-09-07 10:25:51]  INFO server:          - tcp+udp:
[2023-09-07 10:25:51]  INFO server:          - tcp+udp:
[2023-09-07 10:25:51]  INFO server:          - tcp+udp:
[2023-09-07 10:25:51]  INFO server:          - tcp+udp:
[2023-09-07 10:25:51]  INFO server:          - tcp+udp:
[2023-09-07 10:25:51]  INFO server:          - tcp+udp:
[2023-09-07 10:25:51]  INFO server: listeners:
[2023-09-07 10:25:51]  INFO server:   DNS   = [53]
[2023-09-07 10:25:51]  INFO server:   TLS   = []
[2023-09-07 10:25:51]  INFO server:   HTTP  = [4000]
[2023-09-07 10:25:51]  INFO server:   HTTPS = []
[2023-09-07 10:25:51]  INFO server: runtime information:
[2023-09-07 10:25:51]  INFO server:   numCPU =       2
[2023-09-07 10:25:51]  INFO server:   numGoroutine = 11
[2023-09-07 10:25:51]  INFO server:   memory:
[2023-09-07 10:25:51]  INFO server:     alloc =                 6 MB
[2023-09-07 10:25:51]  INFO server:     heapAlloc =             6 MB
[2023-09-07 10:25:51]  INFO server:     sys =                  46 MB
[2023-09-07 10:25:51]  INFO server:     numGC =               253
[2023-09-07 10:25:51]  INFO server: Starting server
[2023-09-07 10:25:51]  INFO server: TCP server is up and running on address :53
[2023-09-07 10:25:51]  INFO server: http server is up and running on addr/port 4000
[2023-09-07 10:25:51]  INFO server: UDP server is up and running on address :53

My config.yml, if you need it:

      - https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling-porn-social/hosts
      - forbidden
  dns: 53
  http: 4000
  level: debug

sylvek commented 1 year ago

Indeed, the first line with an error is line 268509 of the file, and www.twitter.com is on line 289155, after the point where parsing stopped.

www.pornhub.com is on line 263236, before the errors, which is why I can't reach it.
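
The comparison above can be sketched in Python: look up the line on which a domain appears and check whether it comes before the line where parsing was aborted. `line_of` and `is_loaded` are hypothetical helpers, and the abort line is taken from the log (268514), not from Blocky itself.

```python
def line_of(lines, domain):
    """Return the 1-based line number of the first entry for domain, or None."""
    for lineno, raw in enumerate(lines, start=1):
        fields = raw.split("#", 1)[0].split()
        if fields and fields[-1] == domain:
            return lineno
    return None


def is_loaded(lines, domain, abort_line):
    """True if domain is listed before the line where parsing gave up."""
    pos = line_of(lines, domain)
    return pos is not None and (abort_line is None or pos < abort_line)
```

For the file in this issue, `is_loaded(lines, "www.twitter.com", 268514)` would be expected to be false and `is_loaded(lines, "www.pornhub.com", 268514)` true, given the line numbers quoted above.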

sylvek commented 1 year ago

maxErrorsPerSource: 5 should help 👍

sylvek commented 1 year ago

Judging from the code and the configuration, adjusting maxErrorsPerSource (which defaults to 5) could help work around this issue.