mokeyish / smartdns-rs

A cross platform local DNS server (Dnsmasq like) written in rust to obtain the fastest website IP for the best Internet experience, supports DoT, DoQ, DoH, DoH3.
GNU General Public License v3.0
600 stars 42 forks source link

Version 0.9.0 on Windows 11 times out on queries after running for a long time #431

Open debugg-a opened 3 weeks ago

debugg-a commented 3 weeks ago

OS: Windows 11 23H2
smartdns version: 0.9.0
Symptom: after the system has been running continuously without a shutdown for some time, DNS queries start timing out. [image]
The log output is as follows:

thread 'smartdns-runtime' panicked at library\std\src\time.rs:433:33:
overflow when subtracting duration from instant
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
thread 'smartdns-runtime' panicked at library\std\src\time.rs:433:33:
overflow when subtracting duration from instant
[the same two-line panic message repeats many more times]
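
For context on the panic text: in Rust's standard library, subtracting a `Duration` from an `Instant` with the `-` operator panics with exactly this message when the result cannot be represented, which can happen when the duration reaches back past the start of the platform's monotonic clock. `Instant::checked_sub` returns `None` in that case instead. The sketch below is not smartdns-rs code, and the thread does not show where the subtraction happens; `cutoff` and `is_within_window` are hypothetical helper names used only to illustrate the non-panicking form.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not from smartdns-rs): the instant `window` before `now`,
/// or `None` when that point cannot be represented. Writing `now - window`
/// would panic in that case with "overflow when subtracting duration from instant".
pub fn cutoff(now: Instant, window: Duration) -> Option<Instant> {
    now.checked_sub(window)
}

/// Hypothetical helper: was `stored_at` recorded within `window` of `now`?
pub fn is_within_window(now: Instant, stored_at: Instant, window: Duration) -> bool {
    match cutoff(now, window) {
        Some(c) => stored_at >= c,
        // The window reaches back further than the clock can represent,
        // so every representable instant lies inside it.
        None => true,
    }
}
```

With these helpers, `cutoff(Instant::now(), Duration::MAX)` yields `None` rather than aborting the thread, so the caller decides what an unrepresentable cutoff should mean.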

mokeyish commented 3 weeks ago

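Do you have a minimal configuration that reproduces this? It may be caused by the upstream servers, for example the recent rate limiting or timeouts on Alibaba's side.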

debugg-a commented 3 weeks ago

# Configure bootstrap-dns; if it is not configured, the system resolver is used
server https://1.12.12.12/dns-query -bootstrap-dns -exclude-default-group
# Domestic (China) upstream servers
# TencentDNS
server https://doh.pub/dns-query -group doh-tencent -exclude-default-group
# AliDNS
server https://dns.alidns.com/dns-query -group doh-alidns -exclude-default-group
# 360DNS
server https://doh.360.cn/dns-query -group doh-cn -exclude-default-group
# OneDNS
server https://doh-pure.onedns.net/dns-query -group doh-cn -exclude-default-group
# Overseas upstream servers
server tls://9.9.9.9
server tls://1.0.0.1
# Resolve domestic domains through domestic DNS
nameserver /domain-set:set-alibaba/doh-alidns
nameserver /domain-set:set-tencent/doh-tencent
nameserver /domain-set:set-cn/doh-cn

xshzr commented 3 weeks ago

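Same here: 0.9.0 on Windows 10 hits the same problem. [image]

Queries work right after smartdns starts, but after a few minutes to ten-odd minutes (I have no exact figure, but it is short), queries start timing out. Simply restarting smartdns from Services does not help. Stopping smartdns, deleting the cache file on disk, and starting smartdns again makes it work, and then after a while the problem comes back. I suspect it is related to the cache.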

mokeyish commented 3 weeks ago

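@xshzr Could you disable the cache and try that for a while? If that turns out to be it, I will focus on checking that part of the code.

I ran it on my own Windows machine, with the log output changed as well, and watched it: no such problem in 6 hours. Maybe my machine has too many cores 😅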

xshzr commented 3 weeks ago

After setting cache-size 0, what do you know, there were no timeouts all afternoon.

mokeyish commented 3 weeks ago

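> After setting cache-size 0, what do you know, there were no timeouts all afternoon.

Then what about enabling the cache but disabling domain prefetch?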

xshzr commented 3 weeks ago

In the configuration I used earlier with the cache enabled, prefetch was already turned off:

cache-size 30000
cache-file O:/smartdns/cache/dnscache.txt
cache-persist yes
cache-checkpoint-time 10800
prefetch-domain no
serve-expired yes
serve-expired-ttl 10000
serve-expired-reply-ttl 3
serve-expired-prefetch-time 43200
rr-ttl-min 300

mokeyish commented 3 weeks ago

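Then the messageType may be wrong. Reading from the cache requires rebuilding the Message from the DNS records; I noticed this bug yesterday and fixed it in passing. I have been monitoring on my side the whole time, and even after switching to a single thread I do not see this problem (though what is reported here is about time and timeouts, which does not really look like something messageType would cause).

I will push this fix later and we can see.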
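
The remark above, that a reply served from cache has to be rebuilt from the stored DNS records and that the message type came out wrong, can be pictured with a minimal sketch. The types below are simplified stand-ins invented for this example; they are not the hickory or smartdns-rs types, and the actual fix is not shown in this thread.

```rust
use std::net::Ipv4Addr;

/// Stand-in for the DNS header's query/response flag (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MessageType {
    Query,
    Response,
}

/// Stand-in for a cached A record (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Record {
    pub name: String,
    pub ttl: u32,
    pub addr: Ipv4Addr,
}

/// Stand-in for a DNS message (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Message {
    pub id: u16,
    pub message_type: MessageType,
    pub answers: Vec<Record>,
}

/// Rebuild a reply for request `id` from cached records.
/// The point of the sketch: the rebuilt message must be marked as a
/// Response, because only the records are stored, not the original header.
pub fn rebuild_from_cache(id: u16, cached: &[Record]) -> Message {
    Message {
        id,
        message_type: MessageType::Response,
        answers: cached.to_vec(),
    }
}
```

A client that receives a message still flagged as a query would typically ignore it and keep waiting, which is one way a wrong message type could surface as a timeout; whether that is what happened here is not established in the thread.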

mokeyish commented 3 weeks ago

#433

Download: https://github.com/mokeyish/smartdns-rs/actions/runs/11762199344

xshzr commented 3 weeks ago

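> #433
>
> Download: https://github.com/mokeyish/smartdns-rs/actions/runs/11762199344

Fixed. The problem described earlier in this thread has not reappeared on my side.

However, I still have an old problem that has persisted across several versions: on an Android phone, once DNS is set to the IP address of the computer running the smartdns service, some resolutions break. For example, WeChat official-account articles open but the images in them do not display; Xianyu opens, but its check-in page keeps showing "page failed to load"; Xianyu images open, but videos do not play.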

debugg-a commented 3 weeks ago

> #433
>
> Download: https://github.com/mokeyish/smartdns-rs/actions/runs/11762199344

With this build, with the cache enabled and prefetch enabled, timeouts started in under 3 hours.

cache-size 32768
cache-persist yes

[image]

debugg-a commented 3 weeks ago

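> > #433
> >
> > Download: https://github.com/mokeyish/smartdns-rs/actions/runs/11762199344
>
> With this build, with the cache enabled and prefetch enabled, timeouts started in under 3 hours.
>
> cache-size 32768
> cache-persist yes
>
> [image]

With the cache enabled: [image]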

xshzr commented 3 weeks ago

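Yesterday afternoon I really did not seem to hit the earlier timeout problem, and I also thought it was fixed. I shut the machine down in the evening, and after booting today mp.weixin.qq.com timed out again; restarting the smartdns service did not help either. My cache is saved to disk periodically via cache-checkpoint-time. Could something be going wrong when the on-disk cache is read back the next day?

Here is a section of the log from after restarting smartdns today: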

[smartdns ASCII-art startup banner, several INFO lines at 2024-11-11 11:55:05.973]
2024-11-11 11:55:05.973:INFO: awaiting connections...
2024-11-11 11:55:05.973:INFO: server starting up
2024-11-11 11:55:06.181:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58204
2024-11-11 11:55:06.182:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58204
2024-11-11 11:55:06.182:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58204
2024-11-11 11:55:07.317:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.246:58356
2024-11-11 11:55:08.82:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58655
2024-11-11 11:55:08.83:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58655
2024-11-11 11:55:08.83:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58655
2024-11-11 11:55:08.84:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58655
2024-11-11 11:55:08.84:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:58655
2024-11-11 11:55:09.320:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.246:58357
2024-11-11 11:55:09.320:DEBUG:smartdns::app:363: Request: 11 src:udp://192.168.88.246#58357 type:QUERY dnssec:false QUERY:mp.weixin.qq.com.:AAAA:IN qflags:RD
2024-11-11 11:55:09.320:DEBUG:smartdns::app:374: Response: ; header 0:RESPONSE::NoError:QUERY:1/0/0 ; query ;; mp.weixin.qq.com. IN AAAA ; answers 1 mp.weixin.qq.com. 300 IN SOA a.gtld-servers.net nstld.verisign-grs.com 1800 1800 900 604800 86400 ; nameservers 0 ; additionals 0 , Duration: 40µs
2024-11-11 11:55:09.321:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.246:58358
2024-11-11 11:55:10.614:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:59628
2024-11-11 11:55:10.614:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:59628
2024-11-11 11:55:10.615:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:59628
2024-11-11 11:55:10.615:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:59628
2024-11-11 11:55:10.616:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.198:59628
2024-11-11 11:55:11.335:DEBUG:smartdns::server::udp:45: received udp request from: 192.168.88.246:58359
2024-11-11 11:55:11.335:DEBUG:smartdns::app:363: Request: 13 src:udp://192.168.88.246#58359 type:QUERY dnssec:false QUERY:mp.weixin.qq.com.:AAAA:IN qflags:RD
2024-11-11 11:55:11.335:DEBUG:smartdns::app:374: Response: ; header 0:RESPONSE::NoError:QUERY:1/0/0 ; query ;; mp.weixin.qq.com. IN AAAA ; answers 1 mp.weixin.qq.com. 300 IN SOA a.gtld-servers.net nstld.verisign-grs.com 1800 1800 900 604800 86400 ; nameservers 0 ; additionals 0 , Duration: 33.5µs

mokeyish commented 3 weeks ago

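This is hard for me to reproduce; please help narrow down which of these is the problem.

  1. Cache prefetch (disable prefetch, keep the cache enabled)
  2. Cache (disable the cache)
  3. Speed test (enable fastest-response; this picks the fastest response, which amounts to no speed test)

debugg-a commented 3 weeks ago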

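> This is hard for me to reproduce; please help narrow down which of these is the problem.
>
>   1. Cache prefetch (disable prefetch, keep the cache enabled)
>   2. Cache (disable the cache)
>   3. Speed test (enable fastest-response; this picks the fastest response, which amounts to no speed test)

[image]

Prefetch disabled, cache enabled: timeouts occur.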

debugg-a commented 3 weeks ago

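> > This is hard for me to reproduce; please help narrow down which of these is the problem.
> >
> >   1. Cache prefetch (disable prefetch, keep the cache enabled)
> >   2. Cache (disable the cache)
> >   3. Speed test (enable fastest-response; this picks the fastest response, which amounts to no speed test)
>
> [image]
>
> Prefetch disabled, cache enabled: timeouts occur.

Prefetch disabled, cache enabled, fastest-response enabled: timeouts occur. [image]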

mokeyish commented 3 weeks ago

One more test: for every server entry that uses a domain name, pin it to an IP, for example:

server https://cloudflare-dns.com/dns-query?ip=1.1.1.1

and see whether it still times out.
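
The idea behind this test is that a DoH server given by hostname must itself be resolved before it can be queried, while an `ip=` parameter supplies the address up front. How smartdns-rs actually handles the parameter is not shown in this thread; the sketch below only illustrates pulling such a pinned address out of a server URL, and `pinned_ip` is a hypothetical name.

```rust
use std::net::IpAddr;

/// Hypothetical helper (not from smartdns-rs): extract the address given by an
/// `ip=` query parameter in a server URL, if one is present and parses as an IP.
pub fn pinned_ip(server_url: &str) -> Option<IpAddr> {
    // Everything after the first '?' is treated as the query string.
    let (_, query) = server_url.split_once('?')?;
    query
        .split('&')
        .filter_map(|pair| pair.split_once('='))
        .find(|(key, _)| *key == "ip")
        .and_then(|(_, value)| value.parse().ok())
}
```

For the example line above this returns the address 1.1.1.1; for a URL without the parameter, such as `https://doh.pub/dns-query`, it returns `None`, meaning the hostname would still have to be resolved some other way.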

debugg-a commented 2 weeks ago

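> One more test: for every server entry that uses a domain name, pin it to an IP, for example:
>
> server https://cloudflare-dns.com/dns-query?ip=1.1.1.1
>
> and see whether it still times out.

[image]

Almost an hour in, and no timeouts yet.

// Update: a full day now, and no timeouts so far. [image]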

xshzr commented 2 weeks ago

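Here is a pattern I have observed. In my setup smartdns runs on my personal computer, which is shut down every day.

Every time the computer boots, smartdns starts timing out on queries. At first I thought the cache saved to disk before the previous shutdown was the problem, so I stopped smartdns, deleted the cache, and started smartdns again. Queries worked at that point, but a few minutes later, domains that had already been queried timed out again, until the expired-cache timeout serve-expired-ttl kicked in; after that everything was normal and nothing timed out anymore. With serve-expired-ttl set to 5 minutes, it recovers after 5 minutes; with 1 hour, after 1 hour.

I then found that even without shutting down the computer, restarting the smartdns service brings back the timeouts described above. Whether the on-disk cache is deleted during the restart makes no big difference. The only difference is that with the cache deleted the first query succeeds and then times out a few minutes later, whereas with the cache kept it times out straight away.

Also, once the timeouts start, the log contains only AAAA responses and no A responses, even when -type=A is specified:

2024-11-12 10:36:38.563:DEBUG:smartdns::app:363: Request: 11 src:udp://192.168.88.246#55040 type:QUERY dnssec:false QUERY:www.msn.com.:AAAA:IN qflags:RD
2024-11-12 10:36:38.563:DEBUG:smartdns::app:374: Response: ; header 0:RESPONSE::NoError:QUERY:1/0/0 ; query ;; www.msn.com. IN AAAA ; answers 1 www.msn.com. 300 IN SOA a.gtld-servers.net nstld.verisign-grs.com 1800 1800 900 604800 86400 0722

All of the above was observed without the "pin every domain-name server to an IP" change applied.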
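
The observation above ties recovery to the serve-expired-ttl window. One way to picture that window is as a three-state lifetime for a cache entry. This is an assumed model of the option's meaning (an expired entry stays servable for serve-expired-ttl after its TTL runs out), not code from smartdns-rs; `CacheState` and `classify` are hypothetical names.

```rust
use std::time::Duration;

/// Hypothetical lifetime states of a cache entry under serve-expired.
#[derive(Debug, PartialEq, Eq)]
pub enum CacheState {
    /// Still within its TTL.
    Fresh,
    /// TTL has run out, but the entry may still be served as an expired answer.
    ServeExpired,
    /// Past the serve-expired window; a new upstream query is needed.
    Evicted,
}

/// Classify an entry by its age. Assumes the serve-expired window starts
/// when the TTL ends, which is an assumption of this sketch.
pub fn classify(age: Duration, ttl: Duration, serve_expired_ttl: Duration) -> CacheState {
    if age < ttl {
        CacheState::Fresh
    } else if age < ttl.saturating_add(serve_expired_ttl) {
        CacheState::ServeExpired
    } else {
        CacheState::Evicted
    }
}
```

Under this model, an entry with a 300-second TTL and a 5-minute serve-expired-ttl leaves the `ServeExpired` state 600 seconds after it was stored, which matches the shape of the report that a shorter serve-expired-ttl makes the timeouts end sooner.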

mokeyish commented 2 days ago

I have put together a repository for DNS performance testing.

https://github.com/mokeyish/dnsperf-testing

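Testing hickory, which this repository depends on, I found a large number of timeouts, around 50%, yet when people overseas ran the same test they could not reproduce it. Could you check whether you can reproduce it? The test tool it relies on may not be available on Windows, so you would need to install WSL.

Below is the issue I filed, which describes the test steps. The idea is to test 119.29.29.29 directly and compare that against forwarding to 119.29.29.29 through hickory.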

https://github.com/hickory-dns/hickory-dns/issues/2613

mokeyish commented 2 days ago

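@xshzr The pattern you describe essentially comes down to the cache. It now looks like the problem may be in the underlying library, hickory. The hickory timeout problem has to be solved first, because smartdns-rs depends on its proto protocol library.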