lluu131 opened this issue 10 months ago. The first reply came 2 hours later.
@lluu131, hello and thanks for the thorough report. Unfortunately, we can't reproduce the leak. It would really help us to troubleshoot this issue if you could collect a goroutines profile for us.
To do that, restart the dnsproxy service with profiling enabled. To enable it, use the --pprof CLI option, or set pprof: true in the YAML configuration file. When the memory grows to the suspicious level again, run the following command:
curl "http://127.0.0.1:6060/debug/pprof/goroutine?debug=1" > profile.txt
Or just open http://127.0.0.1:6060/debug/pprof/goroutine?debug=1 in your web browser.
Note that the profile can only be accessed from the same host machine.
You can send the resulting profile to our devteam@adguard.com.
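For context, the pprof option presumably exposes Go's standard net/http/pprof endpoints on 127.0.0.1:6060 (the exact wiring inside dnsproxy is an assumption). A minimal standalone sketch of that mechanism:

package main

import (
	"log"
	"net/http"
	// The blank import registers the /debug/pprof/* handlers, including
	// /debug/pprof/goroutine, on the default serve mux.
	_ "net/http/pprof"
)

func main() {
	// Listening on the loopback interface only is why the profile has to
	// be fetched from the same host machine.
	log.Fatal(http.ListenAndServe("127.0.0.1:6060", nil))
}

With such a listener running, the curl command above saves a plain-text dump of every goroutine's stack, which is usually enough to see which call sites are accumulating.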
all-servers: yes
Memory is fine after commenting out fastest-addr
Debug log re-collected. I just found out that memory increases massively while the QUIC upstream server is unreachable, but it is not freed or reduced when connectivity recovers.
@EugeneOne1, Profile.txt has been sent by e-mail.
@lluu131, hello again. Thank you for your help; the profile clarified the issue for us. We've pushed a patch (v0.63.1) that may improve the situation. Could you please check if it does?
If the issue persists, would you mind collecting the profile again? We'd also like to take a look at the verbose log (verbose: true in the YAML configuration), if it's possible to collect it.
Already done with both client and server updates. I noticed from the verbose log that the client is querying the root DNS servers every second; is this normal?
Tested for a few hours. Memory increases during QUIC upstream interruptions and stops increasing after the upstream resumes (but it is not freed). This is some improvement compared to the previous constant increase, but there is still a problem. The relevant logs were sent via e-mail.
It looks worse.
@lluu131, we've received the data. Thank you for your help.
@lluu131, we've been investigating some unusual concurrency patterns used in the DNS-over-QUIC code, and found that the dependency responsible for handling the QUIC protocol probably contains the bug (quic-go/quic-go#4303). Anyway, we should come up with some workaround in the meantime.
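To illustrate the general failure mode (a hedged sketch, not dnsproxy's or quic-go's actual code): if every query against an unreachable upstream spawns a dial with no deadline, those goroutines and their buffers accumulate until the attempts time out at the OS level, which matches the "memory grows while the upstream is unreachable" symptom. Bounding each attempt with a context keeps them from piling up:

package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// leakyQuery shows the problematic pattern: each query spawns a goroutine
// that dials the upstream with no deadline, so while the upstream is down
// the dials may block for the OS connect timeout and goroutines pile up.
func leakyQuery(addr string) {
	go func() {
		conn, err := net.Dial("tcp", addr) // no deadline; may block for minutes
		if err != nil {
			return
		}
		conn.Close()
	}()
}

// boundedQuery caps each attempt with a context timeout, so a dead upstream
// fails fast and the goroutine is reclaimed promptly.
func boundedQuery(ctx context.Context, addr string) error {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()

	var d net.Dialer
	conn, err := d.DialContext(ctx, "tcp", addr)
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	// 192.0.2.1 is a TEST-NET address, so this dial is expected to fail
	// within the 5-second bound rather than hang.
	err := boundedQuery(context.Background(), "192.0.2.1:853")
	fmt.Println("bounded dial result:", err)
}

In a goroutine profile like the one collected above, this kind of pile-up shows up as the same dial or handshake frame repeated across hundreds of goroutine stacks.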
It consumed 10 GB after running for 66 days.
This machine only runs my DNS servers.
The config is:
[Unit]
Description=dnsproxy Service
Requires=network.target
After=network.target
[Service]
Type=simple
User=jeremie
Restart=always
AmbientCapabilities=CAP_NET_BIND_SERVICE
ExecStart=/usr/bin/dnsproxy -l 0.0.0.0 -p 5353 \
--all-servers \
-f tls://1.1.1.1 \
-u sdns://AgcAAAAAAAAABzEuMC4wLjGgENk8mGSlIfMGXMOlIlCcKvq7AVgcrZxtjon911-ep0cg63Ul-I8NlFj4GplQGb_TTLiczclX57DvMV8Q-JdjgRgSZG5zLmNsb3VkZmxhcmUuY29tCi9kbnMtcXVlcnk \
-f https://1.1.1.1/dns-query \
-u https://1.0.0.1/dns-query \
-u https://dns.google/dns-query \
-u https://1.0.0.1/dns-query \
-u https://mozilla.cloudflare-dns.com/dns-query \
-u https://dns11.quad9.net/dns-query \
-u https://dns10.quad9.net/dns-query \
-u https://dns.quad9.net/dns-query \
--http3 \
--bootstrap=1.0.0.1:53
[Install]
WantedBy=multi-user.target
@EugeneOne1
I've observed a memory leak issue in my home environment. I am using DoH. When I configured a wrong DoH API URL, the system reported an out-of-memory error for the dnsproxy process.
I am using the docker version of adguard/dnsproxy.
Update: There are many query errors in my log. It seems that when an upstream query error occurs (such as when the network is temporarily unavailable), memory keeps increasing until the process runs out of memory.
[Memory usage graphs: QUIC upstream vs. UDP upstream]
For the same configuration, the memory footprint of the QUIC upstream is very high and constantly increasing, while the UDP upstream stays very low; both are running without caching.