AdguardTeam / AdGuardHome

Network-wide ads & trackers blocking DNS server
https://adguard.com/adguard-home.html
GNU General Public License v3.0

With multiple upstream servers, AdGuard Home's memory usage gradually increases until the process goes down #6438

Closed: 6aink closed this issue 7 months ago

6aink commented 7 months ago

Prerequisites

Platform (OS and CPU architecture)

Linux, AMD64 (aka x86_64)

Installation

GitHub releases or script from README

Setup

On one machine

AdGuard Home version

v0.108.0-a.762+388583ce

Action

With multiple upstream servers, AdGuard Home's memory usage gradually increases until the process goes down

Expected result

Memory usage stops rising and AdGuard Home can be used normally

Actual result

memory explosion

Additional information and/or screenshots

No response

ainar-g commented 7 months ago

Unfortunately, we cannot reproduce that. Can you provide the upstreams you're using?

6aink commented 7 months ago

You can try to reproduce it this way: turn off the cache, configure any number of upstream servers, and send requests at a rate of 1000 qps. You will then see significant memory growth, and the added memory is not reclaimed.

6aink commented 7 months ago

hello?

ainar-g commented 7 months ago

We do not see this behaviour in our tests. The only things that grow are the query log and statistics buffers and the safe browsing/parental control hash caches.

If you do feel like there could be a resource leak, please collect the memory profile from your AGH by setting http.pprof.enabled to true and saving the contents of http://127.0.0.1:6060/debug/pprof/heap?debug=1.
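For readers who want to script the collection step, here is a minimal Go sketch that downloads the heap profile from the address quoted above and saves it to a file. The helper names `profileURL` and `saveProfile` and the output file name `heap.txt` are assumptions made for this illustration; only the URL comes from the comment above, and the sketch assumes pprof is already enabled and listening on 127.0.0.1:6060.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// profileURL builds the pprof endpoint URL for a listen address and a
// profile name such as "heap". (Hypothetical helper for this sketch.)
func profileURL(addr, profile string) string {
	return fmt.Sprintf("http://%s/debug/pprof/%s?debug=1", addr, profile)
}

// saveProfile downloads url and writes the response body to path.
// (Hypothetical helper for this sketch.)
func saveProfile(url, path string) error {
	client := &http.Client{Timeout: 30 * time.Second}

	resp, err := client.Get(url)
	if err != nil {
		return fmt.Errorf("fetching profile: %w", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}

	f, err := os.Create(path)
	if err != nil {
		return fmt.Errorf("creating file: %w", err)
	}
	defer f.Close()

	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	// heap.txt is an arbitrary output name chosen for this sketch.
	url := profileURL("127.0.0.1:6060", "heap")
	if err := saveProfile(url, "heap.txt"); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Running it a few minutes apart under load and comparing the saved files is one way to see which counters keep growing.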

6aink commented 7 months ago

runtime.MemStats

Alloc = 216394872

TotalAlloc = 2811495504

Sys = 782182670

Lookups = 0

Mallocs = 39648191

Frees = 36940462

HeapAlloc = 216394872

HeapSys = 388268032

HeapIdle = 90390528

HeapInuse = 297877504

HeapReleased = 2097152

HeapObjects = 2707729

Stack = 358318080 / 358318080

MSpan = 7188160 / 8257920

MCache = 1200 / 15600

BuckHashSys = 1837175

GCSys = 24128040

OtherSys = 1357823

NextGC = 391167648

LastGC = 1700676298283323144

PauseNs = [39748 27967 49049 28657 29758 31672 42011 38645 30012 30797 32892 32828 43832 38935 35057 29200 41892 42477 63468 39886 39858 39680 47910 29063 49833 71065 91834 107400 103858 110142 71937 110469 95963 79019 80760 91158 80193 66975 83620 78068 67919 96443 91057 104751 93846 89019 113233 72543 124203 92168 115352 116745 134579 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

PauseEnd = [1700675796695587184 1700675796747277915 1700675796787929027 1700675796807653751 1700675796828772414 1700675796850749277 1700675796874240303 1700675796907065875 1700675796931342799 1700675796958458136 1700675796989055779 1700675797018145234 1700675797058403381 1700675797101146541 1700675797150527293 1700675797192579596 1700675797257559350 1700675797318717358 1700675797374262774 1700675797421832348 1700675797522881587 1700675797604555314 1700675797685194817 1700675797739868543 1700675803808033752 1700675815367222304 1700675820705851013 1700675826505116475 1700675832811396055 1700675836688465098 1700675844812044060 1700675854064004860 1700675860470253690 1700675866781841784 1700675878274867695 1700675891158600218 1700675903230067341 1700675925705021025 1700675948758136936 1700675969923093890 1700675998424343989 1700676017819147238 1700676046622744399 1700676073065595547 1700676086074561145 1700676100099244996 1700676117641973717 1700676145744206401 1700676162744640983 1700676191590924867 1700676223859415298 1700676268796767167 1700676298283323144 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

NumGC = 53

NumForcedGC = 1

GCCPUFraction = 0.004390960363539982

DebugGC = false

MaxRSS = 795369472

6aink commented 7 months ago

please help me

6aink commented 7 months ago

runtime.MemStats

Alloc = 333695192

TotalAlloc = 3523309048

Sys = 967740718

Lookups = 0

Mallocs = 50305975

Frees = 45773050

HeapAlloc = 333695192

HeapSys = 452755456

HeapIdle = 79405056

HeapInuse = 373350400

HeapReleased = 2097152

HeapObjects = 4532925

Stack = 474185728 / 474185728

MSpan = 9128480 / 10134720

MCache = 1200 / 15600

BuckHashSys = 1869007

GCSys = 27065360

OtherSys = 1714847

NextGC = 454728960

LastGC = 1700676458916264130

PauseNs = [39748 27967 49049 28657 29758 31672 42011 38645 30012 30797 32892 32828 43832 38935 35057 29200 41892 42477 63468 39886 39858 39680 47910 29063 49833 71065 91834 107400 103858 110142 71937 110469 95963 79019 80760 91158 80193 66975 83620 78068 67919 96443 91057 104751 93846 89019 113233 72543 124203 92168 115352 116745 134579 80615 64255 97551 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

PauseEnd = [1700675796695587184 1700675796747277915 1700675796787929027 1700675796807653751 1700675796828772414 1700675796850749277 1700675796874240303 1700675796907065875 1700675796931342799 1700675796958458136 1700675796989055779 1700675797018145234 1700675797058403381 1700675797101146541 1700675797150527293 1700675797192579596 1700675797257559350 1700675797318717358 1700675797374262774 1700675797421832348 1700675797522881587 1700675797604555314 1700675797685194817 1700675797739868543 1700675803808033752 1700675815367222304 1700675820705851013 1700675826505116475 1700675832811396055 1700675836688465098 1700675844812044060 1700675854064004860 1700675860470253690 1700675866781841784 1700675878274867695 1700675891158600218 1700675903230067341 1700675925705021025 1700675948758136936 1700675969923093890 1700675998424343989 1700676017819147238 1700676046622744399 1700676073065595547 1700676086074561145 1700676100099244996 1700676117641973717 1700676145744206401 1700676162744640983 1700676191590924867 1700676223859415298 1700676268796767167 1700676298283323144 1700676341853772182 1700676380939222009 1700676458916264130 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

NumGC = 56

NumForcedGC = 1

GCCPUFraction = 0.004278864523484097

DebugGC = false

MaxRSS = 982650880
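One way to read the two snapshots above is to compare a few counters directly. The Go sketch below parses `Name = value` lines as they appear in this thread and prints the change between the two dumps; `parseMemStats` and the snapshot constants are names invented for this illustration, and only three of the posted values are copied in. The `LastGC` timestamps of the two dumps are about 160 seconds apart.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMemStats reads "Name = value" lines and returns the first integer of
// each value. Lines whose value is not a plain integer (arrays, floats,
// booleans) are skipped. A leading "#" on the name, if present, is dropped.
func parseMemStats(text string) map[string]uint64 {
	stats := make(map[string]uint64)
	for _, line := range strings.Split(text, "\n") {
		name, value, ok := strings.Cut(line, " = ")
		if !ok {
			continue
		}
		fields := strings.Fields(value)
		if len(fields) == 0 {
			continue
		}
		n, err := strconv.ParseUint(fields[0], 10, 64)
		if err != nil {
			continue
		}
		name = strings.TrimSpace(strings.TrimPrefix(strings.TrimSpace(name), "#"))
		stats[name] = n
	}
	return stats
}

// Values copied from the two dumps posted in this thread.
const firstSnapshot = `Sys = 782182670
HeapInuse = 297877504
Stack = 358318080 / 358318080`

const secondSnapshot = `Sys = 967740718
HeapInuse = 373350400
Stack = 474185728 / 474185728`

func main() {
	a, b := parseMemStats(firstSnapshot), parseMemStats(secondSnapshot)
	for _, k := range []string{"Sys", "HeapInuse", "Stack"} {
		fmt.Printf("%-10s %d -> %d (%+d)\n", k, a[k], b[k], int64(b[k])-int64(a[k]))
	}
	fmt.Printf("Stack share of Sys: %.1f%% -> %.1f%%\n",
		100*float64(a["Stack"])/float64(a["Sys"]),
		100*float64(b["Stack"])/float64(b["Sys"]))
}
```

In both dumps the stack memory is close to half of `Sys`, and it grows by roughly as much as the heap does. Stack memory of that size usually points at a very large number of live goroutines rather than at heap objects alone, which fits the goroutine leak confirmed later in the thread.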

6aink commented 7 months ago

https://raw.githubusercontent.com/6aink/render/master/debuglog.txt

6aink commented 7 months ago

This is the debug log file.

6aink commented 7 months ago

Memory leaks occur when parallel requests are used.

6aink commented 7 months ago

Can you help me? Thanks very much

duckxx commented 7 months ago

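This also happens in v0.108.0-b.50, and it may not be caused by multiple upstreams. After the update, the server's 4 GB of memory dropped to only 10% available within half an hour, and CPU usage rose to 50%. I only updated and did not change any data.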

6aink commented 7 months ago

Is there any way to avoid this problem?

duckxx commented 7 months ago

Is there any way to avoid this problem?

This anomaly did not occur in v0.108.0-b.49 or earlier. You can try using v0.108.0-b.49 or an earlier version.

6aink commented 7 months ago

I didn't find a memory leak with a single upstream. I suspect it has something to do with the new upstream feature in b50. Hopefully the AdGuard Home team can fix this memory leak.

ainar-g commented 7 months ago

Thank you for the profiles, we've been able to reproduce the goroutine leak.

@EugeneOne1, this is reproducible using e.g. godnsbench -a udp://127.0.0.1:53 -c 100000 -q '{random}.example.com' with no cache and parallel requests for plain UDP upstreams. Please fix asap.

6aink commented 7 months ago

Okay, thank you very much.

EugeneOne1 commented 7 months ago

@6aink, hello again. We've pushed an edge build that should fix this leak. Could you please try it out?

6aink commented 7 months ago

OK, wait a minute.

6aink commented 7 months ago

I observed it for about ten minutes and the memory shows no upward trend. The problem appears to be solved, thank you.

6aink commented 7 months ago

I see that the memory is rising again, but not very fast. I will try to watch it for another hour.

6aink commented 7 months ago

I tested it for a few hours. The memory stabilized at around 300 MB and there were no problems.

EugeneOne1 commented 7 months ago

@6aink, thanks for testing! We'll include this in the next beta release.

Adgbeta commented 7 months ago

The edge build appears to have fixed the issue so far: after over 4 hours of testing, memory use has not increased.

04bondgoods commented 6 months ago

It seems this issue still persists in version 108.b51. I encountered the following:

  1. AGH memory usage grew to ~2 GB (~50% of the 4 GB total) after roughly 1 month
  2. 1 day after restarting AGH, memory has already grown to ~450 MB (~12% of total memory)
  3. memory usage should normally be ~60 MB