Ouranosinc / Magpie

AuthN/AuthZ services
https://pavics-magpie.readthedocs.io
Apache License 2.0

[BUG] Possible memory leak with Twitcher/Magpie #505

Open tlvu opened 2 years ago

tlvu commented 2 years ago

Describe the bug

Twitcher is constantly using a lot of memory and CPU.

$ docker stats twitcher thredds
CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS    
4612ca2a1be0        twitcher            87.18%              9.385GiB / 125.8GiB   7.46%               1.04TB / 1.04TB     0B / 0B             51
81aaf9dcbdc8        thredds             71.79%              7.159GiB / 125.8GiB   5.69%               3.45GB / 519GB      0B / 0B             92

We have had a lot of Thredds activity lately, so it's normal for Thredds CPU and memory consumption to increase, since it has a caching feature.

But Twitcher saw a proportional increase, which is puzzling to me, and it stays like this during idle time.

To Reproduce

Steps to reproduce the behavior:

  1. PAVICS deployed at this commit from our production fork of birdhouse-deploy, https://github.com/Ouranosinc/birdhouse-deploy/commit/76dd3c86e57d6a96ca84d2124f3b019fe278645a, which triples Thredds memory.

Expected behavior

Twitcher CPU and memory would increase during Thredds transfer/activity but should go down during idle time.
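To check whether memory really stays high during idle time, the `MEM USAGE / LIMIT` column from the stats output above can be parsed and compared across samples. The following is only a minimal sketch, assuming the column format shown above; `parse_size` and `mem_percent` are hypothetical helper names, not part of Docker, Twitcher, or Magpie.

```python
import re

# Unit multipliers for the size strings printed in the stats output
# (binary units for memory, decimal units for network I/O).
_UNITS = {
    "B": 1,
    "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4,
    "kB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4,
}


def parse_size(text):
    """Convert a string such as '9.385GiB' into a number of bytes."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([A-Za-z]+)\s*", text)
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def mem_percent(column):
    """Compute the memory percentage from a 'USED / LIMIT' column."""
    used, limit = (parse_size(part) for part in column.split("/"))
    return 100.0 * used / limit


if __name__ == "__main__":
    # Values copied from the stats output in this issue.
    print(round(mem_percent("9.385GiB / 125.8GiB"), 2))  # twitcher
    print(round(mem_percent("7.159GiB / 125.8GiB"), 2))  # thredds
```

Sampling these values once while Thredds is busy and again after a quiet period would show whether Twitcher's usage drops back or stays at the same level.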


fmigneault commented 2 years ago

Should probably be moved to https://github.com/bird-house/twitcher/ repo. I don't think this has anything to do with Magpie.

tlvu commented 2 years ago

Opened a corresponding bug on the Twitcher side: https://github.com/bird-house/twitcher/issues/113.

I originally opened it here since the Magpie adapter is known to possibly have an impact on Twitcher performance, as seen with the caching feature.

fmigneault commented 2 years ago

@tlvu Maybe worth a shot to investigate if this is caused by caching requests. If the duration of caching is adjusted to a lower value or is disabled entirely, a significant drop in memory could indicate that cached responses remain active although out-of-date, until the next request invalidates them. (https://github.com/bird-house/birdhouse-deploy/blob/master/birdhouse/config/twitcher/twitcher.ini.template#L44-L45)
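To illustrate the hypothesis above, here is a toy sketch of a cache that only drops an expired entry when that same key is requested again. It is not the actual cache implementation used by Twitcher or Magpie; `LazyExpiryCache`, its `ttl` parameter, and the example keys are all hypothetical and only show how out-of-date entries could keep occupying memory.

```python
import time


class LazyExpiryCache:
    """Toy cache: expired entries are removed only when their key is read again."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl          # lifetime of an entry, in seconds
        self.clock = clock      # injectable clock, to make the demo deterministic
        self._store = {}        # key -> (expires_at, value)

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            # The stale entry is only evicted here, on access.
            del self._store[key]
            return None
        return value

    def __len__(self):
        return len(self._store)


if __name__ == "__main__":
    now = [0.0]
    cache = LazyExpiryCache(ttl=20, clock=lambda: now[0])
    for i in range(1000):
        cache.set(f"/thredds/file-{i}", b"x" * 1024)

    now[0] = 60.0                 # well past the expiry of every entry
    print(len(cache))             # 1000: nothing was freed while idle
    cache.get("/thredds/file-0")  # touching one key evicts only that key
    print(len(cache))             # 999
```

With such a design, lowering the expiry or disabling caching would make the retained memory drop, which is the signal suggested above.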

For the CPU use though, I have no idea what could be the cause.

tlvu commented 2 years ago

> Maybe worth a shot to investigate if this is caused by caching requests.

On Ouranos production, we still have the caching feature disabled.

FYI, you can track all config diff between default birdhouse-deploy and our production deployment by doing the diff between the repos: https://github.com/bird-house/birdhouse-deploy/compare/master...Ouranosinc:master

fmigneault commented 2 years ago

In this case, I further believe the issue is on the Twitcher side. Once access verification is obtained from Magpie, the adapter returns and lets Twitcher handle the request by itself.