kresike closed this issue 2 years ago
Hello @kresike, can you share your caddy configuration please? And if possible a minimal reproducible repository?
Thank you!
@darkweak I compiled caddy using xcaddy, the command is in the original report, I have no separate repository for this. Also in the original report there is an attached bugreport.zip containing all the config files. Let me know if you need anything else. Thank you!
It could be the coalescing layer that causes this memory consumption/high resident size. I'll investigate in this direction and try to reduce it. It may be related to the ristretto memory leak bugs.
@darkweak let me know if I can help in any way. Could you elaborate on "coalescing layer"? Maybe I could look at it myself and come up with something that might help.
The coalescing layer determines and stores the requests that cannot be coalesced. If the same request is sent to the server multiple times concurrently, only one goes to the backend and the same response is used for all pending requests. But when you reload the configuration with caddy, this layer's storage is not cleared, so I'm working on adding a reset method to it and using caddy's cleanup hook to trigger it on configuration reload.
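To illustrate the coalescing behavior described above, here is a minimal Go sketch (not Souin's actual implementation; all names are invented): when several identical requests are in flight, only the first runs the backend call, and the others wait and share its result.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Coalescer deduplicates concurrent requests for the same key:
// the first caller runs fn, later callers wait and reuse its result.
// Illustrative sketch only, not Souin's real coalescing layer.
type Coalescer struct {
	mu    sync.Mutex
	calls map[string]*call
}

type call struct {
	wg  sync.WaitGroup
	val string
}

func NewCoalescer() *Coalescer {
	return &Coalescer{calls: make(map[string]*call)}
}

func (c *Coalescer) Do(key string, fn func() string) string {
	c.mu.Lock()
	if existing, ok := c.calls[key]; ok {
		c.mu.Unlock()
		existing.wg.Wait() // a request is already in flight; share its result
		return existing.val
	}
	cl := &call{}
	cl.wg.Add(1)
	c.calls[key] = cl
	c.mu.Unlock()

	cl.val = fn() // only this caller hits the backend
	cl.wg.Done()

	c.mu.Lock()
	delete(c.calls, key) // clear the entry so state doesn't accumulate
	c.mu.Unlock()
	return cl.val
}

func main() {
	c := NewCoalescer()
	release := make(chan struct{})
	inFlight := make(chan struct{})
	done := make(chan string)

	go func() {
		done <- c.Do("GET /", func() string {
			close(inFlight) // first request has claimed the key
			<-release       // simulate a slow backend
			return "origin response"
		})
	}()

	<-inFlight
	go func() {
		// Duplicate request: should wait for the in-flight one and
		// reuse its result instead of hitting the backend again.
		done <- c.Do("GET /", func() string { return "duplicate backend call" })
	}()

	time.Sleep(100 * time.Millisecond) // let the duplicate reach Wait
	close(release)
	fmt.Println(<-done, "/", <-done)
}
```

The `delete` at the end of `Do` is the state the issue is about: if such per-request bookkeeping survives a configuration reload, it accumulates instead of being reclaimed.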
I see. Let me know if you have a working version, I'll be happy to test it.
This commit https://github.com/darkweak/souin/pull/220/commits/553c1c75ca6a21b31c488472f78067903df122e3 should fix that. I reloaded the config more than 20 times and with that the memory seems to be stable.
You can now try with --with github.com/darkweak/souin/plugins/caddy to get the memory leak fix.
Didn't have too much time to test this, but at first glance it looks great. Will do some more intensive testing and report back.
I've done some more tests. The memory leak issue seems to be fixed: after more than 500 sites, memory usage remained within a few megabytes of the original memory usage.
Now there is a linear slowdown in provisioning sites. At 10-20 sites the config is reloaded in 2-3 seconds on my desktop machine but at ~500 sites it takes more than 40 seconds to reload the configuration.
I've attached a profile I made. Seems like ristretto is clearing a lot of stuff, then the garbage collector runs a lot.
For each site, caddy will run the following workflow:
I thought about this some more, and it seems to work this way by design: whenever the caddy config changes, caddy starts a new instance internally and shuts down the old one once the new one is up. The original problem has been solved, so I'm closing the issue.
Thanks for fixing this @darkweak !
I'm trying to set up caddy as a caching reverse proxy for several sites, and I will be using the config API to add, modify and remove sites as needed. Everything works as expected, except that when I configure new sites by adding new routes to the first http server, after say 20 sites the resident set size is at around 3G, which seems like a lot. If I download the whole config using the API and load it into a freshly started caddy instance, memory usage remains normal at around 350MB, and the resident set size is just a bit higher than the used memory.
I've been trying to get more information using pprof, I have some data that might help narrow down the issue, but I think this part is a bit out of my league at this point.
I compiled caddy using the following command:
xcaddy build --with github.com/caddyserver/cache-handler --with github.com/corazawaf/coraza-caddy --with github.com/porech/caddy-maxmind-geolocation --with github.com/caddyserver/transform-encoder --with github.com/imgk/caddy-pprof
Caddy version is:
v2.5.1 h1:bAWwslD1jNeCzDa+jDCNwb8M3UJ2tPa8UZFFzPVmGKs=
The initial configuration is in the attached zip file, named empty_nosites.config
By configuring one site I mean (the onesite.config file is also in the attached zip):
curl -X POST -H "Content-Type: application/json" -d @onesite.config http://192.168.6.31:2019/config/apps/http/servers/srv0/routes/
The final configuration is in cache_20_sites.config. I got to it by doing the above operation 20 times, with different hostnames.
I also included an svg output from go tool pprof showing alloc_space after adding the 20 sites one by one, named 20_sites_onebyone.svg
After saving the configuration using:
curl http://192.168.6.31:2019/config/ > cache_20_sites.config
I restarted caddy and posted the whole config right back:
curl -X POST -H "Content-Type: application/json" -d @cache_20_sites.config http://192.168.6.31:2019/config/
Then I took another snapshot with pprof. This can be found in the file 20_sites_allatonce.svg
bugreport.zip
Could someone please help me with this?