elastic / apm-server

https://www.elastic.co/guide/en/apm/guide/current/index.html

Set GOMEMLIMIT when running in a containerized environment #14475

Open marclop opened 3 weeks ago

marclop commented 3 weeks ago

Description

How should we go about setting a soft memory limit for apm-server (https://pkg.go.dev/runtime/debug#SetMemoryLimit) based on a percentage of total memory?

@1pkg brought up that we should probably be setting GOMEMLIMIT to 85-90% of the container memory limit (or of the detected total memory when not running in a containerized environment). Since the limit is soft, the Go runtime would garbage-collect more aggressively as memory usage approaches GOMEMLIMIT, which should make it less likely that APM Server runs out of memory and gets OOM-killed.

If GOMEMLIMIT is set, then we should respect the setting and not set it ourselves in the code.

Arguably, this is something that elastic-agent could do, but if we do it in APM Server, then this limit would apply to the process regardless of the mode/manner it is run.
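
The proposal above (honor an existing GOMEMLIMIT, otherwise set the soft limit to a percentage of the container memory limit) could be sketched roughly as follows. This is a minimal illustration and not APM Server's actual code: the helper names `parseCgroupLimit`, `softLimit` and `maybeSetMemoryLimit` are hypothetical, it assumes cgroup v2 (where the limit is exposed at `/sys/fs/cgroup/memory.max`), and it leaves out the non-containerized fallback to detected total memory.

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
	"strconv"
	"strings"
)

// cgroupV2MemoryMax is where cgroup v2 exposes the memory limit.
// Assumption: cgroup v1 and non-Linux hosts are not handled in this sketch.
const cgroupV2MemoryMax = "/sys/fs/cgroup/memory.max"

// parseCgroupLimit parses the contents of memory.max. It reports ok=false
// when the cgroup has no limit ("max") or the value cannot be parsed.
func parseCgroupLimit(raw string) (limit int64, ok bool) {
	s := strings.TrimSpace(raw)
	if s == "" || s == "max" {
		return 0, false
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil || n <= 0 {
		return 0, false
	}
	return n, true
}

// softLimit returns the given fraction (e.g. 0.9) of total, in bytes.
func softLimit(total int64, ratio float64) int64 {
	return int64(float64(total) * ratio)
}

// maybeSetMemoryLimit sets the runtime soft memory limit to ratio times the
// cgroup memory limit. It does nothing if GOMEMLIMIT is already present in
// the environment or if no cgroup limit can be determined.
func maybeSetMemoryLimit(ratio float64) (int64, bool) {
	if _, set := os.LookupEnv("GOMEMLIMIT"); set {
		return 0, false // respect the operator's explicit setting
	}
	raw, err := os.ReadFile(cgroupV2MemoryMax)
	if err != nil {
		return 0, false
	}
	total, ok := parseCgroupLimit(string(raw))
	if !ok {
		return 0, false
	}
	limit := softLimit(total, ratio)
	debug.SetMemoryLimit(limit)
	return limit, true
}

func main() {
	if limit, ok := maybeSetMemoryLimit(0.9); ok {
		fmt.Printf("soft memory limit set to %d bytes\n", limit)
	} else {
		fmt.Println("soft memory limit left unchanged")
	}
}
```

Because the limit is applied in-process, it would take effect however the binary is started, which is the point made above about doing this in APM Server rather than in elastic-agent.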

simitt commented 1 week ago

@marclop how does this play together with the Elastic Agent potentially running multiple other processes? The GOMEMLIMIT could still be set too high in such cases.

marclop commented 1 week ago

@simitt we'd be setting the memory limit in-process, so it'd only apply to APM Server; it doesn't take the other processes run by elastic-agent into consideration. Net-net, it's still better than what we have at the moment.

The fact that the elastic-agent has multiple processes running in a single cgroup is problematic of course, and we can only do so much about it.

We could be very conservative and set the limit to ~80-85% of the actual container's memory, but once we get close to that threshold the GC will start being very aggressive and force more and more GC cycles to keep the memory in check, so it may be detrimental in cases where there's high memory utilization. Perhaps a 90% limit would be preferable.
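
To make the trade-off between these percentages concrete, here is a small sketch that computes the soft limit and the remaining headroom for a container. The 1 GiB container size and the helper names `limitAt` and `headroom` are assumptions chosen purely for illustration.

```go
package main

import "fmt"

// limitAt returns pct percent of total, in bytes (integer arithmetic).
func limitAt(total, pct int64) int64 {
	return total * pct / 100
}

// headroom returns how many bytes remain between the soft limit and the
// container's hard limit.
func headroom(total, pct int64) int64 {
	return total - limitAt(total, pct)
}

func main() {
	const containerLimit int64 = 1 << 30 // hypothetical 1 GiB container

	for _, pct := range []int64{80, 85, 90} {
		fmt.Printf("%d%%: soft limit %d bytes, headroom %d bytes\n",
			pct, limitAt(containerLimit, pct), headroom(containerLimit, pct))
	}
}
```

For a 1 GiB container, the headroom works out to roughly 205 MiB at 80%, 154 MiB at 85% and 102 MiB at 90%. A lower percentage leaves more room for anything else in the same cgroup, but means the GC starts working harder earlier.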