elastic / apm-server

https://www.elastic.co/guide/en/apm/guide/current/index.html

Slow response of APM Server not being reflected in Stack Monitoring UI #7504

Open simitt opened 2 years ago

simitt commented 2 years ago

With the switch to the new ES output handling, the APM Server behaves differently when overloaded. Instead of returning 503 - Queue is full errors, it responds much more slowly to APM agent requests, which eventually causes APM agents to close their connections and log errors. The APM Server itself does not log anything indicating that it is overloaded, and it doesn't record error metrics. The Stack Monitoring UI gives no indication that the server is overloaded either, apart from showing higher memory usage (because the requests are buffered in memory).
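To illustrate the difference in behavior described above, here is a minimal sketch in Python. It is not APM Server code (the server is written in Go); the function names, the 202 success status, and the timeout standing in for an agent giving up are all assumptions made for illustration. It contrasts rejecting a request when a bounded queue is full with blocking until there is room.

```python
import queue
import time


def enqueue_reject(q, event):
    """Fail fast: a full queue is reported to the caller right away."""
    try:
        q.put_nowait(event)
        return 202  # assumed "accepted" status, for illustration only
    except queue.Full:
        return 503  # the overload is visible as an explicit error


def enqueue_block(q, event, agent_timeout):
    """Backpressure: wait for room, so the caller only sees added latency.

    agent_timeout is a hypothetical stand-in for the point at which an
    agent gives up and closes its connection.
    """
    start = time.monotonic()
    try:
        q.put(event, timeout=agent_timeout)
        status = 202
    except queue.Full:
        status = None  # no response was sent; the agent gave up first
    return status, time.monotonic() - start
```

In the first variant an overloaded server produces a countable error response. In the second, the only server-side symptoms are request latency and the memory held by waiting requests, which matches the lack of overload indicators reported here.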

Parts that should be improved:

simitt commented 2 years ago

Scope for 8.3: