Slow response of APM Server not being reflected in Stack Monitoring UI

With the switch to the new ES output handling, APM Server behaves differently when it is overloaded. Instead of returning 503 - Queue is full errors, it responds much more slowly to APM agent requests, which eventually causes the agents to close their connections and log errors. APM Server itself neither logs anything indicating that it is overloaded nor records error metrics. The Stack Monitoring UI gives no indication of the overload either, apart from showing higher memory usage (because the requests are buffered in memory).

Parts that should be improved:

make the maximum number of requests configurable (currently hardcoded to 10); see https://github.com/elastic/apm-server/issues/7719
allow customizing the YAML box for the Elastic Cloud output via Fleet; since 8.0 a dedicated cloud output is configured, which avoids public traffic, and its configuration is frozen
record metrics indicating that more events are being processed than can be ingested into ES; for example, track how many available channels are created and when a new channel becomes available for processing events
add log warnings when events are queued up
add information to the Stack Monitoring UI, or ship pre-built monitoring visualizations

elastic/apm-server #7504