Open keithduncan opened 3 years ago
We're also experiencing this when instances running in our elastic stack run low on memory. The affected instances end up in a zombie state: they occupy capacity in our autoscaling group without processing any jobs. Sometimes the instances are cleaned up automatically, but we've seen instances stay alive for close to a week with no agents.
While we can reduce the number of agents per instance and/or increase the instance class, we'd like the agents to stay alive in some form, so bumping this issue! 😄
We want the agent process to remain alive, but the bootstrap and any processes under it in the systemd service should be candidates for being killed under memory pressure 🤔
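To illustrate the idea, here is a minimal sketch, assuming Linux, of a parent process that raises the OOM score of the children it spawns so the kernel prefers to kill them first. It is not how the agent works today; `run_job` and `_prefer_for_oom_kill` are hypothetical names, and the default score of 500 is an arbitrary illustrative value. It relies on `/proc/self/oom_score_adj` (range -1000 to 1000), which an unprivileged process may raise and which is inherited by anything the child goes on to spawn.

```python
import subprocess

# Linux-only: per-process knob that biases the kernel OOM killer.
OOM_SCORE_ADJ_PATH = "/proc/self/oom_score_adj"


def _prefer_for_oom_kill(score):
    """Build a hook that raises the calling process's oom_score_adj."""
    def apply():
        # Runs in the child after fork() and before exec(), so only the
        # child (and its descendants) get the higher score.
        with open(OOM_SCORE_ADJ_PATH, "w") as f:
            f.write(str(score))
    return apply


def run_job(argv, score=500):
    """Run argv as a child that the OOM killer prefers over the parent."""
    # Note: preexec_fn is not safe to use from multi-threaded parents.
    return subprocess.run(
        argv,
        preexec_fn=_prefer_for_oom_kill(score),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # The child reports its own adjusted score; the parent's is unchanged.
    result = run_job(["cat", OOM_SCORE_ADJ_PATH])
    print("child oom_score_adj:", result.stdout.strip())
```

At the service level, systemd exposes the same knob through the `OOMScoreAdjust=` directive, though that applies to every process in the unit rather than only to the children, so the split would probably still need to happen where the bootstrap is spawned.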