aws / amazon-ssm-agent

An agent to enable remote management of your EC2 instances, on-premises servers, or virtual machines (VMs).
https://aws.amazon.com/systems-manager/
Apache License 2.0

High CPU usage on platform version: "64bit Windows Server 2016 running IIS 10.0 2.11.5" #525

Closed: dazbradbury closed this issue 1 year ago

dazbradbury commented 1 year ago

We seem to have run into an issue where on the latest platform version, the amazon-ssm-agent process consumes an unreasonably large amount of CPU. This happens intermittently, but when it does it causes service degradation on the web worker it's happening to:

[Two screenshots; images not preserved]

We have rolled back to "64bit Windows Server 2016 running IIS 10.0 2.11.4", and the issue hasn't re-surfaced (yet). This is a production application, so providing further information is non-trivial, but it does seem there is an issue with this process in conjunction with the latest platform version. If this isn't something AWS are already aware of, and do need further debug data, let me know and I'll do my best to supply it.

armnejad commented 1 year ago

What is the AMI ID that was being used before you rolled back? Are you able to provide the logs from the time the spike is happening? https://docs.aws.amazon.com/systems-manager/latest/userguide/sysman-agent-logs.html

dazbradbury commented 1 year ago

The AMI is: ami-0f0a57933c23aec26

However, we've now rebuilt the entire environment and rolled forward to the latest platform version (AMI matching above), and the issue seems to have stopped for now. I'll have to keep an eye on it and see if this re-surfaces, but without knowing exactly how amazon-ssm-agent interacts with the rest of the Elastic Beanstalk stack, it's going to be hard to know the exact cause.

I'm afraid I don't have the agent logs, as the instances were terminated to prevent disruption in our production environment. I'll close this issue for now and re-open if / when it re-occurs.
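
For anyone who hits the same problem, a minimal sketch of how the agent logs could be copied off an instance before it is terminated. It assumes the documented default Windows log location (%PROGRAMDATA%\Amazon\SSM\Logs) and that the files are readable while the agent is running. The function names and the archive naming scheme are hypothetical, not part of the agent or any AWS tooling.

```python
import os
import zipfile
from datetime import datetime, timezone
from pathlib import Path


def default_log_dir() -> Path:
    # Assumed default SSM Agent log directory on Windows; adjust if the
    # agent was configured to log elsewhere.
    program_data = os.environ.get("PROGRAMDATA", r"C:\ProgramData")
    return Path(program_data) / "Amazon" / "SSM" / "Logs"


def archive_logs(log_dir: Path, dest_dir: Path) -> Path:
    """Zip every file under log_dir into a timestamped archive in dest_dir."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = dest_dir / f"ssm-agent-logs-{stamp}.zip"  # hypothetical name
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(log_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the log directory.
                zf.write(path, path.relative_to(log_dir))
    return archive
```

Calling archive_logs(default_log_dir(), Path("C:/Temp/ssm-log-backup")) on the affected instance would produce a single zip that can be uploaded elsewhere before the instance is terminated. The destination should sit outside the log directory so the archive does not include itself.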