aws / amazon-ssm-agent

An agent to enable remote management of your EC2 instances, on-premises servers, or virtual machines (VMs).
https://aws.amazon.com/systems-manager/
Apache License 2.0

Completed command invocation fails to report back to SSM #316

Open stekern opened 4 years ago

stekern commented 4 years ago

Background

I have a set of SSM associations that are automatically run on newly registered servers, where:

I need the shell script to run last, and I've implemented a simple shell mechanism that waits until all other commands have completed before continuing. (Not sure if my shell script or this waiting mechanism is relevant to the issue I'm having -- associations backed by AWS-managed documents also have the same issue.)

Issue

When registering a new server with SSM, some of the Run Commands triggered by these associations fail to report success or failure to AWS SSM. Even though I can verify on the server that the command actually finished (stderr and stdout contain the expected output), the Run Commands are listed with a status of DeliveryTimedOut. The issue seems to occur sporadically -- sometimes all the Run Commands run successfully on a newly registered server.

My first thought is that this is related to the SSM agent being restarted as part of AWS-UpdateSSMAgent, and that the restart affects other commands running on the server. Or perhaps there is a bug in the version of the SSM agent that I'm updating from?

Log

There's a specific error message I'm getting when this happens. Contents of /var/log/amazon/ssm/errors.log:

2020-10-21 10:37:20 ERROR [GetDocumentState @ docmanager.go.121] [ssm-agent-worker] [MessagingDeliveryService] [EngineProcessor] encountered error with message invalid character '"' after top-level value while reading Interim state of command from file - <REDACTED-COMMAND-ID-1>
2020-10-21 10:37:20 ERROR [ssm-agent-worker] [OfflineService] [EngineProcessor] encountered error with message invalid character '"' after top-level value while reading Interim state of command from file - <REDACTED-COMMAND-ID-2>
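
For what it's worth, the wording of this error matches what Go's encoding/json returns when a complete JSON value is followed by extra non-whitespace bytes, which would suggest the interim state file had stray data after the closing brace. I don't know how the file ended up in that state. Below is a minimal sketch that reproduces the same message; the interimState type, the parseState helper, and the sample file contents are hypothetical and not taken from the agent's source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// interimState is a hypothetical stand-in for the command state the agent
// persists to disk; the real structure is not shown in this issue.
type interimState struct {
	Status string `json:"status"`
}

// parseState decodes a single JSON document. json.Unmarshal rejects the
// input if anything other than whitespace follows the top-level value.
func parseState(data []byte) (interimState, error) {
	var s interimState
	err := json.Unmarshal(data, &s)
	return s, err
}

func main() {
	// Assumed file contents: one clean document, and one with a stray
	// `"}` left over after the closing brace.
	clean := []byte(`{"status":"Success"}`)
	corrupt := []byte(`{"status":"Success"}"}`)

	for _, data := range [][]byte{clean, corrupt} {
		if _, err := parseState(data); err != nil {
			fmt.Printf("%q: %v\n", data, err)
			continue
		}
		fmt.Printf("%q: ok\n", data)
	}
}
```

The second input fails with `invalid character '"' after top-level value`, the same text as in errors.log. That is consistent with, but does not prove, the state file being written more than once or only partially overwritten around the time of the agent restart.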

gianniLesl commented 3 years ago

Can you please open a support ticket in the AWS Console and attach the full errors.log and amazon-ssm-agent.log so that we can see the context of this error?

https://docs.aws.amazon.com/systems-manager/latest/userguide/sysman-agent-logs.html
https://docs.aws.amazon.com/awssupport/latest/user/case-management.html#creating-a-support-case