aws / amazon-ssm-agent

An agent to enable remote management of your EC2 instances, on-premises servers, or virtual machines (VMs).
https://aws.amazon.com/systems-manager/
Apache License 2.0

InvalidInstanceId: Not existing InstanceId #381

Closed johan-smits closed 3 years ago

johan-smits commented 3 years ago

I have an old Ubuntu 14.04 server that has been upgraded over the years to 20.04. The node has an old (short) instance ID, and when I try to start the SSM Agent I get this error:

2021-05-26 13:51:34 INFO [ssm-agent-worker] Dial to Core Agent broadcast channel
2021-05-26 13:51:34 INFO [ssm-agent-worker] Start to listen to Core Agent termination channel
2021-05-26 13:51:34 INFO [ssm-agent-worker] Dial to Core Agent broadcast channel
2021-05-26 13:51:34 INFO [ssm-agent-worker] Start to listen to Core Agent health channel
2021-05-26 13:51:34 INFO [ssm-agent-worker] Create new startup processor
2021-05-26 13:51:34 INFO [ssm-agent-worker] [StartupProcessor] Executing startup processor tasks
2021-05-26 13:51:34 INFO [ssm-agent-worker] [StartupProcessor] Write to serial port: Amazon SSM Agent v3.0.529.0 is running
2021-05-26 13:51:34 INFO [ssm-agent-worker] [StartupProcessor] Write to serial port: OsProductName: Ubuntu
2021-05-26 13:51:34 INFO [ssm-agent-worker] [StartupProcessor] Write to serial port: OsVersion: 20.04
2021-05-26 13:51:34 INFO [ssm-agent-worker] Entering SSM Agent hibernate - error occurred in RequestManagedInstanceRoleToken: InvalidInstanceId: Not existing InstanceId mi-0c17d31c2af36e0e6

Note that it shows a long instance ID that is not the node's own ID. The node fails to register in the portal.

I have removed the old package and installed the snap version of the SSM Agent.

gianniLesl commented 3 years ago

The managed instance ID is generated and stored on the server side at registration time, then passed back to the agent to be stored in /opt/aws/ssm/data/, so I'm not sure why the managed instance ID would change on your Ubuntu box without manual interference or re-registration. You can try editing the managed instance ID in /opt/aws/ssm/data/Vault/Store/RegistrationKey to the correct value, but the fact that it has changed makes me question the validity of the rest of that file (which contains other registration info to validate against). Please deregister your server, create a new activation, and register again to get your server back to a normal state.

johan-smits commented 3 years ago

@gianniLesl thanks for the suggestion. When I do this, it gives this error in the log:

Entering SSM Agent hibernate - error occurred in RequestManagedInstanceRoleToken: ValidationException: 2 validation errors detected: Value 'mi-df9b5838' at 'instanceId' failed to satisfy constraint: Member must have length greater than or equal to 20; Value 'mi-df9b5838' at 'instanceId' failed to satisfy constraint: Member must satisfy regular expression pattern: ^mi-\w{17}$
    status code: 400, request id: d731d459-f82c-4a83-954b-f559811104d7
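For reference, the two constraints quoted in that error (length of at least 20 and the pattern ^mi-\w{17}$) can be checked locally. The following is only a minimal sketch in Python: the helper name is made up for illustration, and the only thing taken from the service is the pattern text printed in the error above.

```python
import re

# Pattern copied from the ValidationException text in the log above.
MANAGED_INSTANCE_ID_PATTERN = re.compile(r"mi-\w{17}")


def is_valid_managed_instance_id(instance_id: str) -> bool:
    """Return True if the ID matches the pattern quoted in the error.

    Hypothetical helper for illustration; it only mirrors the regex from
    the log and does not contact Systems Manager.
    """
    # fullmatch anchors both ends, like ^...$ in the quoted pattern.
    return MANAGED_INSTANCE_ID_PATTERN.fullmatch(instance_id) is not None


# The two IDs that appear in this thread.
for candidate in ("mi-0c17d31c2af36e0e6", "mi-df9b5838"):
    print(candidate, len(candidate), is_valid_managed_instance_id(candidate))
```

The short ID has only 8 characters after the mi- prefix, so it fails both the length and the pattern constraint, while the 20-character ID from the first log is well-formed (even though the service reported it as not existing).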

Removing the file makes the node pop up, but I have this in the logs now:

2021-05-28 07:12:41 WARN error while loading server info%!(EXTRA *errors.errorString=Failed to load instance info from vault. Data file of RegistrationKey is missing.)

gianniLesl commented 3 years ago

Are you able to deregister the machine and register with ssm again?

johan-smits commented 3 years ago

@gianniLesl how can I do this?

VishnuKarthikRavindran commented 3 years ago

Hi @johan-smits, please follow steps 4-6 in the link below for registration and deregistration -

johan-smits commented 3 years ago

@VishnuKarthikRavindran The deregistration option is greyed out in the console, and this is the same for all instances.

gianniLesl commented 3 years ago

Create a new activation in Systems Manager, then run the following on your instance:

sudo amazon-ssm-agent -register -code "{activation-code}" -id "{activation-id}" -region "{activation-region}"

Reply "Yes" when the agent responds "Instance already registered. Would you like to override existing with new registration information?"

If that does not work, try running sudo amazon-ssm-agent -clear and then run the register command again.
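If it helps, the register-then-clear-and-retry sequence could be scripted roughly as below. This is a sketch under assumptions, not an official procedure: the function names are hypothetical, it assumes amazon-ssm-agent is on the PATH (a snap install may expose the binary elsewhere), and it assumes a non-zero exit code means the registration failed. Only the -register, -code, -id, -region and -clear flags come from the commands above. It defaults to a dry run that just prints the command.

```python
import shlex
import subprocess

AGENT = "amazon-ssm-agent"  # assumption: the binary is reachable under this name


def build_register_command(code: str, activation_id: str, region: str) -> list[str]:
    """Build the register command with the flags quoted in this thread."""
    return ["sudo", AGENT, "-register",
            "-code", code, "-id", activation_id, "-region", region]


def build_clear_command() -> list[str]:
    """Build the command that clears the existing registration."""
    return ["sudo", AGENT, "-clear"]


def reregister(code: str, activation_id: str, region: str, dry_run: bool = True) -> int:
    """Try to register; on failure, clear and try once more.

    Hypothetical wrapper. With dry_run=True nothing is executed.
    """
    register = build_register_command(code, activation_id, region)
    if dry_run:
        print(shlex.join(register))
        return 0
    # stdin is inherited, so the "override existing" prompt can be answered by hand.
    result = subprocess.run(register)
    if result.returncode != 0:  # assumption: non-zero exit means failure
        subprocess.run(build_clear_command(), check=True)
        result = subprocess.run(register)
    return result.returncode


# Placeholders exactly as in the command above; substitute real activation values.
reregister("{activation-code}", "{activation-id}", "{activation-region}")
```

Passing dry_run=False runs the commands for real, so the activation code, ID and region must be replaced with the values from the new activation first.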