aws / amazon-ssm-agent

An agent to enable remote management of your EC2 instances, on-premises servers, or virtual machines (VMs).
https://aws.amazon.com/systems-manager/
Apache License 2.0
1.04k stars 322 forks

Unable to send instance logs to CloudWatch Logs (SSM Agent) #382

Closed tangzhiqiangh closed 2 years ago

tangzhiqiangh commented 3 years ago

error
2021-05-31 01:12:04,649 [43] ERROR [aws:cloudWatch] - Unable to upload log events to the log stream jump1 in the log group 'Windows-Event': A WebException with status SecureChannelFailure was thrown. :
2021-05-31 01:12:04,650 [43] WARN [aws:cloudWatch] - Failed to UploadLogs to CloudWatch Log Service for log group Windows-Event and log stream jump1.
2021-06-01 06:29:27,585 [1] INFO [Framework] - Configuration does not have AccessKey or SecretKey , try to use IAM credentials instead.
2021-06-01 06:29:27,606 [1] INFO [Framework] - Configuration does not have AccessKey or SecretKey , try to use IAM credentials instead.
2021-06-01 06:29:27,649 [1] INFO [Framework] - aws:cloudWatch plugin configuration verified
2021-06-01 06:29:27,655 [1] INFO [aws:cloudWatch] - CloudWatch execution started.
2021-06-01 06:29:27,658 [1] INFO [aws:cloudWatch] - Starting the CloudWatch plug-in
2021-06-01 06:29:27,660 [1] INFO [aws:cloudWatch] - Starting the CloudWatch Logs engine
2021-06-01 06:29:27,673 [16] INFO [aws:cloudWatch] - Registry key to store EventRecordId does not exist. The key will be recreated and only event logs generated from 1 minute ago will be uploaded.
2021-06-01 06:29:27,675 [19] INFO [aws:cloudWatch] - Registry key to store EventRecordId does not exist. The key will be recreated and only event logs generated from 1 minute ago will be uploaded.
2021-06-01 06:29:27,673 [17] INFO [aws:cloudWatch] - Registry key to store EventRecordId does not exist. The key will be recreated and only event logs generated from 1 minute ago will be uploaded.
2021-06-01 06:29:27,673 [18] INFO [aws:cloudWatch] - Registry key to store EventRecordId does not exist. The key will be recreated and only event logs generated from 1 minute ago will be uploaded.
2021-06-01 06:29:27,673 [15] INFO [aws:cloudWatch] - Registry key to store EventRecordId does not exist. The key will be recreated and only event logs generated from 1 minute ago will be uploaded.
2021-06-01 06:29:27,693 [_Worker-1] INFO [aws:cloudWatch] - CloudWatch execution started.
2021-06-01 06:29:27,695 [_Worker-1] INFO [aws:cloudWatch] - Starting the CloudWatch plug-in
2021-06-01 06:29:27,696 [_Worker-1] INFO [aws:cloudWatch] - Starting the CloudWatch Logs engine
2021-06-01 06:29:28,129 [18] ERROR [aws:cloudWatch] - Unable to upload log events to the log stream test1 in the log group 'Windows-Event': A WebException with status SecureChannelFailure was thrown. :
2021-06-01 06:29:28,131 [18] WARN [aws:cloudWatch] - Failed to UploadLogs to CloudWatch Log Service for log group Windows-Event and log stream test1.

ferkhat-aws commented 3 years ago

Hello. Can you tell us which version of SSM Agent you have installed?

tangzhiqiangh commented 3 years ago

Hello. Can you tell us which version of SSM Agent you have installed?

Amazon SSM Agent 3.0.222.0

tangzhiqiangh commented 3 years ago

Hello. Can you tell us which version of SSM Agent you have installed?

Hello, how can I solve this?

danr-amz commented 3 years ago

The SecureChannelFailure error indicates a TLS connection failure. Can you verify that the CA certificates listed on the page below are installed on your system?

https://www.amazontrust.com/repository/

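If you have these CA certs installed and the problem persists, can you please provide the following information to help us troubleshoot? Thanks.

  • Operating system version
  • Host type (e.g. EC2 instance, on-premises host)
  • Is there anything between this host and the CloudWatch Logs API service endpoint (e.g. an HTTP/HTTPS proxy or a VPC endpoint)?

    • The CloudWatch Logs API service endpoint should be at logs.[REGION].amazonaws.com. For example, if your host is in the us-east-1 region, the endpoint should be logs.us-east-1.amazonaws.com.
  • Has CloudWatch log upload ever worked on this host? If so, when did it stop working? Did anything else change when the logs stopped working?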
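As a rough way to reproduce the TLS handshake outside the agent, here is a minimal Python sketch. It assumes the regional endpoint pattern logs.[REGION].amazonaws.com and that Python is available on the host; the helper names `logs_endpoint` and `check_tls` are made up for this example. The SSM Agent CloudWatch plugin uses its own TLS stack, so a passing result here does not prove the plugin will connect. A certificate verification failure, though, would point toward missing CA certificates or an intercepting proxy.

```python
import socket
import ssl


def logs_endpoint(region: str) -> str:
    """Build the CloudWatch Logs endpoint hostname for a region (assumed pattern)."""
    return f"logs.{region}.amazonaws.com"


def check_tls(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Attempt a TLS handshake against host:port and describe the outcome."""
    # create_default_context() verifies the server certificate against the
    # trust store that this Python installation is configured to use.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return f"ok: negotiated {tls.version()}"
    except ssl.SSLCertVerificationError as exc:
        # The certificate chain could not be validated (e.g. missing root CA).
        return f"certificate verification failed: {exc.verify_message}"
    except ssl.SSLError as exc:
        # Any other handshake problem, such as a protocol version mismatch.
        return f"TLS handshake failed: {exc}"
    except OSError as exc:
        # DNS failure, refused connection, timeout, and similar network errors.
        return f"connection failed: {exc}"
```

For a host in us-east-1, calling `check_tls(logs_endpoint("us-east-1"))` would attempt the handshake and return a one-line description of what happened.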

danr-amz commented 3 years ago

As a side note, the current recommendation is to migrate away from using SSM Agent to send logs to CloudWatch, and use the unified CloudWatch agent instead. Details here: https://docs.aws.amazon.com/systems-manager/latest/userguide/monitoring-cloudwatch-agent.html

tangzhiqiangh commented 3 years ago

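The SecureChannelFailure error indicates a TLS connection failure. Can you verify that the CA certificates listed on the page below are installed on your system?

https://www.amazontrust.com/repository/

If you have these CA certs installed and the problem persists, can you please provide the following information to help us troubleshoot? Thanks.

  • Operating system version
  • Host type (e.g. EC2 instance, on-premises host)
  • Is there anything between this host and the CloudWatch Logs API service endpoint (e.g. an HTTP/HTTPS proxy or a VPC endpoint)?

    • The CloudWatch Logs API service endpoint should be at logs.[REGION].amazonaws.com. For example, if your host is in the us-east-1 region, the endpoint should be logs.us-east-1.amazonaws.com.
  • Has CloudWatch log upload ever worked on this host? If so, when did it stop working? Did anything else change when the logs stopped working?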

The CloudWatch agent is still running and uploading logs, but the SSM agent cannot run. Several servers have stopped, but several others are still running and uploading logs using both the SSM agent and the CloudWatch agent.

tangzhiqiangh commented 3 years ago

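As a side note, the current recommendation is to migrate away from using SSM Agent to send logs to CloudWatch, and use the unified CloudWatch agent instead. Details here: https://docs.aws.amazon.com/systems-manager/latest/userguide/monitoring-cloudwatch-agent.html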

The CloudWatch agent is still running and uploading logs, but the SSM agent cannot run. Several servers have stopped, but several others are still running and uploading logs using both the SSM agent and the CloudWatch agent. The CloudWatch agent configuration reports success, so what is the problem? There is no secret key information in it. Where do I modify that?

PS C:\Program Files\Amazon\AmazonCloudWatchAgent> .\amazon-cloudwatch-agent-ctl.ps1 -a fetch-config -m ec2 -c file:config.json -s
Successfully fetched the config and saved in C:\ProgramData\Amazon\AmazonCloudWatchAgent\Configs\file_config.json.tmp
Start configuration validation...
2021/06/03 10:17:33 Reading json config file path: C:\ProgramData\Amazon\AmazonCloudWatchAgent\Configs\file_config.json.tmp ...
Valid Json input schema.
No csm configuration found.
Configuration validation first phase succeeded
Configuration validation second phase succeeded
Configuration validation succeeded
AmazonCloudWatchAgent has been stopped
AmazonCloudWatchAgent has been started

PS C:\Program Files\Amazon\AmazonCloudWatchAgent> & $Env:ProgramFiles\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1 -m ec2 -a status
{
  "status": "running",
  "starttime": "2021-06-03T10:17:33",
  "configstatus": "configured",
  "cwoc_status": "stopped",
  "cwoc_starttime": "",
  "cwoc_configstatus": "not configured",
  "version": "1.247347.5b250583"
}

But no logs are being uploaded. How can I solve this?

danr-amz commented 2 years ago

Since logs are being uploaded by both the SSM agent and the CloudWatch agent, we should check the config on the host to make sure that the SSM agent does not have CloudWatch log upload enabled.

Can you please check the file %programfiles%\Amazon\SSM\Plugins\awsCloudWatch\AWS.EC2.Windows.CloudWatch.json? If the file does not exist, that is good. If the file does exist, please ensure that you see "IsEnabled": false.

Also, can you check the file %programfiles%\Amazon\SSM\seelog.xml? If the file does not exist, that is fine. If the file does exist, then it should NOT contain the following line:

<custom name="cloudwatch_receiver" formatid="fmtdebug" data-log-group="your-CloudWatch-log-group-name"/>

For reference, the full steps to migrate log upload from SSM agent to CloudWatch agent are here: https://docs.aws.amazon.com/systems-manager/latest/userguide/monitoring-cloudwatch-agent.html#monitoring-cloudwatch-agent-migrate
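The two file checks above could be scripted roughly as follows. This is a sketch under assumptions, not an official tool: it assumes "IsEnabled" is a top-level key in AWS.EC2.Windows.CloudWatch.json, treats a missing key as "not enabled", and uses helper names (`plugin_upload_enabled`, `seelog_has_cloudwatch_receiver`, `check_ssm_dir`) invented for this example.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path

# Paths relative to the SSM Agent install directory (%programfiles%\Amazon\SSM).
PLUGIN_CONFIG = Path("Plugins") / "awsCloudWatch" / "AWS.EC2.Windows.CloudWatch.json"
SEELOG_CONFIG = Path("seelog.xml")


def plugin_upload_enabled(config_text: str) -> bool:
    """Return True if the plugin config has "IsEnabled": true.

    Assumption: IsEnabled is a top-level key; a missing key counts as disabled.
    """
    config = json.loads(config_text)
    return bool(config.get("IsEnabled", False))


def seelog_has_cloudwatch_receiver(seelog_xml: bytes) -> bool:
    """Return True if seelog.xml contains <custom name="cloudwatch_receiver" .../>.

    XML comments are skipped by the parser, so a commented-out line is not counted.
    """
    root = ET.fromstring(seelog_xml)
    return any(el.get("name") == "cloudwatch_receiver" for el in root.iter("custom"))


def check_ssm_dir(ssm_dir: Path) -> list:
    """Collect findings that suggest SSM Agent is still set up to upload logs."""
    findings = []
    plugin_file = ssm_dir / PLUGIN_CONFIG
    if plugin_file.exists():
        # utf-8-sig tolerates a byte-order mark, which Windows editors often add.
        if plugin_upload_enabled(plugin_file.read_text(encoding="utf-8-sig")):
            findings.append(f'{plugin_file}: "IsEnabled" is true')
    seelog_file = ssm_dir / SEELOG_CONFIG
    if seelog_file.exists():
        if seelog_has_cloudwatch_receiver(seelog_file.read_bytes()):
            findings.append(f"{seelog_file}: cloudwatch_receiver is configured")
    return findings
```

Pointing `check_ssm_dir` at the `Amazon\SSM` directory under `%programfiles%` returns an empty list when neither file enables log upload, which matches the "file does not exist, that is good" cases described above.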

VishnuKarthikRavindran commented 2 years ago

Feel free to reopen if the issue persists