aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[ECS] [request] How to see the detail health check output if aws console shows `failed container health checks` #1553

Open Colstuwjx opened 3 years ago

Colstuwjx commented 3 years ago

Summary

I'm using the AWS ECS service. When a service health check fails, I ONLY see `failed container health checks` in the console, and no detailed log is available.

Description

The AWS console only shows `failed container health checks`, but we need the detailed output. I found that the code does have an output field. Could we get the detailed output from the console or through the API?
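For reference, here is a minimal sketch of what the API currently returns. It assumes boto3 is installed and credentials are configured; the helper names `summarize_container_health` and `fetch_container_health` are hypothetical. To my knowledge, `DescribeTasks` reports a per-container `healthStatus` plus stop reasons, but not the output of the health check command itself, which is the gap this issue is about.

```python
from typing import Any, Dict, List


def summarize_container_health(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a DescribeTasks response into one row per container."""
    rows = []
    for task in response.get("tasks", []):
        for container in task.get("containers", []):
            rows.append({
                "task": task.get("taskArn"),
                "container": container.get("name"),
                # HEALTHY, UNHEALTHY, or UNKNOWN; no probe output is included
                "health": container.get("healthStatus", "UNKNOWN"),
                "exit_code": container.get("exitCode"),
                "reason": container.get("reason"),
                "task_stopped_reason": task.get("stoppedReason"),
            })
    return rows


def fetch_container_health(cluster: str, task_arns: List[str]) -> List[Dict[str, Any]]:
    """Call ECS DescribeTasks and summarize the result (hypothetical helper)."""
    import boto3  # imported lazily so the summarizer works without AWS access

    ecs = boto3.client("ecs")
    response = ecs.describe_tasks(cluster=cluster, tasks=task_arns)
    return summarize_container_health(response)
```

Stopped tasks are only returned by `DescribeTasks` for a limited time after they stop, so this would need to run shortly after a failure.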

Expected Behavior

We should be able to see the detailed log of the failed health check, e.g. curl 127.0.0.1:80 connection refused ..., rather than a single line saying `failed container health checks`. Without this information, people don't know how to fix the issue...

Observed Behavior

The AWS console only shows `failed container health checks`.

Environment Details

Supporting Log Snippets

ubhattacharjya commented 2 years ago

Hi @Colstuwjx ,

Would you be able to run the ECS logs collector (https://github.com/aws/amazon-ecs-logs-collector) on the affected instance? You can email the bundle to ecs-agent-external@amazon.com. That would help us understand why the health check failed.

Colstuwjx commented 2 years ago

I'm running the ECS service with the Fargate launch type, so there is no way to access the node. Is running the ECS agent locally supported? I'd like to run the service and the ECS agent locally and inject some debug code to dig into the internals.

yinyic commented 2 years ago

Hi @Colstuwjx ,

Please check out the troubleshooting guide at https://aws.amazon.com/premiumsupport/knowledge-center/ecs-fargate-health-check-failures/ (see the "Troubleshoot failed container health checks" section). Specifically, running the task with the EC2 launch type lets you access the underlying instance.
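Once on the underlying instance, the Docker daemon keeps the recent health probe results for each container, including the command output. The following is a rough sketch that reads them with `docker inspect`; it assumes Docker CLI access on the instance, and the helper names `parse_health_log` and `inspect_health` are hypothetical.

```python
import json
import subprocess
from typing import Any, Dict, List


def parse_health_log(inspect_json: str) -> List[Dict[str, Any]]:
    """Extract health probe results from `docker inspect` JSON output."""
    entries = []
    # `docker inspect` prints a JSON array with one object per container
    for container in json.loads(inspect_json):
        health = (container.get("State") or {}).get("Health") or {}
        for probe in health.get("Log") or []:
            entries.append({
                "exit_code": probe.get("ExitCode"),
                "output": (probe.get("Output") or "").strip(),
                "start": probe.get("Start"),
            })
    return entries


def inspect_health(container_id: str) -> List[Dict[str, Any]]:
    """Run `docker inspect` for one container and return its probe log."""
    result = subprocess.run(
        ["docker", "inspect", container_id],
        capture_output=True, text=True, check=True,
    )
    return parse_health_log(result.stdout)
```

Docker only retains the last few probe results, and they disappear once the container is removed, so this has to be run while the failing container still exists.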

Please let us know if you have any questions about the troubleshooting steps.

Colstuwjx commented 2 years ago

Hi @yinyic ,

Thanks for your reply! I'm running the ECS service with the Fargate launch type, so I can't read the detailed log by exec'ing into the EC2 instance. Besides, I can't see any logs in CloudWatch Logs because the service's containers are stuck in a failure loop (start -> health check failed -> unhealthy -> killed -> start).

A suggestion: the AWS console could show more detailed logs, e.g. failed container health check: curl 127.0.0.1:80 connection refused .... I'm not sure if it's feasible to make this change, haha.
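Until something like that exists, one possible workaround on Fargate is to have the health check command write its own output to the container's main log stream and to add a grace period so the task is not killed during startup. This is only a sketch under several assumptions: the container uses a log driver such as `awslogs`, the application runs as PID 1, and `/proc/1/fd/1` is writable from the probe. The function name `build_health_check` and the default values are illustrative; the dictionary keys follow the `healthCheck` object of an ECS container definition.

```python
from typing import Any, Dict


def build_health_check(probe: str, start_period: int = 60) -> Dict[str, Any]:
    """Build an ECS container healthCheck block that logs the probe output."""
    # Append the probe's stdout and stderr to PID 1's stdout so the
    # container's log driver can capture it (assumption: PID 1 is the app
    # and /proc/1/fd/1 is writable from the health check process).
    command = f"{probe} >> /proc/1/fd/1 2>&1 || exit 1"
    return {
        "command": ["CMD-SHELL", command],
        "interval": 30,               # seconds between probes
        "timeout": 5,                 # seconds before a probe counts as failed
        "retries": 3,                 # consecutive failures before UNHEALTHY
        "startPeriod": start_period,  # grace period before failures count
    }


health_check = build_health_check("curl -f http://127.0.0.1:80/")
```

The resulting dictionary would go under `healthCheck` in the container definition when registering the task definition. Whether the probe output actually reaches CloudWatch Logs depends on the image and log configuration, so treat this as something to try rather than a guaranteed fix.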

fenxiong commented 2 years ago

Hi, sorry for the late update. This looks like a feature request for the ECS console and/or the ECS service scheduler. I'm going to transfer the issue to the containers roadmap for tracking.

ducmthai commented 2 years ago

We also have the same issue with a critical application running in Fargate mode, although the task CPU and memory utilization are just below 30%. We opened a support ticket with AWS and the representative sent us here to get more attention on this particular issue. - Thanks