aws / amazon-ecs-service-connect-agent

Amazon ECS Service Connect Agent
Apache License 2.0

Change SC agent health compute logic to health flip #48

Closed Penghaow closed 11 months ago

Penghaow commented 11 months ago

Summary

The AppNet agent currently monitors its connection to the EMS and marks itself as unhealthy only after it has been disconnected from the control plane EMS for 3 hours. If the relay container exits and fails to restart, possibly due to resource constraints, the AppNet agent loses its connection to the EMS, but the task is not marked as unhealthy until that 3-hour grace period has elapsed.

This change switches the SC health check compute logic to a health flip with an initial threshold of 5, so that the impact of a relay agent failure is detected earlier.
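
For illustration only, here is a minimal sketch of one plausible reading of "health flip". The PR text does not define the term, so this assumes the status flips to unhealthy after a threshold of consecutive failed connectivity checks and flips back on the next successful one. The names `HealthFlip`, `NewHealthFlip`, and `Observe` are hypothetical and are not taken from the agent's code.

```go
package main

// HealthFlip is a hypothetical tracker, not the agent's actual type.
// Assumption: "threshold" counts consecutive failed connectivity checks.
type HealthFlip struct {
	threshold int  // consecutive failures required to flip to unhealthy
	failures  int  // length of the current run of consecutive failures
	healthy   bool // current reported status
}

// NewHealthFlip returns a tracker that starts out healthy.
func NewHealthFlip(threshold int) *HealthFlip {
	return &HealthFlip{threshold: threshold, healthy: true}
}

// Observe records the result of one connectivity check and returns the
// resulting status. A success resets the failure run and restores health;
// a failure extends the run and flips the status once the run reaches
// the threshold.
func (h *HealthFlip) Observe(connected bool) bool {
	if connected {
		h.failures = 0
		h.healthy = true
		return h.healthy
	}
	h.failures++
	if h.failures >= h.threshold {
		h.healthy = false
	}
	return h.healthy
}
```

Under these assumptions, `NewHealthFlip(5)` stays healthy through four failed checks in a row and reports unhealthy on the fifth, so detection time depends on the check interval rather than on a fixed 3-hour window.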

SIM: https://sim.amazon.com/issues/LATTICE-BE-10167

Testing

New tests cover the changes: YES. Built manually on a local machine, and it works fine.

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.