ECS deployment succeeds despite Load Balancer health "Unknown"

jwach commented 2 months ago

Issue Summary:

A pipeline deployment with the ECS provider succeeds as soon as the ECS task is up and healthy, ignoring the fact that the health status of the load balancer's Target Group is still "Unknown".

Cloud Provider(s):

AWS ECS

Environment:

Spinnaker 1.33.0 (self-hosted on Ubuntu).

Feature Area:

Pipelines.

Description:

The application is configured with the option "Consider only cloud provider health when executing tasks" disabled, so it should take the Load Balancer's Target Group health into account when deploying. [Screenshot 2024-04-18 at 15 18 07]

During pipeline deploy, when the new server group is created, Spinnaker proceeds to destroy the old server group as soon as the ECS task health is Up, even though the Load Balancer health is still Unknown. [Screenshot 2024-04-18 at 15 20 07]

This leads to traffic being directed to unhealthy instances and alerts from CloudWatch on HealthyHostCount being 0.
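To confirm the load balancer side independently of Spinnaker, the target states can be counted from a boto3 `elbv2` `describe_target_health` response. This is only a rough sketch: the helper name `healthy_host_count` and the sample data are made up for illustration, and the count is meant to approximate what the CloudWatch HealthyHostCount metric reports.

```python
def healthy_host_count(response: dict) -> int:
    """Count targets whose state is 'healthy' in a describe_target_health response.

    `response` is expected to have the shape returned by
    boto3.client("elbv2").describe_target_health(TargetGroupArn=...).
    """
    descriptions = response.get("TargetHealthDescriptions", [])
    return sum(
        1
        for d in descriptions
        if d.get("TargetHealth", {}).get("State") == "healthy"
    )


# Hypothetical sample: one target still registering, one already healthy.
sample_response = {
    "TargetHealthDescriptions": [
        {"Target": {"Id": "10.0.0.1", "Port": 8080}, "TargetHealth": {"State": "initial"}},
        {"Target": {"Id": "10.0.0.2", "Port": 8080}, "TargetHealth": {"State": "healthy"}},
    ]
}
print(healthy_host_count(sample_response))
```

A count of 0 for the new tasks at the moment the old server group is destroyed would match the alerts described above.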

The ECS task definition does not have its own health check defined.

This used to work correctly, but it broke after I made a long-overdue upgrade from version 1.20 to 1.33.0.

Expected behaviour:

The old server group is destroyed only after both health checks for the new server group are "Up".
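To make the expected gating concrete, here is a minimal sketch of the decision I would expect. It is not Spinnaker's actual implementation: the function name, the `provider_only` flag (standing in for the "Consider only cloud provider health when executing tasks" option), and the health keys `ecs` and `loadBalancer` are all hypothetical.

```python
from typing import Mapping


def safe_to_destroy_old_group(health: Mapping[str, str], provider_only: bool) -> bool:
    """Decide whether the old server group may be destroyed.

    `health` maps a health source (hypothetical keys: "ecs", "loadBalancer")
    to its reported state, e.g. "Up", "Down" or "Unknown".
    """
    if provider_only:
        # Only the cloud provider (ECS task) health is considered.
        return health.get("ecs") == "Up"
    # Otherwise every reported health source must be "Up".
    return bool(health) and all(state == "Up" for state in health.values())


# The situation in this issue: ECS is Up, the load balancer is still Unknown,
# and the provider-only option is disabled.
print(safe_to_destroy_old_group({"ecs": "Up", "loadBalancer": "Unknown"}, provider_only=False))
```

With the option disabled, the call above should return False until the load balancer health also reports "Up".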

Steps to Reproduce:

1. Create a deployment pipeline with the ECS provider and the Highlander strategy.
2. Execute the pipeline.

jwach commented 2 months ago

The pipeline JSON also has the health check type set to EC2.

spinnakerbot commented 1 month ago

This issue hasn't been updated in 45 days, so we are tagging it as 'stale'. If you want to remove this label, comment:

@spinnakerbot remove-label stale

saranyatavva commented 4 weeks ago

Hi, we are also facing the same issue. Deployment to ECS succeeds even when the LB health is Unknown. We have settings similar to the above ("Consider only cloud provider health when executing tasks" is disabled on our end). Does anyone know why Spinnaker is not checking the LB health status?
