In a previous PR, we changed how TASK_LOST is evaluated during a deploy so that certain lost tasks, specifically those lost because of the container resource limit bug, could be retried. This had an unintended side effect: other deploys with lost tasks got stuck waiting indefinitely, because the lost task was neither relaunched nor counted as failed. Now that the source of the container resource limit bug has been found, this PR removes that retry logic and counts all lost tasks as failed.
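The resulting behavior can be illustrated with a minimal sketch. This is not the project's actual code: `TaskState`, `DeployResult`, `FAILED_STATES`, and `evaluate_deploy` are hypothetical names, and the real deploy check likely considers more than is shown here. It only illustrates the rule that a lost task now fails the deploy instead of leaving it waiting.

```python
from enum import Enum


class TaskState(Enum):
    TASK_RUNNING = "TASK_RUNNING"
    TASK_FAILED = "TASK_FAILED"
    TASK_LOST = "TASK_LOST"
    TASK_STAGING = "TASK_STAGING"


class DeployResult(Enum):
    WAITING = "WAITING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


# Hypothetical: TASK_LOST is treated like any other failure, with no
# special-case retry for a particular cause of loss.
FAILED_STATES = {TaskState.TASK_FAILED, TaskState.TASK_LOST}


def evaluate_deploy(task_states, required_running):
    """Return the deploy result for the given task states (illustrative only)."""
    # Any failed or lost task fails the deploy immediately, so the deploy
    # can no longer sit waiting on a task that will never be relaunched.
    if any(state in FAILED_STATES for state in task_states):
        return DeployResult.FAILED

    running = sum(1 for state in task_states if state is TaskState.TASK_RUNNING)
    if running >= required_running:
        return DeployResult.SUCCEEDED

    # Tasks are still starting up; keep waiting.
    return DeployResult.WAITING
```

Under the earlier logic, a deploy with one running task and one lost task could stay in the waiting state forever; in this sketch it resolves to a failure on the next check.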