Description
I attempted to delete one of our PR test clusters by deleting the main stack. The nested cluster stack deleted successfully, but the nested vpc stack failed to delete, which caused deletion of the main stack to fail as well.
The vpc stack fails to delete because the subnets it created are still in use by EC2 instances in the Stopped state.
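To confirm this diagnosis, the instances still attached to the VPC's subnets can be listed by state. The following is a minimal sketch, assuming a boto3 EC2 client is passed in; the function name and the set of states treated as blocking are illustrative and are not taken from our templates.

```python
from collections import defaultdict

# Assumption: any instance that is not yet "terminated" can still hold a
# network interface in its subnet. "stopped" is the state hit in this issue.
BLOCKING_STATES = ["pending", "running", "stopping", "stopped", "shutting-down"]


def instances_blocking_subnets(ec2, vpc_id):
    """Return {subnet_id: [(instance_id, state), ...]} for instances in the VPC.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    kwargs = {
        "Filters": [
            {"Name": "vpc-id", "Values": [vpc_id]},
            {"Name": "instance-state-name", "Values": BLOCKING_STATES},
        ]
    }
    blocking = defaultdict(list)
    while True:
        page = ec2.describe_instances(**kwargs)
        for reservation in page.get("Reservations", []):
            for inst in reservation.get("Instances", []):
                blocking[inst.get("SubnetId")].append(
                    (inst["InstanceId"], inst["State"]["Name"])
                )
        token = page.get("NextToken")
        if not token:
            return dict(blocking)
        # Follow pagination until the API stops returning a token.
        kwargs["NextToken"] = token
```

Called with the ID of the VPC created by the vpc stack, a non-empty result containing `stopped` entries would match the failure described here.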
Steps to Reproduce
1. Provision an OCP cluster using our AWS CloudFormation templates.
2. Manually stop some OCP nodes (worker or master; it shouldn't matter), or use the automated shutdown script.
3. Attempt to delete the main CloudFormation stack.
4. Observe that deletion of the vpc stack fails.
Expected behavior
CloudFormation stack deletion should be able to handle scenarios where the OCP nodes are in a Stopped state.
Additional context
I assume there are changes we need to make to the CleanupLambda, destroy.sh or terraform destroy logic to handle this scenario.
Also, the created public subnet does not seem to be deleted because network interfaces and ELBs still depend on it. We would need to account for all such dependencies.
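As a rough illustration of what a pre-delete cleanup step might do, the sketch below terminates instances regardless of whether they are running or stopped, then reports the network interfaces and load balancers still tied to a set of subnets. It assumes boto3 clients for `ec2`, `elb` and `elbv2` are passed in; the function names are hypothetical and this is not the existing CleanupLambda or destroy.sh logic.

```python
def terminate_instances_any_state(ec2, instance_ids):
    """Terminate instances whether running or stopped, then wait for completion.

    `ec2` is assumed to be a boto3 EC2 client.
    """
    ids = list(instance_ids)
    if not ids:
        return []
    ec2.terminate_instances(InstanceIds=ids)
    # Block until every instance reports the "terminated" state.
    ec2.get_waiter("instance_terminated").wait(InstanceIds=ids)
    return ids


def subnet_dependencies(ec2, elb, elbv2, subnet_ids):
    """Report ENIs and load balancers still attached to the given subnets.

    Pagination is omitted for brevity, so only the first page of each
    describe call is inspected.
    """
    wanted = set(subnet_ids)
    enis = ec2.describe_network_interfaces(
        Filters=[{"Name": "subnet-id", "Values": sorted(wanted)}]
    ).get("NetworkInterfaces", [])
    classic = elb.describe_load_balancers().get("LoadBalancerDescriptions", [])
    modern = elbv2.describe_load_balancers().get("LoadBalancers", [])
    return {
        "network_interfaces": [eni["NetworkInterfaceId"] for eni in enis],
        "classic_elbs": [
            lb["LoadBalancerName"]
            for lb in classic
            if wanted & set(lb.get("Subnets", []))
        ],
        "elbv2": [
            lb["LoadBalancerArn"]
            for lb in modern
            if wanted & {az.get("SubnetId") for az in lb.get("AvailabilityZones", [])}
        ],
    }
```

The clients would typically come from `boto3.client("ec2")`, `boto3.client("elb")` and `boto3.client("elbv2")`. Deleting the reported load balancers and interfaces is deliberately left out, since which component owns them is not established here.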
https://github.com/IBM/zmodstack-deploy/blob/dev/aws/cloudformation/cluster.yml#L585-L643
https://github.com/IBM/zmodstack-deploy/blob/dev/aws/cloudformation/cluster.yml#L425-L431