IBM / zmodstack-deploy

IBM Z & Cloud Modernization Stack deployment tools
Apache License 2.0

AWS CloudFormation Stack fails to delete if OCP nodes are in a Stopped state #38

Open ivandov opened 1 year ago

ivandov commented 1 year ago

Description

I attempted to delete one of our PR test clusters via deletion of the main stack. The nested cluster stack deleted successfully, but the nested vpc stack failed to delete, which caused the main stack to fail deletion as well.

The vpc stack fails to delete because the subnets it created are still in use by EC2 instances in the Stopped state.

Steps to Reproduce

  1. Provision an OCP cluster using our AWS CloudFormation templates
  2. Manually shut down (stop) some OCP nodes (worker or master, it shouldn't matter), or use the automated shutdown script
  3. Attempt to delete the main CloudFormation Stack
  4. Observe that deletion of the vpc stack fails.

Expected behavior

CloudFormation stack deletion should be able to handle scenarios where the OCP nodes are in a Stopped state.

Additional context

I assume there are changes we need to make to the CleanupLambda, destroy.sh or terraform destroy logic to handle this scenario.

https://github.com/IBM/zmodstack-deploy/blob/dev/aws/cloudformation/cluster.yml#L585-L643
https://github.com/IBM/zmodstack-deploy/blob/dev/aws/cloudformation/cluster.yml#L425-L431
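
One possible direction, as a rough sketch rather than the actual CleanupLambda or destroy.sh logic: before the vpc stack is deleted, look up any instances in the VPC that are still stopped and terminate them so they release the subnets. The function names below are hypothetical, and `ec2` is assumed to be a boto3 EC2 client (for example `boto3.client("ec2")`). Whether terminating the nodes is the right fix, as opposed to starting them again so the existing destroy path can run, is an open question this sketch does not settle.

```python
from typing import Any, Dict, List


def find_stopped_instance_ids(ec2: Any, vpc_id: str) -> List[str]:
    """Return IDs of stopped or stopping instances in the given VPC.

    `ec2` is assumed to behave like a boto3 EC2 client.
    """
    kwargs: Dict[str, Any] = {
        "Filters": [
            {"Name": "vpc-id", "Values": [vpc_id]},
            {"Name": "instance-state-name", "Values": ["stopped", "stopping"]},
        ]
    }
    instance_ids: List[str] = []
    while True:
        page = ec2.describe_instances(**kwargs)
        for reservation in page.get("Reservations", []):
            for instance in reservation.get("Instances", []):
                instance_ids.append(instance["InstanceId"])
        # Follow pagination until no NextToken is returned.
        token = page.get("NextToken")
        if not token:
            return instance_ids
        kwargs["NextToken"] = token


def terminate_stopped_instances(ec2: Any, vpc_id: str, dry_run: bool = True) -> List[str]:
    """Terminate stopped instances in the VPC (hypothetical helper).

    With dry_run=True (the default) nothing is terminated; the IDs that
    would be terminated are only returned.
    """
    instance_ids = find_stopped_instance_ids(ec2, vpc_id)
    if instance_ids and not dry_run:
        ec2.terminate_instances(InstanceIds=instance_ids)
    return instance_ids
```

A caller would probably also need to wait for termination to finish (boto3 provides an `instance_terminated` waiter) before retrying the stack deletion, since the subnets stay in use until the instances are actually gone.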

ivandov commented 1 year ago

The created Public Subnet also does not appear to be deleted, because Network Interfaces and ELBs still depend on it. We would need to account for any and all such dependencies.
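
To get a picture of what is still holding a subnet, something like the following could help. It is only a sketch: `subnet_blockers` is a hypothetical helper, and it assumes its inputs are the lists returned by boto3's `describe_network_interfaces` (the `NetworkInterfaces` list) and `describe_load_balancers` (the `LoadBalancerDescriptions` list from the `elb` client, or the `LoadBalancers` list from the `elbv2` client). It only reports dependencies and deletes nothing.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def subnet_blockers(
    subnet_ids: Iterable[str],
    network_interfaces: Iterable[Dict[str, Any]],
    load_balancers: Iterable[Dict[str, Any]],
) -> Dict[str, List[str]]:
    """Map each subnet ID to the ENIs and load balancers still attached to it."""
    wanted = set(subnet_ids)
    blockers: Dict[str, List[str]] = defaultdict(list)

    # Network interfaces carry their subnet directly.
    for eni in network_interfaces:
        if eni.get("SubnetId") in wanted:
            blockers[eni["SubnetId"]].append(eni["NetworkInterfaceId"])

    for lb in load_balancers:
        # Classic ELB descriptions list subnet IDs under "Subnets";
        # elbv2 load balancers list them under "AvailabilityZones".
        lb_subnets = list(lb.get("Subnets", []))
        lb_subnets += [
            zone["SubnetId"]
            for zone in lb.get("AvailabilityZones", [])
            if isinstance(zone, dict) and "SubnetId" in zone
        ]
        for subnet_id in lb_subnets:
            if subnet_id in wanted:
                blockers[subnet_id].append(lb["LoadBalancerName"])

    return dict(blockers)
```

Subnets that do not appear in the returned mapping have no dependencies of these two kinds, although other resource types could still block deletion.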