vmware-tanzu / tanzu-framework

Tanzu Framework provides a set of building blocks to build on top of the Tanzu platform and leverages Carvel packaging and plugins to provide users with a much stronger, more integrated experience than the loose coupling and stand-alone commands of the previous generation of tools.
Apache License 2.0

Feedback during AWS deployment without cloudformation stack in place #735

Open arpithahg opened 3 years ago

arpithahg commented 3 years ago

Feedback

There are two issues with the CloudFormation stack:

  1. As of today, the UI treats CloudFormation stack creation as an optional parameter. If the stack is not in place, we still go ahead with bootstrap cluster creation, and the AWS deployment then fails with "InstanceProvisionFailed @ Machine/", which is not easy to debug.

  2. The CloudFormation stack is local to a region, whereas the resources created in it are global. Changing to another region during deployment therefore forces us to delete the CloudFormation stack from the original region; otherwise, stack creation in the new region fails because the IAM roles already exist. This prevents the user from using multiple regions.

Suggested Changes

  1. Since we get the AWS credentials during setup from the UI or CLI, we can do a basic check, using the AWS CLI, for whether the CloudFormation stack exists before we go ahead with the bootstrap cluster. This would save at least 30 minutes of deployment time, and the error message would be clear enough to act on.

  2. A new CloudFormation stack created in a different region should reuse the IAM roles and other global resources that already exist in the account.

Environment Details

miclettej commented 3 years ago

In addition, we should check whether the CloudFormation stack already exists. If the user ticks the CloudFormation stack checkbox, we try to create the stack again without first checking whether it already exists. This results in a hard failure of the deployment.
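A minimal sketch of the "check before create" behaviour described above. It assumes `cfn_client` behaves like a boto3 CloudFormation client (`create_stack` and `exceptions.AlreadyExistsException` are real boto3 names); the helper name `ensure_stack` is hypothetical, and Python is used only for brevity since the project itself would make the equivalent SDK calls in Go.

```python
def ensure_stack(cfn_client, stack_name, template_body):
    """Create the stack unless it already exists.

    Returns True if creation was requested, False if the stack was already there.
    `cfn_client` is assumed to behave like boto3.client("cloudformation").
    """
    try:
        cfn_client.create_stack(
            StackName=stack_name,
            TemplateBody=template_body,
            # The stack creates named IAM roles, so this capability is required.
            Capabilities=["CAPABILITY_NAMED_IAM"],
        )
    except cfn_client.exceptions.AlreadyExistsException:
        # An existing stack is treated as success rather than a hard failure.
        return False
    return True
```

A stack left in a failed state would still count as "existing" here, so a fuller check would probably also inspect the stack status before continuing.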

randomvariable commented 3 years ago

Since we get the AWS credentials during the setup from UI or CLI, even before we go ahead with bootstrap cluster, we can do a basic check using AWS CLI whether cloudformation stack exists.

Except you have to enumerate every single region, which might change as AWS adds/removes them, and could get blocked by permissions. Probably a better way would be to check for a known IAM resource, as these are the global resources we care about and they have a consistent name.
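As a rough illustration of checking for a known IAM resource instead of enumerating regions, something like the following could work. It assumes `iam_client` behaves like a boto3 IAM client (`get_role` and `exceptions.NoSuchEntityException` are real boto3 names). The role names are an assumption based on the upstream clusterawsadm defaults and should be verified against the template actually in use; `missing_roles` is a hypothetical helper.

```python
# Assumed role names: believed to match the upstream clusterawsadm defaults,
# but verify them against the CloudFormation template actually deployed.
EXPECTED_ROLES = (
    "control-plane.cluster-api-provider-aws.sigs.k8s.io",
    "controllers.cluster-api-provider-aws.sigs.k8s.io",
    "nodes.cluster-api-provider-aws.sigs.k8s.io",
)


def missing_roles(iam_client, role_names=EXPECTED_ROLES):
    """Return the expected IAM roles that do not exist in the account.

    IAM is global, so a single lookup per role covers every region.
    `iam_client` is assumed to behave like boto3.client("iam").
    """
    missing = []
    for name in role_names:
        try:
            iam_client.get_role(RoleName=name)
        except iam_client.exceptions.NoSuchEntityException:
            missing.append(name)
    return missing
```

An empty result would mean the prerequisites are in place; a non-empty one could be reported before the bootstrap cluster is created. Permission errors are deliberately not caught in this sketch, so they surface to the caller rather than being mistaken for a missing role.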

PS: Tanzu Framework should never be executing the aws CLI. It should either make the calls directly, or ask for changes upstream in https://github.com/kubernetes-sigs/cluster-api-provider-aws/tree/main/cmd/clusterawsadm

Furthermore, the CLI could, in co-operation with upstream changes, gather the CloudFormation events and provide clearer errors about what has happened, without the user having to go to the CloudFormation console.
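A sketch of what "gather the events and provide clearer errors" might look like. It assumes the events are dicts shaped like the `StackEvents` entries returned by boto3's `describe_stack_events` (keys `LogicalResourceId`, `ResourceType`, `ResourceStatus`, `ResourceStatusReason`); the function name `failed_events` is hypothetical.

```python
def failed_events(stack_events):
    """Return one readable line per failed resource, oldest failure first.

    `stack_events` is assumed to be the "StackEvents" list from
    describe_stack_events, which is ordered newest first.
    """
    lines = []
    # Reverse so the earliest failure, usually the root cause, comes first.
    for event in reversed(stack_events):
        status = event.get("ResourceStatus", "")
        if not status.endswith("_FAILED"):
            continue
        resource = event.get("LogicalResourceId", "?")
        kind = event.get("ResourceType", "?")
        reason = event.get("ResourceStatusReason", "no reason given")
        lines.append(f"{resource} ({kind}): {status}: {reason}")
    return lines
```

Printing these lines when stack creation fails would let the user see, for example, that an IAM role already exists, without opening the console.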

dsu-igeek commented 2 years ago

Another option would be to embed the region name in the global resource names so that we can have different CloudFormation stacks in each region. Might want to consider adding a version number as well. I had an old tkg-cloud-vmware-com CloudFormation stack lying around from previous installs in one region, which both interfered with creating stacks in new regions and broke new installs in that region (I think).
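To make the naming idea concrete, here is a small sketch. The `<base>-<region>-v<version>` format and the helper name `regional_role_name` are assumptions for illustration only, not an agreed convention. The 64-character cap is the IAM limit on role names, which a region suffix can easily exceed when the base name is long.

```python
IAM_ROLE_NAME_MAX = 64  # IAM limit on role name length


def regional_role_name(base, region, version):
    """Build a per-region, versioned name for a global IAM role.

    The "<base>-<region>-v<version>" format is a hypothetical convention.
    """
    name = f"{base}-{region}-v{version}"
    if len(name) > IAM_ROLE_NAME_MAX:
        raise ValueError(
            f"role name {name!r} is {len(name)} characters; "
            f"IAM allows at most {IAM_ROLE_NAME_MAX}"
        )
    return name
```

With names like these, stacks in different regions would no longer collide on IAM roles, and an old stack from a previous install could be told apart by its version suffix.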

randomvariable commented 2 years ago

Another option would be to embed the region name in the global resource names so that we can have different cloudformation stacks in each region. Might want to consider adding a version number as well. I had an old tkg-cloud-vmware-com cloudformation stack lying around from previous installs in one region which both interfered with creating stacks in new regions and broke new installs in that region (I think)

Possibly. We'd probably want to make the same change upstream and aid a migration of some sort.