Closed RothAndrew closed 5 months ago
Had 2 more instances since writing this where orphaned resources were created and I had to spend time deleting them manually.
https://github.com/lianghong/delete_vpc was helpful, though it didn't work out of the box, I had to update the script a little bit to get it to work.
Agreed, it doesn't work out of the box; however, it did help find the dependencies that had issues. Ask Blake about the script we found that may help a bit more with deleting resources.
On Mon, Mar 13, 2023 at 4:35 PM Andy Roth wrote:
> https://github.com/lianghong/delete_vpc was helpful, though it didn't work out of the box, I had to update the script a little bit to get it to work.
+1 for the periodic nuke
recommend having people tag resources with narwhal-<person's name> and we exclude those from the nuke
we'll also need to add a periodic shaming to ask if peeps have things that aren't actively being developed (possibly at the end of retro)
we can also look into automatically suspending resources based on time of day (i.e. a certain list of resources tagged with narwhal-* get automatically suspended at 6pm MST)
@mjnagel do you have insights into what BB / P1 is using to manage dev costs?
> +1 for the periodic nuke
> recommend having people tag resources with narwhal-<person's name> and we exclude those from the nuke
> we'll also need to add a periodic ~shaming~ to ask if peeps have things that aren't actively being developed (possibly at the end of retro)
Works for me, though let's use something more generic, like "is-keeper == true"
> @mjnagel do you have insights into what BB / P1 is using to manage dev costs?
From a quick check with them, it sounds like they use kubecost/opencost to some extent for monitoring, plus homegrown scripts run on Lambda for cleanup. They're looking at getting kubecost SMEs to help configure things better for them.
@RothAndrew dumb question. Wipe every Monday at 1 am EST? Total account nuke no matter what? Or make it 3:30 am EST?
> @RothAndrew dumb question. Wipe every Monday at 1 am EST? Total account nuke no matter what? Or make it 3:30 am EST?
I don't particularly care how often it is done or what time of day it is done (as long as it is sometime in the dead of night for both east coast and west coast). If it doesn't cause significant disruptions we might as well do it daily.
If it does cause significant disruptions we shouldn't do it at all until we have figured out how to do it without causing significant disruptions.
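To make the "dead of night for both coasts" constraint concrete, here is a minimal sketch of a check a scheduled job could run before nuking. It is only an illustration: the function name `in_quiet_window`, the 00:00 to 05:00 window, and the use of fixed standard-time offsets (so daylight saving time is ignored) are all assumptions, not anything decided in this thread.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed standard-time offsets; daylight saving time is ignored.
EST = timezone(timedelta(hours=-5), "EST")
PST = timezone(timedelta(hours=-8), "PST")

# Assumption: "dead of night" means local hour in [0, 5) on both coasts.
QUIET_START_HOUR = 0
QUIET_END_HOUR = 5  # exclusive


def in_quiet_window(now_utc: datetime) -> bool:
    """Return True only if the local time is inside the quiet window on both coasts."""
    return all(
        QUIET_START_HOUR <= now_utc.astimezone(tz).hour < QUIET_END_HOUR
        for tz in (EST, PST)
    )


# 3:30 am EST is 08:30 UTC, which is 00:30 PST.
print(in_quiet_window(datetime(2023, 1, 16, 8, 30, tzinfo=timezone.utc)))  # True
# 1:00 am EST is 06:00 UTC, which is still 22:00 PST on the previous day.
print(in_quiet_window(datetime(2023, 1, 16, 6, 0, tzinfo=timezone.utc)))  # False
```

Under these assumptions, 3:30 am EST falls inside the window on both coasts, while 1 am EST is still 10 pm on the west coast.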
Persona
I'm a maintainer of this repo. I'm submitting this on behalf of Defense Unicorns leadership, who want to ensure that the money we spend in our dev/test AWS account(s) is being spent well.
Description
Periodically (frequency TBD), automatically destroy all resources in our dev/test AWS account that aren't specifically identified as being permanent resources.
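As a rough sketch of what "specifically identified as being permanent" could mean in practice, the snippet below filters a resource inventory by a keeper tag, along the lines of the "is-keeper == true" suggestion from the discussion. The tag key, the inventory shape (dicts with `id` and `tags`), and the function names are hypothetical; a real job would build the inventory from the AWS APIs or delegate the whole thing to a tool like aws-nuke.

```python
from typing import Iterable

# Hypothetical tag key, borrowed from the "is-keeper == true" suggestion.
KEEPER_TAG = "is-keeper"


def is_keeper(tags: dict[str, str]) -> bool:
    """A resource is permanent only if it carries the keeper tag set to 'true'."""
    return tags.get(KEEPER_TAG, "").strip().lower() == "true"


def select_for_deletion(resources: Iterable[dict]) -> list[str]:
    """Return the ids of all resources that are not marked as keepers."""
    return [r["id"] for r in resources if not is_keeper(r.get("tags", {}))]


# Illustrative inventory; the ids and tags are made up.
resources = [
    {"id": "vpc-orphaned", "tags": {}},
    {"id": "role-github-actions", "tags": {"is-keeper": "true"}},
    {"id": "eks-test", "tags": {"owner": "someone"}},
]
print(select_for_deletion(resources))  # ['vpc-orphaned', 'eks-test']
```

Defaulting to deletion when the tag is missing matches the intent here: anything not explicitly marked permanent is treated as disposable.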
Use Case
This is needed because we frequently get orphaned resources in our AWS account. A big part of what we do is making rapid changes to Terraform code. We test those changes frequently, and when tests fail, there is a chance that the resources don't get cleaned up properly.
Impact
According to the billing console, the stuff that is running in the account right now is costing about $100 per day. I don't believe we have any tests actively running in the account right now, so the likelihood is that most of that $100 per day is from orphaned resources that haven't been cleaned up yet.
The impact is that we either continue to "light dollar bills on fire" or force members of the team to keep manually going through and deleting resources, which is labor-intensive and prone to mistakes.
Completion
Additional Context
Original description:
My session token expired in the middle of an apply and I lost the terraform state. I'm now going through and having to delete hundreds of things manually.
The AWS account we are using doesn't have anything permanent in it. We should set up the ability to nuke all resources in the account (with perhaps just a few exceptions, like the GitHub Actions auth provider and role)
https://github.com/rebuy-de/aws-nuke works well for this kind of thing.