cds-snc / notification-planning-core

Project planning for GC Notify Core Team

Prevent the creation of additional private certificate authorities on dev destroy/create #439

Closed by ben851 1 day ago

ben851 commented 1 month ago

Description

As a developer of Notify, I want to minimize our costs when creating/deleting dev environments so that Pat stops yelling at us.

WHY are we building?

When we recreate the VPN certificate authority in AWS, it costs several hundred dollars.

WHAT are we building?

Rather than creating new private certificate authorities after aws-nuke, we should just import the existing ones.

VALUE created by our solution

This will save us several hundred dollars per month.

Acceptance Criteria

QA Steps

ben851 commented 1 month ago

We will have to rescope this card as it would be very difficult to have AWS Nuke avoid the VPN code when nuking the environment.

Instead, we can simply manage the private certificate authority, which is what is costing the big bucks. The upside is that aws-nuke wasn't deleting this resource already, so it is super simple to just import it at Terraform deploy. The downside is that we will still have to redownload the VPN config every time we nuke the dev environment.

sastels commented 1 month ago

PR merging today

ben851 commented 1 month ago

Merged, ready for QA

sastels commented 1 month ago

Steve will QA!

ben851 commented 4 weeks ago

The initial QA failed because Terragrunt wouldn't allow the import through the CLI, so I moved it to an import block. That failed on the first create run, but when it was run a second time it worked with no changes. Given this, I'm going to make the TF apply retryable, and @sastels can do a new round of destroy/create to QA again.
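For illustration only, a minimal sketch of what a retryable apply step could look like. This is not the change that was merged: the function names, the retry count, the delay, and the `terragrunt apply -auto-approve` invocation are all assumptions, and the real pipeline may handle retries in its CI configuration instead.

```python
import subprocess
import time
from typing import Callable


def run_with_retries(
    step: Callable[[], int],
    attempts: int = 3,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `step` until it returns exit code 0 or the attempts run out.

    Returns the 1-based number of the attempt that succeeded.
    Raises RuntimeError if every attempt returns a non-zero code.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_code = 0
    for attempt in range(1, attempts + 1):
        last_code = step()
        if last_code == 0:
            return attempt
        # Wait before the next try, but not after the final failure.
        if attempt < attempts:
            sleep(delay_seconds)
    raise RuntimeError(
        f"step failed {attempts} times, last exit code {last_code}"
    )


def terragrunt_apply() -> int:
    # Hypothetical invocation: the actual flags and working directory
    # depend on the repository layout and are not given in this thread.
    return subprocess.run(["terragrunt", "apply", "-auto-approve"]).returncode
```

Calling `run_with_retries(terragrunt_apply)` would rerun the apply up to three times. A blind retry only makes sense here because the second run was observed to succeed with no changes; it would mask a failure that is actually deterministic.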

sastels commented 4 weeks ago

rerunning destroy...

ben851 commented 3 weeks ago

QA on Thursday failed as well, but it seems to have been an AWS problem: rerunning the same code today works fine. I'm finishing the environment creation again and then we can try AGAIN.

ben851 commented 3 weeks ago

The delete/create script has been very fragile: sometimes it works, and sometimes random errors (never the same twice) occur. Waiting and rerunning the failed modules makes it miraculously work again. I would say proceed with QA'ing this now, since the delete/create scripts are out of scope here.

I can then move on with implementing a schedule for dev delete/create and we can see how this looks when the scripts are not executed 30 times a day.

jimleroyer commented 3 weeks ago

@sastels Ready for you to QA again!

sastels commented 3 weeks ago

Will verify next week after the destroy/create automated run

ben851 commented 2 weeks ago

Create/destroy worked - please proceed with QA

sastels commented 2 weeks ago

Looks good! :tada: