winglang / wing

A programming language for the cloud ☁️ A unified programming model, combining infrastructure and runtime code into one language ⚡
https://winglang.io

SDK Spec Tests Leave Orphaned Resources When Interrupted #5921

Open hasanaburayyan opened 6 months ago

hasanaburayyan commented 6 months ago

When the SDK spec tests run, they use Terraform to deploy infrastructure; once the tests complete, they run terraform destroy to clean up those resources.

In our current release workflow, we run the SDK spec tests as part of the release quality checks. However, when releases happen back to back, the workflow cancels the running jobs and starts a new release workflow (bundling the releases together). As a result, terraform destroy may never run, which leaves orphaned resources behind in our AWS account.

In addition, the state files for these spec tests use local backends, so once the action run is gone we can't even recover the state file to perform the destroys manually, which makes cleanup cumbersome.

Possible Solutions

  1. Enhance wing test to handle SIGTERM and run cleanup
  2. Run nuke scripts in our CI accounts on some cadence
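
To make option 1 concrete, here is a minimal sketch of the general pattern in Python. It is not how wing test is implemented; the names run_with_cleanup, terraform_destroy, and the workdir argument are hypothetical and only illustrate the idea. The sketch converts SIGTERM/SIGINT into an exception so that a finally block still runs the destroy step, and it ignores further signals while the destroy is in progress.

```python
import signal
import subprocess


class Interrupted(Exception):
    """Raised in the main flow when SIGTERM or SIGINT arrives."""

    def __init__(self, signum):
        super().__init__(signal.Signals(signum).name)
        self.signum = signum


def _raise_interrupted(signum, frame):
    raise Interrupted(signum)


def terraform_destroy(workdir):
    """Hypothetical cleanup step: destroy everything in `workdir`."""
    # -chdir and -auto-approve are standard Terraform CLI options.
    return subprocess.run(
        ["terraform", f"-chdir={workdir}", "destroy", "-auto-approve"],
        check=False,
    ).returncode


def run_with_cleanup(deploy, run_tests, destroy):
    """Run deploy and tests; always call destroy, even when interrupted.

    Returns 0 on success, or 128 + signal number if a signal cut the run short.
    Must be called from the main thread, since it installs signal handlers.
    """
    watched = (signal.SIGTERM, signal.SIGINT)
    previous = {sig: signal.signal(sig, _raise_interrupted) for sig in watched}
    try:
        deploy()
        run_tests()
        return 0
    except Interrupted as exc:
        return 128 + exc.signum
    finally:
        # Ignore repeated signals so the destroy itself is not aborted.
        for sig in watched:
            signal.signal(sig, signal.SIG_IGN)
        try:
            destroy()
        finally:
            # Restore whatever handlers were installed before this call.
            for sig, handler in previous.items():
                signal.signal(sig, handler)
```

A handler like this only helps if the process gets enough time to finish the destroy before it is force-killed, so it complements a periodic cleanup (option 2) rather than replacing it.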

Slack thread: https://winglang.slack.com/archives/C04AUK4P5N2/p1710264284126939?thread_ts=1710264251.032359&cid=C04AUK4P5N2

hasanaburayyan commented 6 months ago

Even if we run nuke scripts, we probably still want wing test to handle SIGTERM, as this could happen even outside of our release process.

staycoolcall911 commented 6 months ago

I'll just add a list @tsuf239 created at some point:

tsuf239 commented 6 months ago

I'm currently trying to do a manual cleanup every 2 weeks, and I have scripts that can be adjusted to run daily (excluding resources created on the current day each time).
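
The "excluding the current day" rule can be sketched as a small date filter. The actual cleanup scripts are not shown in this thread, so the resource shape below (dicts with "name" and "created_at" keys) and the function names are assumptions made purely for illustration. Skipping anything created today presumably avoids deleting resources that belong to a test run still in progress.

```python
from datetime import date, datetime, timezone


def is_stale(created_at, today):
    """True when a timezone-aware `created_at` falls before `today` in UTC."""
    return created_at.astimezone(timezone.utc).date() < today


def select_stale(resources, today=None):
    """Return the resources that are safe to delete in a daily cleanup.

    `resources` is a hypothetical list of dicts with a timezone-aware
    "created_at" datetime; anything created on the current UTC day is kept.
    """
    if today is None:
        today = datetime.now(timezone.utc).date()
    return [r for r in resources if is_stale(r["created_at"], today)]
```

A real script would feed this from the cloud provider's listing API and pass the selected resources to the matching delete calls.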

github-actions[bot] commented 3 months ago

Hi,

This issue hasn't seen activity in 90 days. Therefore, we are marking this issue as stale for now. It will be closed after 7 days. Feel free to re-open this issue when there's an update or relevant information to be added. Thanks!

tsuf239 commented 3 months ago

I already made one action for cleaning up leftover Azure resources :) #6666. I'll make one for AWS as well (and for GCP once it's included in the spec tests).