Open michaelpetersubc opened 4 years ago
Hi, @michaelpetersubc!
As far as I can see, deploying again recreates everything. For example, the api gateway would change.
Could you please provide the steps you used to deploy DCE? There are ways to make updates in place and even tear down (destroy) what you created, but the exact steps depend on how you deployed DCE.
Best regards, @nathanagood
Thanks @nathanagood
If I am understanding the question correctly, I followed the quickstart: dce init followed by dce deploy. After that I copied the gateway url to .dce.yaml, set up the child account, created a dce account using the child account.
My current plan is to try to manually remove everything at aws (except the child account), recompile dce, then start again. If there is a better way to set it up in order to make updates in place, I'll do that.
Hi, @michaelpetersubc. Un-deployment is a limitation in dce CLI versions v0.3.1 and below. We have a solution for destroying resources deployed by the newer versions of dce in the process of being released.
In the meantime, we are putting a solution together to un-deploy the DCE resources created by these earlier versions. We will share that with you as soon as we have it.
Thanks @nathanagood, that answers my question. I'll watch for the un-deploy feature. In the meantime, I'll try to resolve it manually; I might learn something.
Hi @michaelpetersubc, we’ve put together an example script for deleting resources from older versions of DCE. It will delete any resources tagged AppName=DCE or with identifiers containing the unique namespace that DCE attaches to everything it deploys. Here are the steps for using it:
accounts-dce-{namespace}
account-reset-codebuild-dce-{namespace}
populate-reset-queue-dce-{namespace}
Run ./delete-dce.sh [region] [namespace]. It is important to use the correct namespace, as miscellaneous resources containing this substring will be deleted. For example, if you type ./delete-dce.sh us-east-1 inst, then miscellaneous resources containing the substring inst in their arn/name might be deleted.
Running bulk delete operations against an AWS account is risky, particularly when you have other things in the account that you don’t want deleted. We recommend reading through this script or using it as a guide if you have concerns about accidentally deleting other resources in your account.
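To make the substring risk concrete, here is a minimal sketch of a guard you could put in front of the delete script. The function names and the 8-character minimum are assumptions made for illustration, not part of DCE or of delete-dce.sh. The listing step uses the AWS CLI Resource Groups Tagging API and only shows resources tagged AppName=DCE; it will not show resources that would match on the namespace substring alone.

```shell
#!/usr/bin/env bash
# Hypothetical guard around delete-dce.sh. The 8-character minimum is an
# arbitrary assumption chosen for illustration, not a DCE rule.

# Reject namespaces short enough to match unrelated resource names.
check_namespace() {
  local ns="$1"
  if [ "${#ns}" -lt 8 ]; then
    echo "namespace '$ns' is short and may match unrelated resources" >&2
    return 1
  fi
  return 0
}

# Print the ARNs of resources tagged AppName=DCE so they can be reviewed.
list_tagged_resources() {
  local region="$1"
  aws resourcegroupstaggingapi get-resources \
    --region "$region" \
    --tag-filters Key=AppName,Values=DCE \
    --query 'ResourceTagMappingList[].ResourceARN' \
    --output text
}

# Review first, then hand over to the script from this thread.
safe_delete() {
  local region="$1" ns="$2"
  check_namespace "$ns" || return 1
  list_tagged_resources "$region" || return 1
  ./delete-dce.sh "$region" "$ns"
}
```

With this in place, something like safe_delete us-east-1 inst would stop before anything is deleted, because inst fails the length check.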
As @nathanagood mentioned, dce-cli version v0.4.0 supports deleting dce via locally cached terraform state file, binary, and backend configuration. Here’s how it’s done:
# change into the directory containing the terraform binary dce-cli used for deployment
cd ~/.dce/.cache/terraform/0.12.18
# initialize terraform using the cached main.tf
./terraform init ~/.dce/.cache/module
# run terraform destroy using the cached main.tf
./terraform destroy ~/.dce/.cache/module
We recommend deleting the ~/.dce configuration directory and starting over from dce init if you would like to redeploy dce after destroying it via this method.
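Until there is a cli command for this, the steps above could be wrapped in a small function along these lines. This is only a sketch: dce_destroy is a hypothetical name, and the paths (including the 0.12.18 directory) are taken from the commands above, so they may differ on your machine. It checks that the cached binary and module exist before running anything, and it leaves ~/.dce in place so you can decide when to remove it.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the manual steps above. Paths follow the
# commands in this thread and may differ for other dce-cli versions.

dce_destroy() {
  local dce_home="${1:-$HOME/.dce}"
  local tf_dir="$dce_home/.cache/terraform/0.12.18"
  local module_dir="$dce_home/.cache/module"

  # Refuse to run if the cached terraform binary is missing.
  if [ ! -x "$tf_dir/terraform" ]; then
    echo "no cached terraform binary at $tf_dir" >&2
    return 1
  fi
  # Refuse to run if the cached module (main.tf) is missing.
  if [ ! -d "$module_dir" ]; then
    echo "no cached module at $module_dir" >&2
    return 1
  fi

  # Run init, then destroy, from the binary's directory in a subshell
  # so the caller's working directory is unchanged.
  (
    cd "$tf_dir" || exit 1
    ./terraform init "$module_dir" && ./terraform destroy "$module_dir"
  )
}
```

Calling dce_destroy with no arguments uses ~/.dce; passing a different directory is mainly useful for trying the function out against a copy.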
We haven’t created a cli command to make this convenient yet. Our goal is to provide convenient mechanisms for deploying, upgrading, and deleting dce. This is an iteration towards that goal, and feedback such as yours helps tremendously in guiding our design. Please let us know if you need any more help.
@joshmarsh, @nathanagood
Thanks, the script for the older versions works well. It seems to have cleaned up everything, including a botched installation that I had partially cleaned up.
Version 0.4.0 works as advertised, now to try it for real. The application is for managing grad students' computations.
Your documentation is now slightly inconsistent with the new version. For example, the .dce.yaml file is no longer used and the deploy script sets the api gateway url automatically.
Thanks for the feedback @michaelpetersubc. Looks like we missed a few places when we updated the docs last. We'll get on that soon.
@joshmarsh, @nathanagood I have to step back one. It appears the default setting for aws-nuke is dry run, so when a lease ends the account is not cleared:
020/02/12 17:23:46 INFO: Nuke is set in Dry Run mode and will not remove any resources and cannot set back the state of the DCE child account Please set 'RESET_NUKE_DRY_RUN' to not 'true' to exit Dry Run mode.
which isn't as advertised (the docs say you can reset it to dry run using terraform). The error message gives a sort of sensible fix, but honestly I can't figure out how to implement the fix. I deployed with dce not terraform. Is there a way I can manually reset dry run without a redeploy?
Hi @michaelpetersubc -- just want to let you know that I'm looking into this. I should have some useful info for you later today.
Ok @michaelpetersubc, I think I can help you manually reconfigure your DCE deployment to enable aws-nuke to run in --no-dry-run mode.
account-reset-<namespace> project, where <namespace> is the namespace you used to deploy DCE (or a random ID)
RESET_NUKE_TOGGLE = false. Change this value to true.
Subsequent account reset jobs should run aws-nuke in --no-dry-run mode.
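If you would rather check the current value from a terminal before or after editing it, a read-only sketch with the AWS CLI might look like the following. The project name pattern and the variable name are copied from the steps above, and the function names are made up for illustration, so confirm both in your own account. It assumes the project is a CodeBuild project and does not change anything.

```shell
#!/usr/bin/env bash
# Sketch only: the project and variable names are copied from the steps
# above and should be confirmed in your own account.

# Build the reset project name for a given DCE namespace.
reset_project_name() {
  printf 'account-reset-%s\n' "$1"
}

# Print the current value of RESET_NUKE_TOGGLE for that project.
show_nuke_toggle() {
  local region="$1" ns="$2"
  aws codebuild batch-get-projects \
    --region "$region" \
    --names "$(reset_project_name "$ns")" \
    --query "projects[0].environment.environmentVariables[?name=='RESET_NUKE_TOGGLE'].value" \
    --output text
}
```

Running show_nuke_toggle with your region and namespace before and after the change is a quick way to confirm the edit took effect.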
Please let me know if you run into any problems with this, or if it doesn't work as expected.
For added context, DCE v0.24.0 introduced a change to enable aws-nuke to run in --no-dry-run mode by default. The latest version of dce-cli is still tied to DCE v0.23.0, so it does not include this change.
We are working on a new release of dce-cli, to upgrade to the latest version of DCE. We also have plans to support additional deployment options, to make it easier to configure these types of parameters.
@eschwartz Thanks for the very clear instruction, that worked, and I would never have guessed how to do that. The reset now works and removes everything created while the lease is being used.
However, the reset also removes the admin permission from the trusted role as it did before, so the child account is never returned to the ready state. This is the same problem that started all this - maybe that problem wasn't fixed initially - or at least I haven't managed to install the right version of the updated software. I believe that problem is an issue with dce not with dce-cli which is what I updated.
Hey @michaelpetersubc -- I apologize, I didn't realize this was still related to #231, but I see that now.
It looks like we got a fix out for #231 (PR #232), but my guess would be that dce-cli is still using a version of DCE that's older than that release.
Give me a minute to look into exactly what's happening, and I'll get back to you.
Seems like the most straightforward solution here is to get a new release of dce-cli out the door. We'll be working on that this week, and we'll keep you updated.
Hey @michaelpetersubc I'm actively working on this dce-cli upgrade. I'm also looking into making the dce version used by dce system deploy configurable, so this type of "patching" is easier in the future.
Just so we're not a blocker for you, I want to point out that we do have an alternate path for deploying DCE without using the dce cli: https://dce.readthedocs.io/en/latest/terraform.html
Hey @michaelpetersubc wondering if you're still waiting on this. I apologize that we've been spread a little thin lately, and haven't been as responsive as I'd like to be.
I am actually still working on getting the CLI upgrade out. We've had a couple of major blockers since we last talked that held this up.
But I might actually steer you in another direction -- take a look at deploying DCE with terraform directly. It will give you more control over your environment, which you'll likely need eventually anyways:
@eschwartz Thanks. You are a health company, so I did guess you might have higher priority things to do. I have figured out a partial substitute for dce using terraform. I also realized that your suggestion is likely right: do it with terraform. The documentation for terraform wasn't quite as straightforward as it was for the CLI, so I put it off (probably for the same kind of reasons you did) until I get more experience with terraform.
You probably appreciate the feedback anyway, so I'll get back to you with whatever problems I find. Could you leave this thread open so I can reference it later when I need it?
Is your feature request related to a problem? Please describe.
To clean up after a bug I need to recompile the dce software. It appears that to get things right I would have to redeploy (to change scripts running on aws).
As far as I can see, deploying again recreates everything. For example, the api gateway would change.
While I am testing I would just like to remove everything that was installed by the first deploy, then deploy it again with the new software. Even a list in the docs that describes what to remove manually would help. More generally a suggestion about how to go about repairing bugs or upgrading the software would be nice.
However, in production, all the users of the service would seem to have to adjust their settings (the gateway url for example) to cope with an upgrade. So a feature that updates the software without changing basic settings or duplicating scripts at aws would help.