DopplerHQ / terraform-provider-doppler

Apache License 2.0
23 stars 10 forks source link

BUG: Error creating AWS Parameter Store Secret sync #63

Open xPiSeaZx opened 10 months ago

xPiSeaZx commented 10 months ago

When trying to create an AWS Parameter Store Secrets Sync resource with multiple environments, the creation errors out with the following error message:

module.aws_ps_syncs["ap_southeast_2"].doppler_secrets_sync_aws_parameter_store.this[0]: Creating...
module.aws_ps_syncs["ap_southeast_2"].doppler_secrets_sync_aws_parameter_store.this[0]: Still creating... [10s elapsed]
╷
│ Error: Doppler Error: Unable to determine initial sync status, the operation might just be taking a long time. Please check the Doppler dashboard for more information.
│
│   with module.aws_ps_syncs["ap_southeast_2"].doppler_secrets_sync_aws_parameter_store.this[0],
│   on ../../../modules/secrets_syncs/aws_parameter_store/main.tf line 1, in resource "doppler_secrets_sync_aws_parameter_store" "this":
│    1: resource "doppler_secrets_sync_aws_parameter_store" "this" {
│

The sync gets created in Doppler and shows in sync, but the Terraform state file does not get updated so Terraform wants to create the resource again, which fails because the sync already exists:

module.aws_ps_syncs["ap_southeast_2"].doppler_secrets_sync_aws_parameter_store.this[0]: Creating...
╷
│ Error: Doppler Error: There is already an existing integration for this setup.
│
│   with module.aws_ps_syncs["ap_southeast_2"].doppler_secrets_sync_aws_parameter_store.this[0],
│   on ../../../modules/secrets_syncs/aws_parameter_store/main.tf line 1, in resource "doppler_secrets_sync_aws_parameter_store" "this":
│    1: resource "doppler_secrets_sync_aws_parameter_store" "this" {
│

Increasing the sync check time should resolve the issue or perhaps adding a timeout function (https://developer.hashicorp.com/terraform/language/resources/syntax#operation-timeouts).

nmanoogian commented 10 months ago

Thanks for reporting this! Do you know roughly how long you were waiting for that sync to be created?

xPiSeaZx commented 10 months ago

@nmanoogian Terraform apply errored out after about 10-12 seconds, the sync itself completed around 5 or so seconds later (16 or seconds in total for sync to complete).

nmanoogian commented 10 months ago

@xPiSeaZx OK, perfect -- thanks for sharing that. The sync creation endpoint waits ~15sec for the initial sync to finish before throwing that timeout error so it was probably pretty close, unfortunately. I'll chat with the team to figure out a good solution. It definitely doesn't make sense for the terraform apply to fail when the sync was indeed created successfully

sbrudz commented 6 months ago

@nmanoogian I'm hitting this error sometimes, too, when creating secrets syncs for Terraform Cloud. I've seen it take 30 seconds or more to complete a sync. It's a pain when it does happen because the manual fix is to delete the problem secret syncs via the UI and then keep retrying the terraform run until they sync fast enough to pass. A better manual fix would be to do an import, but that's not yet possible (per #64).

Has there been any progress towards a solution?

nmanoogian commented 6 months ago

Thanks for weighing in, @sbrudz. We're going to roll back the change which forces Terraform to wait for syncs to be created. The behavior was added to make sync creation more resilient but it really just swapped one issue out for another. The sync resilience problem is better solved on the backend. I'll keep this GH issue updated with progress.

sbrudz commented 6 months ago

That change works for me. I have downstream dependencies that rely on the sync being completed, but I can handle that by writing a local provisioner that polls the Doppler API until the sync is done.

Thanks for the update!

nmanoogian commented 6 months ago

That change works for me. I have downstream dependencies that rely on the sync being completed, but I can handle that by writing a local provisioner that polls the Doppler API until the sync is done.

Hmmm, that makes me think that the await behavior is desirable or should be configurable? Today it's implemented by blocking the actual creation request but I think it would make more sense to poll GET /v3/configs/config/syncs/sync?project=:slug&config=:slug&sync=:slug as you are describing

sbrudz commented 6 months ago

the await behavior is desirable or should be configurable

It is definitely desirable behavior. Supporting a timeout block as @xPiSeaZx suggested would be a good option, as long as the resource handles a timeout gracefully and the TF state is kept in sync with Doppler.

Another idea would be to add a separate resource that would poll the API and return when the initial sync has completed. I'm envisioning something similar to tfe_workspace_run but much simpler. That would decouple sync creation from the initial sync process and minimize the need to change existing code (adding support for setting await_initial_sync would probably be enough).

nmanoogian commented 6 months ago

Gotcha, that makes sense. Let's start with the timeout block. It can poll under the hood and when it expires, it'll post a warning and continue with graceful creation 👍

sbrudz commented 6 months ago

Sounds good -- thank you!

nmanoogian commented 6 months ago

Whoops! We had this issue linked with the ticket to revert the await_initial_sync change and that just shipped. Reopening this