digitalocean / terraform-provider-digitalocean

Terraform DigitalOcean provider
https://registry.terraform.io/providers/digitalocean/digitalocean/latest/docs
Mozilla Public License 2.0
502 stars 268 forks source link

digitalocean_database_connection_pool for_each 422 error #925

Open SkeLLLa opened 1 year ago

SkeLLLa commented 1 year ago

Bug Report

Describe the bug

While creating DB pools DO API returns error, but the actual resource is created. In current example 2 DB pools should be created.

Terraform issues 2 requests with different POST data to create 2 env polls pear each value of var.environemts, e.g. [test, stage]. And one of requests is completed with 201 code. Other fails with 422.

422 (request "35ea9e11-9563-4abc-bfc8-882b88bf4923") pool name is not available

While pool is actually created successfully with given name.

Affected Resource(s)

Expected Behavior

Pool is created and no error thrown

Actual Behavior

Pool is created, but 422 error thrown which leads to 422 exit code.

Steps to Reproduce

Terraform Configuration Files

resource "digitalocean_database_connection_pool" "foo_db_pool" {
  for_each   = var.environments
  name       = "foo_${each.value}"
  cluster_id = var.cluster_id
  mode       = var.pool_type
  size       = var.pool_size
  depends_on = [
    digitalocean_database_db.foo_db,
    digitalocean_database_user.foo_db_user
  ]

  db_name = digitalocean_database_db.foo_db[each.value].name
  user = digitalocean_database_user.foo_db_user[each.value].name
}

Terraform version 1.2.X 1.3.X

danaelhe commented 1 year ago

Hi there,

Thank you so much for this write up and including the request id. We were able to look up the logs via the request id and found that this seems to be a standard name conflict. It seems the plan is failing because one of the two pool names specified already existed. This surfaces that the error message could have been more clear and possibly a more appropriate response code (more like a 409) could have been returned. We'll discuss as a team how we can surface a clearer error up to the user in this kind of situation.

Thank you so much, Dana

SkeLLLa commented 1 year ago

@danaelhe The problem is that those pool names are unique. And if they exist in state it shouldn't produce any error. If they don't then they should be created successfully.

It more looks like that digitalocean API return inconsistent responses and therefore provider thinks that some resources don't exist or should be recreated. Since in terraform debug logs I often find Provider produced inconsistent result after apply on digitalocean provider resources. Also it may happen because of recreate of a resource.

Maybe it's connected with #493

So I think best option here is to create a proper idempotent API and use PUT instead of POST in provider. Or it should do proper check if given pool exists (by name).

SkeLLLa commented 1 year ago

I've also found one more bug with this. Sometimes DO send 503 error code for example, while resource is created. And on next try it fails with 422 error, since it was created despite error.

So the provider should be fixed to handle such states.

andrewsomething commented 1 year ago

@SkeLLLa Can you show any additional info on how you've defined the environments? I've been unsuccessful at reproducing it using:

variable "environments" {
  type    = set(string)
  default = ["test", "stage"]
}

resource "digitalocean_database_cluster" "pg" {
  name       = "cluster-pg-test"
  engine     = "pg"
  version    = "15"
  size       = "db-s-1vcpu-2gb"
  region     = "nyc3"
  node_count = 1
}

resource "digitalocean_database_connection_pool" "foo_db_pool" {
  for_each   = var.environments
  name       = "pool_${each.value}"
  cluster_id = digitalocean_database_cluster.pg.id
  mode       = "transaction"
  size       = 5
  db_name    = "defaultdb"
  user       = "doadmin"
}

Are the number of pools and their names correct in the plan?

SkeLLLa commented 1 year ago

@andrewsomething yes, that's correct. But it not happen every time, just as bug described in #493 or, for example, when 5XX error appear. But for me it never appeared for resources that don't use for_each template for some reason. I'd say once in 50 times.

And the issue it that provider couldn't properly handle such errors or undefined state after 5XX error. So, for example, when I've got the issue last time it was like that:

  1. Provider's request to create a DB https://api.digitalocean.com/v2/databases/***/pools failed with 503 error.
  2. While in DO the pool was actually created.
  3. In the terraform state the resource was marked as not created because of 503 error.
  4. On next retry creation of the pool failed, since the name was already taken.

So it would be nice if in provider we will have an option like skipIfExsits and extra check with get request or have pool creation request idempotent, so it will not fail on retry.