auth0 / terraform-provider-auth0

The Auth0 Terraform Provider is the official plugin for managing Auth0 tenant configuration through the Terraform tool.
https://registry.terraform.io/providers/auth0/auth0/latest/docs
Mozilla Public License 2.0

Applying auth0_user and auth0_organization_member produced inconsistent results after apply #1007

Open nbrown-cmx opened 2 months ago

nbrown-cmx commented 2 months ago

Description

When creating an auth0_user and related auth0_organization_member in the same apply step, we always get the following error.

module.users.auth0_organization_member.main: Creating...
╷
│ Error: Provider produced inconsistent result after apply
│
│ When applying changes to module.users.auth0_organization_member.main, 
│  provider "provider[\"registry.terraform.io/auth0/auth0\"]" produced an 
│  unexpected new value: Root object was present, but now absent.
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵

If we retry the apply after the error, the operation succeeds.

Also, if we run apply with just the user and then run apply for the organization member, both operations succeed.

Expectation

That the auth0_user and related auth0_organization_member would be created successfully on the first apply.

Reproduction

Given an Auth0 tenant with an M2M application used by the terraform client, the default Auth0 database, and an existing organization, run apply on the following code:

terraform {
  required_providers {
    auth0 = {
      source  = "auth0/auth0"
      version = "~> 1.4.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "3.6.2"
    }
  }
  required_version = ">= 1.9.5"
}

provider "auth0" {
  domain        = var.auth0_client_credentials["domain"]
  client_id     = var.auth0_client_credentials["client_id"]
  client_secret = var.auth0_client_credentials["client_secret"]
}

variable "auth0_client_credentials" {
  type = object({
    domain = string
    client_id = string
    client_secret = string
  })
  sensitive = true
}

variable "email" {
  type = string
}
variable "username" {
  type = string
}
variable "name" {
  type = string
}

variable "auth0_organization_id" {
  type = string
}
variable "auth0_connection_name" {
  type    = string
  default = "Username-Password-Authentication"
}

resource "random_password" "initial_user_password" {
  length      = 16
  min_upper   = 1
  min_lower   = 1
  min_numeric = 1
  min_special = 1
}

resource "auth0_user" "main" {
  username       = var.username
  name           = var.name
  email          = var.email
  email_verified = true

  connection_name = var.auth0_connection_name
  password        = random_password.initial_user_password.result

  lifecycle {
    ignore_changes = [
      password
    ]
  }
}

resource "auth0_organization_member" "main" {
  organization_id = var.auth0_organization_id
  user_id         = auth0_user.main.user_id
}

Auth0 Terraform Provider version

1.4.0

Terraform version

1.9.5

duedares-rvj commented 1 month ago

Hello, I believe we should add a depends_on block here.

resource "auth0_organization_member" "main" {
  depends_on      = [auth0_user.main]
  organization_id = var.auth0_organization_id
  user_id         = auth0_user.main.user_id
}

This ensures the auth0_user resource is created first, before Terraform moves on to creating auth0_organization_member, and avoids a deadlock.

Let us know if that works out. Thanks

nbrown-cmx commented 1 month ago

Thank you for the suggestion. We've tried doing that but it does not fix the bug. Based on my own understanding, I believe this makes sense; the addition of depends_on isn't necessary since the dependency on auth0_user.main is already explicit in the assignment of auth0_organization_member.main.user_id. In both cases, the terraform apply logs show the user is fully created before the organization_member creation is started.

nbrown-cmx commented 1 month ago

My continued research into a resolution has ultimately boiled down to this comment. In a nutshell, the likely problem is either eventual consistency, or an API response within the provider during the user apply that is not making this possible.

My suspicion is currently on what's returned by the user creation (or a related GET after the user creation). It may not be returning the resource or an id that is necessary to determine how to build the organization member.

cdsre commented 1 month ago

I can confirm that I can reproduce this consistently. I am not sure how the Auth0 API is implemented in the backend, but it feels like it may be asynchronous: creating the user returns a user ID, but the user is not actually fully provisioned. I have added some debug and error logs to the OrganizationMember code in my local copy of the provider.

When I run the apply I can see the organization member is created and the state ID is set using the org_id and user_id. After creating a resource it's always good practice to read it back, which most providers do, and auth0 is no different. However, when the read call happens it fetches all the members of the organization and then loops through them to check whether the user_id is one of them.

However, after looping through the list, the user_id is not found. When readOrganizationMember has looped over all members of the org without finding the user, it sets the state ID to an empty string.


2024-09-14T16:01:09.140+0100 [INFO]  Starting apply for auth0_organization_member.main
2024-09-14T16:01:09.140+0100 [DEBUG] auth0_organization_member.main: applying the planned Create change
2024-09-14T16:01:09.258+0100 [DEBUG] provider.terraform-provider-auth0.exe: Created organization member: @caller=C:/Projects/GoLand/terraform-provider-auth0/internal/auth0/organization/resource_member.go:51 tf_req_id=ca82ed12-da4f-2a13-dd46-014cd7882ceb tf_resource_type=auth0_organization_member tf_rpc=ApplyResourceChange #ID#=org_VIKaWMhqq4ltJJQQ::auth0|66e5a53443c8b7c3fabf1ff4 @module=provider tf_provider_addr=provider timestamp="2024-09-14T16:01:09.258+0100"
2024-09-14T16:01:09.491+0100 [ERROR] provider.terraform-provider-auth0.exe: Member not found so ID will be set to empty string: #ID#=org_VIKaWMhqq4ltJJQQ::auth0|66e5a53443c8b7c3fabf1ff4 @module=provider tf_req_id=ca82ed12-da4f-2a13-dd46-014cd7882ceb tf_rpc=ApplyResourceChange @caller=C:/Projects/GoLand/terraform-provider-auth0/internal/auth0/organization/resource_member.go:77 tf_provider_addr=provider tf_resource_type=auth0_organization_member timestamp="2024-09-14T16:01:09.490+0100"
2024-09-14T16:01:09.492+0100 [DEBUG] State storage *statemgr.Filesystem declined to persist a state snapshot
2024-09-14T16:01:09.492+0100 [ERROR] vertex "auth0_organization_member.main" error: Provider produced inconsistent result after apply

Setting a ResourceData ID to an empty string effectively removes the resource from the state. Therefore the create adds an ID to the ResourceData and then calls the read, which sets the ID to an empty string and removes it again. This is why you get the error that the provider produced an inconsistent result and that the object was present but is now absent.

When you re-run the apply it will trigger a create again, which I believe is idempotent on the Auth0 side, so it will look like it added the user to the organization even though the user was already added by the last apply. When the create then calls the read, the read will fetch all the organization members again and this time find the user, since it was actually added in the previous apply.
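
The sequence described above can be reproduced without the real provider. The following is a minimal simulation, not the provider's code: fakeOrgAPI, resourceState, readMember and applyCreate are hypothetical names, and the assumption that a new member becomes visible only after one list call is invented purely to model the suspected eventual consistency.

```go
package main

import "fmt"

// fakeOrgAPI is a hypothetical stand-in for the Auth0 Management API.
// Assumption: a newly added member only appears in list results after
// `lag` list calls have been made, modelling eventual consistency.
type fakeOrgAPI struct {
	members map[string]int // user ID -> list calls remaining before visible
	lag     int
}

// AddMember is idempotent: adding an existing member changes nothing.
func (a *fakeOrgAPI) AddMember(userID string) {
	if _, ok := a.members[userID]; !ok {
		a.members[userID] = a.lag
	}
}

// ListMembers returns only the members that have become visible.
func (a *fakeOrgAPI) ListMembers() []string {
	var visible []string
	for id, remaining := range a.members {
		if remaining > 0 {
			a.members[id] = remaining - 1
			continue
		}
		visible = append(visible, id)
	}
	return visible
}

// resourceState mimics the only part of schema.ResourceData that matters here: the ID.
type resourceState struct{ id string }

// readMember mirrors the read described above: scan the list, clear the ID on a miss.
func readMember(api *fakeOrgAPI, st *resourceState, userID string) {
	for _, id := range api.ListMembers() {
		if id == userID {
			return
		}
	}
	st.id = "" // an empty ID tells Terraform the object no longer exists
}

// applyCreate mirrors create followed by read and reports what Terraform would see.
func applyCreate(api *fakeOrgAPI, orgID, userID string) error {
	st := &resourceState{}
	api.AddMember(userID)
	st.id = orgID + "::" + userID // composite ID, as seen in the log above
	readMember(api, st, userID)
	if st.id == "" {
		return fmt.Errorf("root object was present, but now absent")
	}
	return nil
}

func main() {
	api := &fakeOrgAPI{members: map[string]int{}, lag: 1}
	fmt.Println("first apply:", applyCreate(api, "org_example", "auth0|example"))
	fmt.Println("second apply:", applyCreate(api, "org_example", "auth0|example"))
}
```

In this model the first call to applyCreate returns an error because the read runs before the member is visible, and the second call returns nil because the idempotent add is a no-op and the member now appears in the list.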

I have noticed this a few times in several places in the auth0 provider. I would suggest the Auth0 team add some logic to the read, such as a sleep or a retry loop when the resource is new, since there is a delay between the create completing and the user appearing in the organization's member list.

func readOrganizationMember(ctx context.Context, data *schema.ResourceData, meta interface{}) diag.Diagnostics {
    api := meta.(*config.Config).GetAPI()

    organizationID := data.Get("organization_id").(string)

    // If this is a new resource, sleep for 1 second to give it time to be linked to the org.
    if data.IsNewResource() {
        time.Sleep(1 * time.Second)
    }
    members, err := fetchAllOrganizationMembers(ctx, api, organizationID)
    if err != nil {
        return diag.FromErr(internalError.HandleAPIError(data, err))
    }
...
...

cdsre commented 1 month ago

As a mitigation for the OP, you can use the time_sleep resource from Terraform's time provider to implement the same sleep, but under your own control in Terraform.

resource "auth0_user" "user" {
    connection_name = "local-dev-test-org-internal-users"
    email           = "test@test.com"
    email_verified  = true
    password        = "passpass$12$12"
}

resource "time_sleep" "async_wait" {
    create_duration = "3s"
    depends_on = [auth0_user.user]
}
resource "auth0_organization_member" "main" {
    organization_id = var.auth0_organization_id
    user_id         = auth0_user.user.user_id
    depends_on = [time_sleep.async_wait]
}

This adds a delay between the two resources, giving the user resource time to be fully created in the Auth0 backend.


Plan: 3 to add, 0 to change, 0 to destroy.
auth0_user.user: Creating...
auth0_user.user: Creation complete after 1s [id=auth0|66e5acd8780b663974a14b36]
time_sleep.async_wait: Creating...
time_sleep.async_wait: Creation complete after 3s [id=2024-09-14T15:33:48Z]
auth0_organization_member.main: Creating...
auth0_organization_member.main: Creation complete after 0s [id=org_VIKXXXXXXXXXXQQ::auth0|66e5XXXXXXXXX4a14b36]

That should at least unblock you until the team can fix the provider.

nbrown-cmx commented 1 month ago

I'm confused about how this helps. Your log from running it seems to imply it does, but I'm hoping someone can clarify why. My confusion is about where the async wait takes place. If the issue is eventual consistency between creating the organization_member and being able to retrieve it from the organization's list of members, how would waiting after the user creation but before the organization_member creation help? Wouldn't you need to wait between the organization_member creation and the post-creation fetch of that same organization member?