databricks / terraform-provider-databricks

Databricks Terraform Provider
https://registry.terraform.io/providers/databricks/databricks/latest

[ISSUE] Throttling Error while Importing resources #2127

Open rishabhtrivedi23 opened 1 year ago

rishabhtrivedi23 commented 1 year ago

Configuration

# Copy-paste your Terraform configuration here
main.tf:
module "adgroupmember" {
  for_each        = var.acgroups
  source          = "../../../modules/databricks/unity/accessmanagement/groupmembership"
  acgroup         = each.key
  adobjectid      = var.adobjectid
  acgroups_member = each.value.ad_group_member
  member_type     = "ad"
  depends_on = [
    module.creategroups
  ]
}

module.tf : 
terraform {
  required_providers {
    databricks = {
      source  = "databricks/databricks"
      version = "1.13.0"
    }
  }
}
resource "databricks_group_member" "ad_group_member" {
  for_each  = { for i in toset(var.acgroups_member) : i => i if var.member_type == "ad" }
  group_id  = data.databricks_group.acgroup.id
  member_id = lookup(var.adobjectid, each.value)
}

datasource.tf (in module)
data "databricks_group" "acgroup" {
  display_name = var.acgroup
}

Expected Behavior

Operations should be running fine.

Actual Behavior

Error: inner token: token error: {"error":"invalid_request","error_description":"Temporarily throttled, too many requests"}
│
│   with module.adgroupmember["GROUP_NAME"].data.databricks_group.acgroup,
│   on ......\modules\databricks\unity\accessmanagement\groupmembership\datasources.tf line 1, in data "databricks_group" "acgroup":
│    1: data "databricks_group" "acgroup" {
│
╵

Steps to Reproduce

  1. Terraform import an existing group from account console

Terraform and provider versions

terraform {
  required_providers {
    databricks = {
      source  = "databricks/databricks"
      version = "1.13.0"
    }
  }
}

provider "databricks" {
  azure_workspace_resource_id = var.workspace_id
  azure_use_msi               = true
  host                        = var.databricks_host
  account_id                  = var.account_id
  http_timeout_seconds        = 300
  rate_limit                  = 500 # (default is 15)
  debug_headers               = true
}

Debug Output

Important Factoids

TakeshiMatsukura commented 1 year ago

Do you consistently see the issue? Does terraform apply -parallelism=1 mitigate it?

rishabhtrivedi23 commented 1 year ago

@TakeshiMatsukura that option applies to terraform apply; I am trying to run terraform import for existing groups created in the account console.

rishabhtrivedi23 commented 1 year ago

I was able to fix this by using rate_limit=5 in the provider file. However, it will slow down the process.
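For reference, a minimal sketch of what that change might look like, assuming the same provider block and variables posted in the issue description, with only rate_limit changed. The value 5 is the one reported to work here, not a documented recommendation:

```hcl
provider "databricks" {
  azure_workspace_resource_id = var.workspace_id
  azure_use_msi               = true
  host                        = var.databricks_host
  account_id                  = var.account_id
  http_timeout_seconds        = 300

  # Cap the provider's request rate well below the default of 15.
  # Lower values trade import/apply speed for fewer throttling errors.
  rate_limit = 5
}
```

Because the cap is set on the provider, it slows every operation that uses this provider configuration, not just the import.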

TakeshiMatsukura commented 1 year ago

Oh, okay. What happens when you revert "rate_limit"? Since every REST API call is subject to a rate limit, I don't think 500 works. https://docs.databricks.com/dev-tools/api/index.html#rate-limits

TakeshiMatsukura commented 1 year ago

If no one else is using the Accounts SCIM REST API on your account in parallel, it can be increased to 25. But 500 will not work.

rishabhtrivedi23 commented 1 year ago

Initially, I didn't set any rate_limit, so the default of 15 applied, but it was failing intermittently even with the default value. Reducing it to 5 at least does not give any throttling errors.

TakeshiMatsukura commented 1 year ago

Since the Accounts SCIM REST API can be accessed by other users and some UI operations, that could be the cause. Unfortunately, Databricks cannot increase the limit for a specific account.

rishabhtrivedi23 commented 1 year ago

I understand that, but Databricks could limit the number of calls being made to get the user/group information, similar to how the sssd service works in a Linux environment to get user/group info from Active Directory.

plamb commented 1 year ago

We are seeing the same throttling issue with the 1.13.0 provider. After reverting to 1.11.0, we do not get the throttling error.

rishabhtrivedi23 commented 1 year ago

I tried 1.11.0 and got the same issue, so the older version doesn't work for me either.