Open pranavnateri opened 1 year ago
Please suggest a workaround at least, if there is no solution as of yet.
Regards, Pranav
Thanks for opening this @pranavnateri. Refreshing the state can take some time if you have many helm_release resources. Have you tried using resource targeting (terraform apply -target ...) to target only the helm release you are actually changing?
Hi @jrhouston, that's exactly what I mentioned in the steps to reproduce section. I am already doing a targeted apply, which I do not want to do. Also, the issue is the same with or without -refresh=false when applying. Increasing the timeout value of the helm provider resource to 600 does not help either.
Taking this long when there are multiple helm releases is not workable, because the apply times out after a while, which is a bug.
Please suggest any alternatives, at least until this is fixed. (PS: I am already doing a targeted apply because of this issue.)
Regards, Pranav
Hi @jrhouston, can you please give an update on this? It times out even with only 10 helm releases :(
Why is this an issue? Was this not tested?
Regards, Pranav
@jrhouston ??
@jrhouston @pranavnateri are there any updates on this? I am hitting the same issue.
Yes, I still have the issue and no one is replying. There is no proper support. @oscardalmau-r3
I also noticed that this changed at some version; it used to be fast. I can also see that for the first 30-40 seconds it does not even create the namespace. Hard to say what is causing it: it is not installing any CRDs, and the Kubernetes cluster is on the local network (AWS EKS). Currently using version 2.10.1.
I did some testing: Kubernetes version 1.26, Terraform version v1.5.2.
How I called the resource:
resource "helm_release" "<redacted>" {
name = <redacted>
chart = "localpathtochart"
namespace = <redacted>
create_namespace = true
values = [
templatefile("${path.module}/values.yaml", {
<redacted>
})
]
}
The helm_release uses a helm chart that is in the local filesystem (no chart registry download). Also the package does not have any CRD installation. The chart does have chart dependencies.
The times described below for resources appearing are taken from the resource creation messages Terraform prints. For example:
module.test.helm_release.<redacted>: Still creating... [20s elapsed]
So it does not include any TF planning or provider download/init time.
The time shown as total is the total time for terraform (init, apply, destroy...). It included a few AWS resources (which took 1-2 seconds) and the time needed to download the providers.
Provider 2.1.2: ~20s for namespace+pods to appear (total 153s apply+destroy)
Provider 2.2.0: ~20s for namespace+pods to appear (total 159s apply+destroy)
Provider 2.4.1: ~20s for namespace+pods to appear (total 163s apply+destroy)
Provider 2.5.0: ~2 MINUTES for namespace+pods to appear (total 358s apply+destroy, several attempts)
Provider 2.5.1: ~2 MINUTES for namespace+pods to appear (total 388s apply+destroy, several attempts)
Provider 2.6.0: ~2 MINUTES for namespace+pods to appear (total 343s apply+destroy, several attempts)
For the following versions I did not check when resources started to appear, only the total apply+destroy time:
Provider 2.7.1: total 341s apply+destroy
Provider 2.8.0: total 339s apply+destroy
Provider 2.10.1: total 343s apply+destroy
Provider 2.11.0: total 341s apply+destroy
Since version 2.5.0 it went from ~20 seconds to ~2 minutes before any resources appear in Kubernetes. The issue also affects destroy: since 2.5.0 it does nothing for a couple of minutes.
As a workaround I am now using 2.4.1. If you have already upgraded to a higher version and can't delete the resource, I am not sure how to downgrade.
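Pinning the provider to that version looks roughly like this (a minimal sketch, assuming the standard hashicorp/helm source address):

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "2.4.1" # last version before the slowdown observed above
    }
  }
}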
I hope finding the version where it broke helps to locate the issue. For sure there is some sort of problem with this.
Massive +1 for this issue, with more background on why reverting to 2.4.1 is painful or impossible: the Helm client bundled there is ancient and makes modern charts incompatible (e.g. Traefik has required Helm > 3.9.0 since November 2022), yet such an old Helm also does not seem to work with modern Kubernetes versions for some reason. I use just a couple of charts, but my cluster is very remote, so the slowness makes it time out very often.
I'd be happy to help with the issue, but the diff between 2.4.1 and 2.5.0 is so large that I have no idea where to start. If needed, I can try to tweak my deployment to get rid of everything that requires a newer Helm and verify that the problem lies with this particular version.
For the moment, I can confirm that disable_openapi_validation = false does not solve the issue, as I had thought it might after debugging another issue (ref: https://github.com/hashicorp/terraform-provider-helm/issues/513).
BTW, the Traefik Helm chart v24.0.0 seems to be a good test candidate, as it creates a large number of CRDs.
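A rough sketch of such a test case (the repository URL and namespace below are my assumptions, not taken from this thread):

resource "helm_release" "traefik" {
  name             = "traefik"
  repository       = "https://traefik.github.io/charts" # assumed official Traefik chart repository
  chart            = "traefik"
  version          = "24.0.0"
  namespace        = "traefik"
  create_namespace = true
}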
+1
@pranavnateri have you seen/tried the option exposed in this PR? It specifically mentions slowness due to excessive CRDs from Crossplane, but may be worth trying even if it's not your specific issue. I'll be trying this option out as well.
Thank you! This solved it for us. We do have Crossplane and many other CRDs. Tested with provider 2.11.0.
provider "helm" {
burst_limit = 300
kubernetes {
...
}
}
It worked for me too! I set burst_limit to 900 for a remote cluster, with Traefik as the major user of CRDs.
The default value of 100, which comes from Helm itself, seems comically low for any real deployment to me. However, it looks like the Terraform provider cannot interpret valid throttling messages from the Helm library and crashes instead of providing useful information (or attempting to retry the operation).
+1 for this issue, I'm getting timeouts with only 2 helm releases being created for the first time.
│ Error: unable to build kubernetes objects from release manifest: error validating "": error validating data: unexpected error when reading response body. Please retry. Original error: net/http: request canceled (Client.Timeout or context cancellation while reading body)
+1, in my case the behaviour seems quite random: sometimes it takes just a few seconds, other times minutes.
This seemed to be caused by a large number of CRDs on the server.
I notice that kubectl get --raw /openapi/v3 takes over 30 seconds to respond (and this behaviour is inconsistent), which causes helm_release (which seems to be configured with a static 30-second client timeout) to fail with an error like:
╷
│ Error: unable to build kubernetes object for pre-delete hook kyverno/templates/hooks/pre-delete.yaml: error validating "": error validating data: the server was unable to return a response in the time allotted, but may still be processing the request
│
│
╵
We are running into this issue in vcluster, so I am linking it to this issue: https://github.com/loft-sh/vcluster/issues/1588
A configurable timeout parameter would be nice to address cases where the /openapi/v3 endpoint takes more than 30 seconds to respond: https://github.com/hashicorp/terraform-provider-helm/issues/463
Edit: unfortunately, this does not seem to be possible with the underlying Helm library: https://github.com/helm/helm/issues/9805
My current workaround for this issue is to use terragrunt as a wrapper and rely on its auto-retry feature: https://terragrunt.gruntwork.io/docs/features/auto-retry/. Usually on the second pass the /openapi/v3 endpoint responds quickly.
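For reference, a minimal sketch of that retry configuration in terragrunt.hcl (the error patterns are only examples matching the messages quoted above):

retry_max_attempts       = 3
retry_sleep_interval_sec = 30

retryable_errors = [
  "(?s).*unable to build kubernetes objects from release manifest.*",
  "(?s).*the server was unable to return a response in the time allotted.*",
]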
This is becoming more of a major issue even with fewer than 80 CRDs in the cluster, and setting burst_limit doesn't help here. Any workarounds?
Terraform Configuration Files
See below for an example; I have posted only a few resources, but put around 40 helm releases in and try it. With a single variable change in tfvars to apply, the provider just takes its own time to refresh, eventually timing out.
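(The original configuration was not included here; the following is only an illustrative sketch of the shape described above. The variable, chart path, and namespace are hypothetical.)

variable "app_image_tag" {
  type = string
}

# Roughly 40 similar releases; a single change to var.app_image_tag in tfvars
# causes every helm_release to be refreshed on the next apply.
resource "helm_release" "app" {
  count     = 40
  name      = "app-${count.index}"
  chart     = "./charts/app"
  namespace = "apps"

  set {
    name  = "image.tag"
    value = var.app_image_tag
  }
}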