hashicorp / terraform-provider-kubernetes-alpha

A Terraform provider for Kubernetes that uses dynamic resource types and server-side apply. Supports all Kubernetes resources.
https://registry.terraform.io/providers/hashicorp/kubernetes-alpha/latest
Mozilla Public License 2.0

Error: rpc error: code = Unknown desc = failed to validate provider configuration #133

Open red8888 opened 3 years ago

red8888 commented 3 years ago

Terraform v0.12.12

Similar to this: https://github.com/hashicorp/terraform-provider-kubernetes-alpha/issues/124

And it seems like https://github.com/hashicorp/terraform-provider-kubernetes-alpha/pull/65 was supposed to fix this issue. Was that released in version 0.2.1? It was merged a few months ago.

I want to create a GKE cluster and apply a manifest in one go (and not have to do this in multiple steps/Terraform states):

provider "kubernetes-alpha" {
  server_side_planning = true
  token = data.google_client_config.provider.access_token
  host  = "https://${google_container_cluster.mycluster.endpoint}"
  cluster_ca_certificate = base64decode(
    google_container_cluster.mycluster.master_auth[0].cluster_ca_certificate,
  )
}

resource "google_container_cluster" "cluster" {
... create the cluster

resource "kubernetes_manifest" "test-patch-configmap" {
.... apply a manifest
Even adding a depends_on here doesn't fix it: `depends_on = [google_container_cluster.mycluster]`

I get this error:
Error: rpc error: code = Unavailable desc = transport is closing

If I create the cluster first and only then add the kubernetes-alpha provider config and the kubernetes_manifest resource, it works.
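
For completeness, the truncated kubernetes_manifest resource above looks roughly like this. The ConfigMap body is just a placeholder (the real manifest doesn't matter for the error); only the provider wiring and the depends_on line are the relevant parts:

resource "kubernetes_manifest" "test-patch-configmap" {
  provider = kubernetes-alpha

  # Placeholder manifest; any resource type triggers the same failure.
  manifest = {
    apiVersion = "v1"
    kind       = "ConfigMap"
    metadata = {
      name      = "test-patch-configmap"
      namespace = "default"
    }
    data = {
      example = "value"
    }
  }

  # Explicitly depending on the cluster does not help.
  depends_on = [google_container_cluster.mycluster]
}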

benfdking commented 3 years ago

Getting the same error with the following config. Weirdly, the kubernetes provider works a treat.

data "google_client_config" "current" {}

provider "kubernetes" {
  load_config_file       = false
  host                   = module.google.cluster_endpoint
  cluster_ca_certificate = base64decode(module.google.cluster_ca_certificate)
  token                  = data.google_client_config.current.access_token
}

provider "kubernetes-alpha" {
  server_side_planning = true
  host                   = module.google.cluster_endpoint
  cluster_ca_certificate = base64decode(module.google.cluster_ca_certificate)
  token                  = data.google_client_config.current.access_token
}

Edit:

This is also happening for me on Terraform v0.13.5.

red8888 commented 3 years ago

Yes, the regular kubernetes provider and even the helm provider work correctly, so this is an issue specific to kubernetes-alpha.

I saw a bunch of semi-related issues, but I don't think any are for this specifically. Does that PR fix it? It was merged months ago, but I can't find whether it was included in the latest release.

I would consider this a pretty significant bug. You're going to want to provision a cluster and apply some foundational config in the same state or module.

jondunning commented 3 years ago

I'm getting the same issue with Terraform 0.13.5.

jmservianri commented 3 years ago

I'm getting the same issue too with Terraform 0.13.5.

unacceptable commented 3 years ago

I think that a more concise problem statement would be that host is not allowed to be null in the provider block.

This prevents us from using this provider inside modules where we stand up a kubernetes cluster. This is an issue that the hashicorp/kubernetes provider does not have.

I am looking into a hacky HCL fix now or a proper Go fix, but I'm still a Go noob.

bryankaraffa commented 3 years ago

If I create the cluster first and then add the kubernetes-alpha provider config and kubernetes_manifest resource it works

I can confirm I get the same behavior. We avoid this with the official kubernetes provider by setting provider.kubernetes.load_config_file to false. When attempting to do the same with the current hashicorp/kubernetes-alpha provider, without the cluster / kubeconfig available, this is the error:

provider "kubernetes-alpha" {
  load_config_file = false
  host             = module.cluster.host
  username         = yamldecode(module.cluster.kubeconfig).users[0].user.username
  password         = yamldecode(module.cluster.kubeconfig).users[0].user.password
  cluster_ca_certificate = base64decode(
    yamldecode(module.cluster.kubeconfig).clusters[0].cluster.certificate-authority-data
  )
}
Error: Unsupported argument

  on providers.tf line 28, in provider "kubernetes-alpha":
  28:   load_config_file = false

The kubernetes and helm providers both support the load_config_file provider configuration:

provider "kubernetes-alpha" {
  load_config_file = false
  host             = module.cluster.host
  username         = yamldecode(module.cluster.kubeconfig).users[0].user.username
  password         = yamldecode(module.cluster.kubeconfig).users[0].user.password
  cluster_ca_certificate = base64decode(
    yamldecode(module.cluster.kubeconfig).clusters[0].cluster.certificate-authority-data
  )
}

provider "kubernetes" {
  load_config_file = false
  host             = module.cluster.host
  username         = yamldecode(module.cluster.kubeconfig).users[0].user.username
  password         = yamldecode(module.cluster.kubeconfig).users[0].user.password
  cluster_ca_certificate = base64decode(
    yamldecode(module.cluster.kubeconfig).clusters[0].cluster.certificate-authority-data
  )
}

provider "helm" {
  kubernetes {
    load_config_file = false
    host             = module.cluster.host
    username         = yamldecode(module.cluster.kubeconfig).users[0].user.username
    password         = yamldecode(module.cluster.kubeconfig).users[0].user.password
    cluster_ca_certificate = base64decode(
      yamldecode(module.cluster.kubeconfig).clusters[0].cluster.certificate-authority-data
    )
  }
}

hadim commented 3 years ago

I have the same error. I think supporting load_config_file could help.

tiarebalbi commented 3 years ago

+1, having the same issue.

I added a step to create the kubeconfig locally to test, and even with the configuration file defined I'm seeing the same error:

Error: Error: rpc error: code = Unknown desc = failed to validate provider configuration

resource "local_file" "kube-config" {
  content = templatefile("${path.module}/template/kubeconfig.tmpl", {
    host = scaleway_k8s_cluster_beta.k8s_cluster_staging.kubeconfig[0].host
    token = scaleway_k8s_cluster_beta.k8s_cluster_staging.kubeconfig[0].token
    cluster_ca_certificate = scaleway_k8s_cluster_beta.k8s_cluster_staging.kubeconfig[0].cluster_ca_certificate
  })
  filename = ".kube/config"
  depends_on = [scaleway_k8s_pool_beta.k8s_pool_staging]
}

tiarebalbi commented 3 years ago

Update:

I found a workaround for now, which I believe confirms what the problem is.

Steps:

  1. Hide all configurations that use the kubernetes-alpha provider.
  2. Run terraform apply and wait for it to complete.
  3. Unhide all configurations and run terraform apply again.

I believe the issue shows up when the cluster hasn't been created yet and you try to define the host and the other connection properties.
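
One way to avoid literally hiding and unhiding files is to gate the kubernetes-alpha resources behind a variable and flip it between the two applies. This is only a sketch (the variable and resource names are made up), and the provider block itself may still need valid connection values on the first run:

variable "cluster_ready" {
  description = "Set to true on the second apply, once the cluster exists"
  type        = bool
  default     = false
}

resource "kubernetes_manifest" "example" {
  # No instances on the first apply, created on the second.
  count    = var.cluster_ready ? 1 : 0
  provider = kubernetes-alpha

  manifest = {
    apiVersion = "v1"
    kind       = "Namespace"
    metadata = {
      name = "example"
    }
  }
}

Alternatively, running terraform apply -target aimed at the cluster resources on the first pass should achieve similar staging without the extra variable.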

aareet commented 3 years ago

This seems to be an issue with defining the cluster in the same apply operation as the other resources as @tiarebalbi describes. We don't support this use case currently in either the Kubernetes or the Alpha provider (https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs#stacking-with-managed-kubernetes-cluster-resources). While it may work in some situations, it's a flaky implementation that should not be relied on.
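
In practice that means managing the cluster in one configuration and the Kubernetes objects in another, with the second configuration looking up the already-created cluster. A rough sketch for GKE (cluster name and location are illustrative, not from this thread):

data "google_client_config" "provider" {}

# The cluster must already exist before this configuration is planned.
data "google_container_cluster" "mycluster" {
  name     = "my-gke-cluster"
  location = "us-central1"
}

provider "kubernetes-alpha" {
  server_side_planning = true
  host                 = "https://${data.google_container_cluster.mycluster.endpoint}"
  token                = data.google_client_config.provider.access_token
  cluster_ca_certificate = base64decode(
    data.google_container_cluster.mycluster.master_auth[0].cluster_ca_certificate,
  )
}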

techdragon commented 3 years ago

@aareet I've had this issue/error with an existing cluster, even with a working .kubeconfig file that functions fine with kubectl. I don't have an exact reproduction with all the version numbers on hand, but I definitely ran into this when trying to use the latest version of this provider.

modevops commented 3 years ago

I do not understand why this is so difficult to figure out. I am seeing a similar issue: the kubernetes and helm providers work fine, but I hit this when I try to build an EKS cluster and deploy base containers in the same run. The rpc errors tell me the provider is trying to connect to the cluster during terraform plan; in my case the cluster has not been deployed yet, so it cannot connect. Why can't the kubernetes-alpha provider be set up like the kubernetes or helm providers? The provider setups should behave the same. For my use case of building a base Kubernetes cluster plus base containers, I cannot use this provider because it tries to connect to the cluster during terraform plan, while the kubernetes and helm providers do not try to connect at that stage. To me this should be an easy fix, and I like the suggestion to add load_config_file = false.

HashiCorp needs to do a better job with Kubernetes integration. Right now there are major gaps, and this needs to be a higher-priority fix. I am trying to show my company that Terraform can be a good tool to set up and manage Kubernetes, but I see major gaps that HashiCorp is ignoring, and that could lead to a loss of market share. This project needs a lot more attention and is moving too slowly. I am tired of using null_resource to manage Kubernetes; this project has many fixes Terraform needs, but the inattention to it suggests it is not a priority. When will this be fixed? I see more than a dozen related issues, and this should have been fixed already. The project has been going for 13 months and is still in alpha. If the helm and kubernetes providers do not have this issue, shouldn't you look at what works there?

My provider setup:

provider "kubernetes-alpha" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}

My errors:

Error: rpc error: code = Unknown desc = no client configuration
Error: rpc error: code = Unknown desc = no client configuration
Error: rpc error: code = Unknown desc = no client configuration
Error: rpc error: code = Unknown desc = no client configuration
Error: rpc error: code = Unknown desc = no client configuration
Error: rpc error: code = Unknown desc = no client configuration
Error: rpc error: code = Unknown desc = no client configuration

aareet commented 3 years ago

@techdragon unfortunately it's hard for us to reproduce this without a config that demonstrates the issue. If you're able to come up with a reproduction case, please let us know and we'll investigate.

modevops commented 3 years ago

I get the above error when I am creating a new AWS EKS cluster. I use Terraform to create the cluster, then install cert-manager using Helm, and I want to install the CRDs using kubernetes-alpha. I have a Terraform module that does the following:

  1. Create an EKS cluster.
  2. Install base services such as cert-manager, external-dns, nginx-ingress, Prometheus, etc.
  3. cert-manager and a few other services ship CRDs I need to install. I want to use kubernetes-alpha for that; currently I am using:

resource "null_resource" "crd" { provisioner "local-exec" { command = "kubectl apply -f ${path.module}/charts/cert-manager.crds.yaml" interpreter = ["/bin/bash", "-c"] } depends_on = [kubernetes_namespace.cert_manager] }

My errors:

Error: rpc error: code = Unknown desc = no client configuration
Error: rpc error: code = Unknown desc = no client configuration
Error: rpc error: code = Unknown desc = no client configuration

Here are the providers I am using:

terraform {
  required_version = ">= 0.12.2"
}

provider "aws" {
  version = ">= 2.28.1"
  region  = var.region
}

provider "local" {
  version = "~> 1.2"
}

provider "null" {
  version = "~> 2.1"
}

provider "template" {
  version = "~> 2.1"
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
  version                = "~> 1.11"
}

provider "helm" {
  version = "~> 1.0"
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
    token                  = data.aws_eks_cluster_auth.cluster.token
    load_config_file       = false
  }
}

When I try to use:

provider "kubernetes-alpha" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}

I get errors.

If I use a null_resource to apply the CRDs, I get no errors, but if I use kubernetes-alpha I get errors. If I use kubernetes-alpha after I have deployed the cluster it works, but that will not work for my use case. I believe the load_config_file = false flag is what lets the kubernetes and helm providers work, whereas kubernetes-alpha does not have that flag. I should not lose existing features with a new provider.
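
For reference, what I would like to write instead of the null_resource is something along these lines (just a sketch of the shape; the real cert-manager CRDs are much larger and would come from the upstream YAML):

resource "kubernetes_manifest" "cert_manager_crd" {
  provider = kubernetes-alpha

  # Heavily trimmed example; a real CRD carries a full openAPIV3Schema.
  manifest = {
    apiVersion = "apiextensions.k8s.io/v1"
    kind       = "CustomResourceDefinition"
    metadata = {
      name = "certificates.cert-manager.io"
    }
    spec = {
      group = "cert-manager.io"
      names = {
        kind   = "Certificate"
        plural = "certificates"
      }
      scope = "Namespaced"
      versions = [{
        name    = "v1"
        served  = true
        storage = true
        schema = {
          openAPIV3Schema = { type = "object" }
        }
      }]
    }
  }

  depends_on = [kubernetes_namespace.cert_manager]
}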

patrykwojtynski commented 3 years ago

Same here:

terraform {
  required_providers {
    helm = {
      source = "hashicorp/helm"
      version = "1.3.2"
    }
    kubernetes = {
      source = "hashicorp/kubernetes"
      version = "1.13.3"
    }
    kubernetes-alpha = {
      source = "hashicorp/kubernetes-alpha"
      version = "0.2.1"
    }
    google = {
      source = "hashicorp/google"
      version = "3.50.0"
    }
  }
  required_version = "0.13.5"
}

provider "google" {
  project = var.gcp_project_id
  region = var.gcp_region
  credentials = file(var.gcp_credentials_file_path)
}

provider "kubernetes" {
  load_config_file = false
  host = "https://${var.gcp_cluster_endpoint}"
  cluster_ca_certificate = base64decode(var.gcp_cluster_ca_certificate)
  token = var.gcp_access_token
}

provider "kubernetes-alpha" {
  server_side_planning = true
  host = "https://${var.gcp_cluster_endpoint}"
  cluster_ca_certificate = base64decode(var.gcp_cluster_ca_certificate)
  token = var.gcp_access_token
}

With a pure deployment sample it throws errors. When I downgrade to 0.2.0, it works.

lawliet89 commented 3 years ago

I second what @patrykwojtynski has found. If I use 0.2.0 it works. It might be one of the changes between 0.2.0 and 0.2.1 that broke this.

provider "kubernetes-alpha" {
  host                   = coalesce(var.kubernetes_host, data.terraform_remote_state.gke.outputs.endpoint)
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = coalesce(var.kubernetes_ca_certificate, base64decode(data.terraform_remote_state.gke.outputs.ca_certificate))
}

data "google_client_config" "default" {
}

data "terraform_remote_state" "gke" {
  backend = "gcs"

  config = {
    bucket = var.remote_state_bucket
    prefix = var.gke_state_prefix
  }
}

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 3.0"
    }
    kubernetes = {
      source = "hashicorp/kubernetes"
    }
    kubernetes-alpha = {
      source  = "hashicorp/kubernetes-alpha"
      version = "~> 0.2, != 0.2.1"
    }
  }
  required_version = ">= 0.13"
}

linuxbsdfreak commented 3 years ago

Hi @alexsomesan

I have the following definition:

terraform {
  required_version = ">= 0.13.0"
  required_providers {
    kubernetes-alpha = {
      source  = "hashicorp/kubernetes-alpha"
      version = "~> 0.2.1"
    }
    vault = {
      source  = "hashicorp/vault"
      version = "~> 2.12"
    }
  }
}

provider "kubernetes-alpha" {
 server_side_planning = true
 //config_path     = local_file.robot_k8s_config.filename
 config_path     = var.robotkubeconfig
}

provider "vault" {
  alias = "child_namespace"
  address    = var.vault_address
  token      = var.vault_token
  namespace =  var.vault_child_namespace
}

data "vault_generic_secret" "robot_k8s_config" {
  provider = vault.child_namespace
  path = join("/", [ var.robot_k8s_secret_path, var.robot_k8s_secret_key ])
}

resource "local_file" "robot_k8s_config" {
    filename = "robot_k8s.kubeconfig"
    sensitive_content  = data.vault_generic_secret.robot_k8s_config_creds_read.data.config
    file_permission = "644"
}

data "vault_generic_secret" "robot_k8s_config_creds_read" {
  provider = vault.child_namespace
  path = join("/", [ var.robot_k8s_secret_path, var.robot_k8s_secret_key ])
}

If I set the following, it crashes with the above error. With the default Kubernetes provider it works:

config_path = local_file.robot_k8s_config.filename

Instead, I have to provide an explicit file path via a variable.

What is the issue here?

linuxbsdfreak commented 3 years ago

Hi @alexsomesan

I am now doing the following for the provider:

provider "kubernetes-alpha" {
  alias = "robot"
  server_side_planning = true

  host = yamldecode(data.vault_generic_secret.robot_k8s_config_creds_read.data.config).clusters[0].cluster.server
  cluster_ca_certificate = yamldecode(data.vault_generic_secret.robot_k8s_config_creds_read.data.config).clusters[0].cluster.certificate-authority-data
  token = yamldecode(data.vault_generic_secret.robot_k8s_config_creds_read.data.config).users[0].user.token
  config_context_cluster  = yamldecode(data.vault_generic_secret.robot_k8s_config_creds_read.data.config).contexts[0].context.cluster
  config_context_user  = yamldecode(data.vault_generic_secret.robot_k8s_config_creds_read.data.config).contexts[0].context.user
  //insecure = true

  //config_path     = module.robot-k8s-config.robot_k8s_credentials
}

I am reading the config from a Vault server within Terraform. If I use config_path, it works; however, I have to create a temporary file, which I would like to avoid. The standard Kubernetes provider works without any issues. Is this a bug? I hope this gets fixed.

Kevin

artificial-aidan commented 3 years ago

This worked for me:

provider "kubernetes" {
  load_config_file    =   false
  # See https://github.com/terraform-providers/terraform-provider-kubernetes/issues/759
  version =   "~> 1.10.0"
  ...
}

That's not kubernetes-alpha.

siassaj commented 3 years ago

I also get this behavior on Terraform Cloud when specifying:

provider "kubernetes-alpha" {
  host                 = module.gke_cluster.cluster.endpoint
  server_side_planning = true

  username = var.gke_username
  password = var.gke_password

  client_certificate     = base64decode(module.gke_cluster.cluster.master_auth.0.client_certificate)
  client_key             = base64decode(module.gke_cluster.cluster.master_auth.0.client_key)
  cluster_ca_certificate = base64decode(module.gke_cluster.cluster.master_auth.0.cluster_ca_certificate)
}

johngtam commented 3 years ago

I'm also getting the same behavior on Terraform Enterprise:

provider "kubernetes-alpha" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  version                = "0.2.1"
  [...]
}

I was able to replicate that only after creating the cluster can I put back the resources that depend on the kubernetes-alpha provider :(. I'm on Terraform version 0.14.6 these days.

salahlemtiri commented 3 years ago

Any news on this? Or at least, is there a workaround for this issue?

pschiffe commented 3 years ago

I'm also having this issue.

Terraform v0.14.8
+ provider registry.terraform.io/hashicorp/aws v3.22.0
+ provider registry.terraform.io/hashicorp/helm v2.0.2
+ provider registry.terraform.io/hashicorp/kubernetes v2.0.2
+ provider registry.terraform.io/hashicorp/kubernetes-alpha v0.2.1
+ provider registry.terraform.io/hashicorp/tls v3.0.0

I'm trying to use

resource "kubernetes_manifest" "target_group_binding" {
  provider = kubernetes-alpha

  manifest = {
    apiVersion = "elbv2.k8s.aws/v1beta1"
    kind       = "TargetGroupBinding"
    metadata = {
      name      = local.name
      labels    = local.labels_full
      namespace = "default"
    }
    spec = {
      targetType     = "ip"
      targetGroupARN = aws_lb_target_group.this[0].arn
      serviceRef = {
        name = kubernetes_service.this[0].metadata[0].name
        port = kubernetes_service.this[0].spec[0].port[0].port
      }
    }
  }
}

But if the aws_lb_target_group and/or the kubernetes_service doesn't already exist, I get the Error: rpc error: code = Unavailable error. If I comment this resource out and create the target group and service in the first tf apply, I can then create this kubernetes_manifest in a second tf apply.

alexsomesan commented 3 years ago

@pschiffe I see you are reporting this for provider version v0.2.1. Could you try the same with the latest version and let us know if you still have this issue? Thanks!

alexsomesan commented 3 years ago

Everyone else, I see a couple of different configurations reported here. Can you confirm whether or not you are creating the cluster itself at the same time?

Because this provider relies on the OpenAPI resource types served by the cluster, it requires the API to be present and responsive at plan time. For that reason, creating the cluster in the same apply is not supported.
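
Concretely, the connection details have to refer to a cluster that already exists when terraform plan runs, for example via data sources against an existing EKS cluster (the cluster name here is illustrative):

data "aws_eks_cluster" "cluster" {
  name = "my-existing-cluster"
}

data "aws_eks_cluster_auth" "cluster" {
  name = "my-existing-cluster"
}

provider "kubernetes-alpha" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}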

siassaj commented 3 years ago

My cluster exists when I get the error

pschiffe commented 3 years ago

@alexsomesan my cluster exists as well when running the first apply.

I'm not able to use the latest version of the provider because of other, already-reported issues against 0.3.x.

techdragon commented 3 years ago

My cluster exists when I get the error.

lodotek commented 3 years ago

No progress here? :-(

unacceptable commented 3 years ago

@lodotek I am disappointed about that as well, but it sounds like in the meantime you could use 0.2.0 (I haven't tested this myself yet because I haven't needed to circle back to it & I was hoping a future release would resolve the issue).

lodotek commented 3 years ago

@lodotek I am disappointed about that as well, but it sounds like in the meantime you could use 0.2.0 (I haven't tested this myself yet because I haven't needed to circle back to it & I was hoping a future release would resolve the issue).

I tried, but it was causing a stack trace of some sort. Sorry I did not capture the error. For now I just applied that resource after the cluster and everything else was up.

pschiffe commented 3 years ago

I'm not seeing this issue anymore with v0.5.