hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Configuring one provider with a dynamic attribute from another (was: depends_on for providers) #2430

Closed dupuy closed 4 months ago

dupuy commented 9 years ago

This issue was inspired by this question on Google Groups.

I've got some Terraform code that doesn't work because the EC2 instance running the Docker daemon doesn't exist yet, so I get "* Error pinging Docker server: Get http://${aws_instance.docker.public_ip}:2375/_ping: dial tcp: lookup ${aws_instance.docker.public_ip}: no such host" when I run plan or apply.
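
For illustration, a minimal sketch of the failing shape (the instance arguments are elided and hypothetical): the docker provider's host refers to an instance that doesn't exist yet at plan time.

resource "aws_instance" "docker" {
  # ... AMI, instance type, and user data that starts the Docker daemon ...
}

provider "docker" {
  host = "tcp://${aws_instance.docker.public_ip}:2375/"
}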

There are providers (docker and consul, and theoretically also openstack, but that's a stretch) whose backing services can themselves be implemented with Terraform using other providers like AWS. If a Terraform deployment contains other resources that use the docker or consul provider, those resources cannot be provisioned or managed in any way until the resources that implement the Docker server or Consul cluster have been successfully provisioned.

If there were a depends_on clause for providers like docker and consul, this kind of dependency could be managed automatically. In the absence of this, it may be possible to add depends_on clauses to all the resources using the docker or consul provider, but that does not fully address the problem: Terraform will attempt (and fail, if they are not already provisioned) to discover the state of the docker/consul resources during the planning stage, long before it has finished computing dependencies. Multiple plan/apply runs may work around that specific problem, but a depends_on clause for providers would allow everything to be managed in a single pass.

alexsomesan commented 2 years ago

Very comprehensive and accurate description of the state of things by @apparentlymart above!

I would like to add to that and elaborate on how the Kubernetes provider chooses to handle the scenario just described.

All but one of the resources in the Kubernetes provider are "classic" Terraform resources, with their schema defined statically and their logic built on top of the historical SDK whose limitations were already explained above. These resources DO NOT need access to the cluster API at planning time and can actually produce a plan in the absence of a complete provider configuration. For all of these resources, the actual initialization of the API client is deferred until the first CRUD function of a resource is called, which is usually at apply time. As long as all the unknown inputs to the provider block attributes can be resolved to concrete values before apply, they will work as expected.

However, there is one resource in the Kubernetes provider that DOES require access to the API at planning time, and that is the kubernetes_manifest resource. As @apparentlymart mentioned above, this resource relies on schema information retrieved at run-time from the API in order to prepare and return a valid plan. Due to this requirement, the "trick" of deferring the actual API client initialization, as described earlier, doesn't work in this case. When using any instance of kubernetes_manifest, all the configuration in the provider block must be known and valid at plan time.
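
A minimal sketch of the contrast (names and file paths are hypothetical):

# A "classic" resource: no API access is needed to plan it, so unknown
# provider arguments are fine as long as they resolve before apply.
resource "kubernetes_namespace" "example" {
  metadata {
    name = "example"
  }
}

# kubernetes_manifest must fetch schema from the cluster while planning,
# so it fails if the provider configuration is still unknown at that point.
resource "kubernetes_manifest" "example" {
  manifest = yamldecode(file("${path.module}/manifests/example.yaml"))
}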

samirshaik commented 2 years ago

This was one of the most interesting threads to read through, but I wasn't able to fully work out whether anything can be done to solve this use case. We too have a similar use case for VMC, where the SDDC is created first and its proxy URL is then used by the NSXT Terraform provider.

Is keeping the modules separate the only solution at the moment? Will there be any effort to provide a solution for this in the future?

jemccabe commented 2 years ago

We also have several use cases that would benefit from this feature.

CMorton737 commented 2 years ago

I don't want to pile on, but I feel the need to point out the irony that achieving full automation of HashiCorp Vault and/or HashiCorp Consul configuration with their respective providers is blocked by one of the most upvoted issues in the HashiCorp Terraform issue tracker.

https://github.com/hashicorp/terraform-provider-vault/issues/1198

Lucasjuv commented 2 years ago

Does anyone know whether Terragrunt allows for that? To first run creation of X set of resources to be able to use them in a generated provider?

Yes, it does. In Terragrunt you can create dependencies between modules and pass variables through them, and you can generate provider configuration files based on those variables. It has been a good workaround for me. Sometimes you might face problems planning with a provider that has missing variables, but when you apply it works.
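
A minimal Terragrunt sketch of that pattern (the paths, output names, and mock values are hypothetical):

# terragrunt.hcl for the module that needs the Kubernetes provider
dependency "eks" {
  config_path = "../eks"

  # Mock outputs let planning succeed before the cluster actually exists.
  mock_outputs = {
    cluster_endpoint = "https://localhost"
  }
}

generate "provider" {
  path      = "provider_gen.tf"
  if_exists = "overwrite"
  contents  = <<EOF
provider "kubernetes" {
  host = "${dependency.eks.outputs.cluster_endpoint}"
}
EOF
}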

cranzy commented 2 years ago

Good conversation and explanations. My team is experiencing this issue as well.

so-jelly commented 1 year ago

I have not read this complete thread, but another use case I have is mocking tests. In the happy path, I use a provider to get data. In testing/CI, I don't use the resource the provider... provides, but it still tries to authenticate even though every resource has a count of 0.

brandongallagher1999 commented 1 year ago

I guess I'll unfortunately need to create my K8s cluster in a previous Terraform layer prior to applying my K8s manifests. This is quite tedious and prolongs spinning up a new environment. I don't think allowing providers to wait for a resource is going to be implemented any time soon.

magzim21 commented 1 year ago

This works for me:

provider "elasticsearch" {
  url         = "https://${var.opensearch_alpha_url}"
#  THIS IS a fix from me.
  elasticsearch_version = format("%s%s","OpenSearch_2.3", trim(module.opensearch.arn, module.opensearch.arn))
  username = data.aws_ssm_parameter.opensearch_admin_username.value
  password = data.aws_ssm_parameter.opensearch_admin_password.value
  sign_aws_requests = false # https://github.com/phillbaker/terraform-provider-elasticsearch/issues/318#issuecomment-1371722505
  aws_region  = var.region
  healthcheck = false

  # depends_on = [
  #   module.opensearch
  # ]

}
acederberg commented 1 year ago

@magzim21 Which version? I am using v1.4.0 and I get the following error:

│ Error: Reserved argument name in provider block
│ 
│   on provider.tf line 48, in provider "helm":
│   48:   depends_on = [
│ 
│ The provider argument name "depends_on" is reserved for use by Terraform in a future version.

so I was inclined to believe that upgrading would fix my issue. However, looking at the releases page, I already have the most recent release.

sbocinec commented 1 year ago

@acederberg I think you misread something: @magzim21's example doesn't show that depends_on can be set at the provider level (note that it is commented out). What you are trying to achieve is not supported, and even upgrading the provider will not fix the issue for you. It must first be implemented and supported in Terraform.

jbg commented 1 year ago

@acederberg To clarify: 1.4.0 is the latest release. The argument name depends_on is reserved in provider blocks in case Terraform implements something under that name in the future (which may or may not happen; I have no knowledge either way). Preventing providers from using this argument name ensures there is no backward-compatibility problem if Terraform later adopts it.

grzegorzjudas commented 1 year ago

I struggled with this for a long time, and two things helped me "get over it", in a way.

  1. My perspective on what Terraform is was wrong. It's not a tool to make creation/destruction of the whole platform doable in one go; it's a tool to describe and modify it incrementally. This changes everything: once you come to terms with the fact that it's "just" that, you realize you can build your platform piece by piece and apply each chunk, so by the time you get to provider B, which depends on provider A, you'd most likely already have A applied and running for some time.

  2. If you really need to be able to chain providers to create the whole environment in one go, you can make use of the -target flag, e.g. by writing a simple script:

setup.sh

#!/bin/bash
# Run the given terraform command (e.g. apply) once per target listed in
# order.txt, then once more for everything that's left.
while read -r target; do
  terraform "$1" -target="$target"
done < order.txt

terraform "$1"

order.txt:

google_container_cluster.my_cluster
kubernetes_secret.my_secret_stuff
kubernetes_manifest.deployment_using_secret_above

Use with:

$ ./setup.sh apply

This way you can, in a very simple manner, describe your dependency chain; running the script runs several applies in the order given in the file, with the remaining, unlisted resources applied last. So with the example above, it would:

  1. Create the GKE cluster and only its own dependencies (the kubernetes provider, and the fact that its configuration isn't valid yet, would be ignored)
  2. Create the Kubernetes secret (at this point the GKE cluster has already been applied, so the kubernetes provider works and the secret gets created)
  3. Deploy the app that needs that secret to be in place, if for whatever reason we can't do it using depends_on
  4. Finally, run terraform apply for the remaining resources; the already-created ones wouldn't be touched.

It's more of a workaround than a solution, but assuming provider-level dependencies are a rather rare case, you probably wouldn't end up with more than 1-3 entries in that order.txt file.

jbg commented 1 year ago

It's more of a workaround than a solution, but assuming provider-level dependencies are a rather rare case, you probably wouldn't end up with more than 1-3 entries in that order.txt file.

I have yet to come across any large-ish Terraform config that doesn't have the kind of dependencies (like resource -> provider -> resource) that force repeated use of -target (or alternatively, splitting the entire config into a separate config for each "layer", with arm's-length dependencies, like remote state, between them). And usually many more than 3 "layers".

Example:

  1. Resources: EKS cluster
  2. Provider: Kubernetes
  3. Resources: cert-manager (includes CRDs, so can't be deployed together with next layer)
  4. Resources: PostgreSQL deployment which includes cert-manager Certificate custom resources
  5. Provider: PostgreSQL
  6. Resources: PostgreSQL database / roles / grants & some app which uses them

None of these layers can be deployed together with the previous one, because each layer needs to use the resources created by the previous one at plan time. There is nothing special about the specific choice of EKS/k8s/cert-manager/postgres here, these kinds of dependency chains pop up all over the place.

It's fine to build this up one layer at a time once. But in some cases you want to destroy and recreate environments from scratch, even regularly. You might also want to put it all in a module and, for example, deploy the same infra in another region. So you end up with hacks like a bash script for deploying one layer at a time with -target, which loudly warns you every time you use it that you should rarely use it (a joke at this point, because it's virtually impossible to use Terraform without it!)
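
For reference, the "separate config per layer" approach wires layers together with the terraform_remote_state data source; a minimal sketch, assuming an S3 backend and hypothetical bucket, key, and output names:

# In the layer-2 config: read layer 1's outputs from its state.
data "terraform_remote_state" "cluster" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"
    key    = "layer1-cluster/terraform.tfstate"
    region = "us-east-1"
  }
}

provider "kubernetes" {
  host                   = data.terraform_remote_state.cluster.outputs.cluster_endpoint
  cluster_ca_certificate = base64decode(data.terraform_remote_state.cluster.outputs.cluster_ca)
}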

grzegorzjudas commented 1 year ago

Don't get me wrong, I agree fully that terraform would be a much more powerful and useful tool if provider dependencies were possible. And the examples you provided also make total sense.

While reading it, I also realized that, e.g., regularly testing the whole platform code in a sandbox, to make sure everything building it up is described in HCL and working, would be a valid scenario.

In any case, that workaround thankfully gets the job done, at least until something better comes along.

Lucasjuv commented 1 year ago

Hi guys, just a workaround suggestion: do you know Terragrunt?

It allows you to wrap each module in its own folder and configuration scheme and to generate Terraform files based on module outputs. You can configure dependency blocks so it knows the dependencies between modules.

It even allows you to mock module outputs so the planning works just fine. The idea is that you can have a module deploy an EKS cluster and the next module generate a k8s provider block with the output from the EKS module.

I've been tracking this issue for 2 or 3 years now, and I honestly don't think HashiCorp will work on this anytime soon. I think depends_on for providers is somewhat hard to implement because of the way Terraform works; I can't imagine running terraform init with a provider that can't be configured.

Terragrunt isn't easy to use, but it is the next step for automating multiple modules and avoiding running multiple applies with targets.

colans commented 1 year ago

The problem with Terragrunt is that it's one more thing to deal with.

So you end up with hacks like a bash script for deploying one layer at a time with -target, which loudly warns you every time you use it that you should rarely use it (a joke at this point, because it's virtually impossible to use Terraform without it!)

We're all doing this, yes. So maybe we just need to get the docs updated, and leave it at that? Cross out the "rarely", essentially.

rchernobelskiy commented 1 year ago

Probably a lot of people come here looking for how to provision Kubernetes objects with the same Terraform that creates the cluster. The way to do that is to avoid using a data source and instead reference module outputs and, where needed, an exec block. For example, to provision Kubernetes objects after creating a cluster with the eks module, the kubernetes provider should be set up roughly as follows:

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
  }
}

--role-arn can be added to the args above if your aws provider is also assuming a role.
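
With role assumption, the exec block would look roughly like this (the role ARN is a placeholder):

exec {
  api_version = "client.authentication.k8s.io/v1beta1"
  command     = "aws"
  args = [
    "eks", "get-token",
    "--cluster-name", module.eks.cluster_name,
    "--role-arn", "arn:aws:iam::123456789012:role/terraform",
  ]
}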

brandongallagher1999 commented 1 year ago

@rchernobelskiy Isn't that provider configured at build time? Meaning that the module it points to already needs to exist beforehand?

rchernobelskiy commented 1 year ago

@rchernobelskiy Isn't that provider configured at build time? Meaning that the module it points to already needs to exist beforehand?

In practice that seems not to be the case, and arguments from other modules seem to be fine to put in providers before those modules exist.

jbg commented 1 year ago

@rchernobelskiy Isn't that provider configured at build time? Meaning that the module it points to already needs to exist beforehand?

In practice that seems not to be the case, and arguments from other modules seem to be fine to put in providers before those modules exist.

As long as you haven't yet added any resources that depend on that provider. In which case, why are you adding the provider?

rchernobelskiy commented 1 year ago

@rchernobelskiy Isn't that provider configured at build time? Meaning that the module it points to already needs to exist beforehand?

In practice that seems not to be the case, and arguments from other modules seem to be fine to put in providers before those modules exist.

As long as you haven't yet added any resources that depend on that provider. In which case, why are you adding the provider?

I'm using one terraform apply to both create an EKS cluster and then provision Kubernetes objects in it, like ConfigMaps. This works in one go, with one apply command, even though the configuration for the kubernetes provider is based on a cluster that does not exist yet at the time of running terraform apply.

jbg commented 1 year ago

I'm using one terraform apply to both create an EKS cluster and then provision Kubernetes objects in it, like ConfigMaps.

This works in one go, with one apply command, even though the configuration for the kubernetes provider is based on a cluster that does not exist yet at the time of running terraform apply.

It may work if you don't have any resources which depend on k8s resources via for_each or count, don't use any k8s data sources which need to be read at plan time, and don't use the kubernetes_manifest resource. It will fail with anything that requires the provider to connect to the cluster at plan time (since the cluster doesn't exist yet). Non-trivial configurations tend to have things that require cluster access at plan time, as discussed above in this issue.
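
A couple of hypothetical examples of patterns that force plan-time access to the cluster:

# A data source must be read during planning, so the provider has to
# connect to the API while the cluster may still be unknown.
data "kubernetes_service" "ingress" {
  metadata {
    name      = "ingress-nginx-controller"
    namespace = "ingress-nginx"
  }
}

# kubernetes_manifest fetches schema from the cluster to build its plan.
resource "kubernetes_manifest" "example" {
  manifest = yamldecode(file("${path.module}/manifests/example.yaml"))
}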

edicsonm commented 1 year ago

Hi, I had the same issue when configuring a Kubernetes provider that uses outputs of other modules. This is what I have in my provider configuration:

provider "kubernetes" { host = module.management-cluster-module.host cluster_ca_certificate = module.management-cluster-module.certificate-authority-data exec { api_version = "client.authentication.k8s.io/v1beta1" args = ["eks", "get-token", "--profile","allianz","--cluster-name", module.management-cluster-module.cluster-name] command = "aws" } }

My current main.tf uses a few modules, and in some of those modules I used this resource definition:

data "kubectl_file_documents" "secrets_manifest" { content = file("${path.module}/manifests/secrets.yaml") }

resource "kubectl_manifest" "metrics_server" { for_each = data.kubectl_file_documents.secrets_manifest.manifests yaml_body = each.value }

Up to this point my whole Terraform implementation was working fine, without issues, using the provider configuration described above. Then one day I needed to set up some secrets and decided to use the configuration below. It creates the Kubernetes objects in a different way: instead of describing them in a YAML file, it describes them inline in a kubernetes_manifest resource, as you can see below. This is when I started getting this error: cannot create REST client: no client config.

Secrets to use during Jenkins Installation:

resource "kubernetes_manifest" "jenkins_spc" { manifest = { apiVersion = "secrets-store.csi.x-k8s.io/v1alpha1" kind = "SecretProviderClass" metadata = { namespace = "jenkins" name = "jenkins-secrets" } spec = { provider = "aws" parameters = { objects = yamlencode( [ { objectName = "management/jenkins" objectType = "secretsmanager" jmesPath = [ ... this section was deleted on purpose ] } ]) } secretObjects = [ { ... this section was deleted on purpose } ] } } }

What was my solution? I went back to creating my secrets configuration in k8s using resource "kubectl_manifest", but like this:

data "kubectl_file_documents" "secrets_manifest" { content = file("${path.module}/manifests/secrets.yaml") }

resource "kubectl_manifest" "metrics_server" { for_each = data.kubectl_file_documents.secrets_manifest.manifests yaml_body = each.value }

Hope that helps someone with the same issue.

villesau commented 1 year ago

Could this actually indicate that the parameters are defined in the wrong place? Maybe the parameters that are currently passed to the provider should actually belong to the resources instead? It would be more repetitive, but it would solve the problem without fundamental changes to Terraform.

brandongallagher1999 commented 1 year ago

We need a feature that allows us to create our Terraform infrastructure end to end using runtime-dependent providers, so that I can, for example, create a Kubernetes cluster and then, within the same run, deploy Helm charts (Prometheus, Grafana, etc.) into my cluster.

This would be game-changing in terms of IaC capabilities. I sometimes question the use case of IaC beyond creating infrastructure from other modules' properties. Since I have multiple folders and have to manually go through each module folder and run terraform destroy or terraform apply, even though they're all part of the same app, things become extremely tedious: it's manual infrastructure management minus a few steps, since I can delete some (but not all) dependent resources.

Please HashiCorp!

UncleSamSwiss commented 1 year ago

@brandongallagher1999 It gets even worse when you want to deploy an entire application stack.

We have these completely independent Terraform module folders:

  • Kubernetes Cluster
  • Postgres (separate because it uses the Kubernetes Provider)
  • Keycloak (because it uses Postgres and Kubernetes Providers)
  • our application (because it needs Keycloak, Postgres and Kubernetes Providers)

Currently the only better way to do this would be to use CDKTF which apparently can manage multiple independent "Stacks" that share properties using Terraform state.

So yes, a "native" solution inside Terraform would really be appreciated!

marziply commented 1 year ago

@brandongallagher1999 It gets even worse when you want to deploy an entire application stack.

We have these completely independent Terraform module folders:

  • Kubernetes Cluster
  • Postgres (separate because it uses the Kubernetes Provider)
  • Keycloak (because it uses Postgres and Kubernetes Providers)
  • our application (because it needs Keycloak, Postgres and Kubernetes Providers)

Currently the only better way to do this would be to use CDKTF which apparently can manage multiple independent "Stacks" that share properties using Terraform state.

So yes, a "native" solution inside Terraform would really be appreciated!

I share these sentiments exactly. As a user of Terraform, I want to run terraform apply once. In theory I shouldn't need multiple "stages" for Terraform, because depends_on should handle dependencies between whatever I configure as dependants. I've often found myself in a chicken-or-egg situation with Terraform: I want to install a bunch of Helm packages via the Helm resource, but I can't use kubernetes_manifest CRD resources within the same apply because their OpenAPI spec is needed first. A frustrating constraint that means I have to split my Terraform applies into sequential stages.
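
That chicken-or-egg pattern in a minimal sketch (the chart, values, and manifest names are hypothetical):

resource "helm_release" "cert_manager" {
  name             = "cert-manager"
  repository       = "https://charts.jetstack.io"
  chart            = "cert-manager"
  namespace        = "cert-manager"
  create_namespace = true

  set {
    name  = "installCRDs"
    value = "true"
  }
}

# Fails at plan time: the Certificate CRD's schema can't be fetched from the
# cluster until the helm_release above has actually been applied.
resource "kubernetes_manifest" "example_certificate" {
  manifest   = yamldecode(file("${path.module}/manifests/certificate.yaml"))
  depends_on = [helm_release.cert_manager]
}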

We need a feature that allows us to create our Terraform infrastructure end to end using runtime-dependent providers, so that I can, for example, create a Kubernetes cluster and then, within the same run, deploy Helm charts (Prometheus, Grafana, etc.) into my cluster.

This would be game-changing in terms of IaC capabilities. I sometimes question the use case of IaC beyond creating infrastructure from other modules' properties. Since I have multiple folders and have to manually go through each module folder and run terraform destroy or terraform apply, even though they're all part of the same app, things become extremely tedious: it's manual infrastructure management minus a few steps, since I can delete some (but not all) dependent resources.

Please HashiCorp!

This mirrors my point exactly. We need support for some level of dependency configuration between providers, especially for read operations, as in the Helm case: I want to configure Terraform to wait until a given resource (e.g. the Helm packages) has successfully been installed in Kubernetes before attempting to read the CRD's OpenAPI spec.

ajostergaard commented 1 year ago

I assume you've all tried using Terragrunt to help with this scenario, what's the reason that didn't work out?

UncleSamSwiss commented 1 year ago

I assume you've all tried using Terragrunt to help with this scenario, what's the reason that didn't work out?

You are right, I didn't know about Terragrunt. So we have two community projects that work around this issue: CDKTF and Terragrunt. This is great, but it adds to the complexity: even more tools to learn and understand. IMHO it would still be better to have this solved once and for all in Terraform itself.

brandongallagher1999 commented 1 year ago

I assume you've all tried using Terragrunt to help with this scenario, what's the reason that didn't work out?

Terragrunt unfortunately does not solve this use case either. Not to mention that managing all of the sub-.hcl files Terragrunt needs in order to function is quite a nuisance. I find Terragrunt only useful for running things like terragrunt format across multiple subfolders; beyond that, it doesn't improve the underlying functionality of Terraform.

jmturwy commented 1 year ago

Same issue as everyone else: using Terraform to spin up an entire EKS ecosystem, then using the helm provider to deploy the Vault chart, then using the vault provider to configure Vault. It fails on the vault provider because it cannot connect to the Vault URL, since it doesn't exist yet. It works fine if the server is already stood up.

Going to just see if I can configure everything I need in the helm provider.

brandongallagher1999 commented 1 year ago

We need runtime-dependent providers. This feature would be groundbreaking in terms of IaC.

Sodki commented 1 year ago

We need runtime-dependent providers. This feature would be groundbreaking in terms of IaC.

Not groundbreaking. Other tools like Pulumi have had it for years. Terraform just needs to pick up the pace.

brandongallagher1999 commented 1 year ago

Not groundbreaking. Other tools like Pulumi have had it for years. Terraform just needs to pick up the pace.

I wasn't aware Pulumi had this capability. It would be nice to spin up our entire infrastructure and all of its related resources with one command. If Pulumi is actually capable of this, I might consider moving over. However, in terms of the DevOps industry, Terraform still seems to be the most in demand, which concerns me.

brandongallagher1999 commented 1 year ago

@Sodki Following up on your comment: I have tried out Pulumi to recreate our entire stack, and I will NEVER go back to Terraform. The ability to spin up an entire stack (database, K8s cluster, K8s resources, storage accounts, etc.) in a single file (or organized set of folders), where I'd previously had an issue deploying K8s resources because the cluster had to be created prior to runtime, is no longer an issue.

I'd recommend that everyone here move over to Pulumi: not only does it solve the issue in this thread, it also lets you use TypeScript or Python, which is extremely convenient compared to the extreme complexity of HCL for anything beyond static declaration (for loops, for example, are INSANE in HCL, especially when accessing object properties).

apparentlymart commented 4 months ago

The Terraform team is planning to resolve this problem as part of https://github.com/hashicorp/terraform/issues/30937, by introducing a new capability for a provider to report that it doesn't yet have enough information to complete planning (e.g. something crucial like the API URL isn't known yet), which Terraform Core would then handle by deferring the planning of affected resources until a subsequent plan/apply round.

You could think of this as something similar to automatically adding -target options to exclude the resources that can't be applied yet, although the implementation is quite different because it's still desirable to at least validate the deferred resources, and validation doesn't require a complete provider configuration.

The history of this issue seems to be causing ongoing confusion, since the underlying problem here has nothing to do with dependencies and is instead about unknown values appearing in the provider configuration.

Since there's already work underway to solve this as part of a broader effort to deal with unknown values in inconvenient places, I'm going to close this issue just to consolidate with the other one that has a clearer description of what the problem is and is tracking the ongoing work to deal with it.

Thanks for the discussion here! If you are subscribed to this issue and would like to continue getting notifications about this topic then I'd suggest subscribing to https://github.com/hashicorp/terraform/issues/30937 instead.

github-actions[bot] commented 3 months ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.