Open WilliamABradley opened 4 years ago
Hi, is this still on the radar? Is there any plan to fix this yet? Thanks.
Just to link the issues, see also https://github.com/hashicorp/terraform/issues/19932 for multiple provider instantiation.
Can someone from HashiCorp take the time to answer? This issue has been open for a year now! It would be good to address these RFCs in Terraform before announcing new product capabilities.
Indeed. Without a workaround, this is a fundamental weakness: post-auth information like region is conventionally part of the provider rather than a normal input to the resource, and as a result Terraform becomes unreasonably verbose for expressing non-trivial architectures.
This comment may help people here as well: https://github.com/hashicorp/terraform/issues/29840#issuecomment-974999903
Again, not tested for more than one, but I think it should just work.
Another use-case many practitioners have is for_each on modules with providers. The most recent use-case: base/databricks-common initializes the Databricks provider via host & token combinations that are outputs from either base/aws or base/azure. aws-classroom includes base/aws & base/databricks-common, and azure-classroom includes base/azure & base/databricks-common:
└── modules
├── aws-classroom
│ └── main.tf
├── azure-classroom
│ └── main.tf
└── base
├── aws
│ ├── main.tf
│ └── provider.tf
├── aws-shared
│ ├── main.tf
│ └── provider.tf
├── azure
│ └── main.tf
├── databricks-common
│ ├── main.tf
│ ├── policy.tf
│ ├── provider.tf
│ └── users.tf
└── defaults
└── main.tf
Let's simplify the desired outcome of this feature:
data "http" "classrooms" {
url = "https://internal-ws-based-on-non-tech-users-input/classrooms.json"
}
module "classrooms" {
for_each = data.http.classrooms.value
source = "./modules/classroom"
name = each.value.name
}
resource "aws_s3_bucket_object" "all_classrooms" {
bucket = "..."
key = "classrooms.json"
content_base64 = base64(json_encode([for c in module.classrooms: {
"name" = c.name,
"url" = c.url
}]))
}
so that we can just let Terraform auto-approve and run it as a background job in Terraform Cloud (or anywhere else) every 30 minutes, without having to change the HCL files through Git every time we need to add a new classroom. But this is not possible because of this bug.
Some folks have solved a similar problem by code-generating the classroom-equivalent modules via a bash script, but that cannot really be done as a single-state apply. Theoretically we could hack around this via github_repository_file, using the fetched classroom data to generate and change a classrooms.tf file containing nothing but hardcoded module declarations, but it's still a hack.
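A rough sketch of that hack, assuming the configuration lives in a repo managed with the integrations/github provider (the repository name and template logic here are hypothetical, and a second apply or CI run is still needed to pick up the generated file):

```hcl
# Hypothetical sketch: render a classrooms.tf full of hardcoded module
# blocks from the fetched JSON and commit it back to the repo.
locals {
  classrooms = jsondecode(data.http.classrooms.response_body)

  classrooms_tf = join("\n", [for c in local.classrooms : <<-EOT
    module "classroom_${c.name}" {
      source = "./modules/classroom"
      name   = "${c.name}"
    }
  EOT
  ])
}

resource "github_repository_file" "classrooms" {
  repository = "infra" # hypothetical repo name
  file       = "classrooms.tf"
  content    = local.classrooms_tf
}
```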
@jbardin any updates on the roadmap for this feature?
@nfx, from what I understand this feature is very difficult to implement, because it would require major changes to the way Terraform works internally: the basic idea is that providers must exist and be set up before anything "runs".
But I agree that something like a dynamic provider would be great.
Another common use case is one Terraform state that manages both an AWS EKS Kubernetes cluster with the AWS provider and Kubernetes resources with the Kubernetes provider.
Currently there is basically no straightforward way to do both things in the same state. The official recommendation is to split the creation of the cluster and the management of cluster internals into two states.
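For context, the single-state pattern people attempt usually wires cluster outputs straight into the provider block, along these lines (a sketch assuming a pre-existing cluster named my-cluster); it only works because the provider configuration is still one static block, which is exactly the limitation this issue asks to lift:

```hcl
data "aws_eks_cluster" "this" {
  name = "my-cluster" # hypothetical cluster name
}

data "aws_eks_cluster_auth" "this" {
  name = data.aws_eks_cluster.this.name
}

# One static kubernetes provider per cluster; there is no way to
# repeat this block per EKS cluster created with for_each.
provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}
```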
@trallnag Of course the feature might take time, so I'm asking for timelines here. Those workarounds are not acceptable in the long term =)
Any news about it? I'm facing the same issue
Could everyone with a HashiCorp contact/subscription try reaching out? It's probably the only way to get this prioritized.
Yeah def annoying, I'd love to get an update on this too
It would be really good to have this kind of feature in the Terraform DSL. However, this can be achieved (to some extent) using Terraform CDK. But Terraform CDK cannot be used in production yet. :(
However this can be achieved (to some extent) if using terraform CDK
@spareslant, could you expand on this?
@trallnag, in Terraform CDK you can write Terraform code in a proper programming language like Python. With this, you can wrap some of the code in if-then-else logic. I used stacks in Terraform CDK: the first stack runs using the usual default provider (credentials), but it generates config that can be used to run the other stacks.
So in the first run, Terraform CDK shows and runs only the initial stack (the other stacks are not visible yet due to the conditionals in place). The moment the first stack finishes, it has created a new config which can be passed on to the remaining stacks.
I tested this in OCI (Oracle Cloud Infrastructure). I ran the first stack with default credentials (the default provider); it created a new compartment, a new user and new API keys, and created a policy to allow this new user to deploy the rest of the stuff. The remaining stacks then used these new credentials (wrapped in a new provider) to deploy the rest of the infrastructure.
It was not a single command run; I had to run Terraform twice. It may not be exactly what you are looking for, but it achieved some purpose for my testing.
This is what I wanted to say in code: https://github.com/spareslant/oci_multi_stack_terraform_cdk_python_v2/blob/main/main.py#L21 https://github.com/spareslant/oci_multi_stack_terraform_cdk_python_v2/blob/main/network.py#L40
And README: https://github.com/spareslant/oci_multi_stack_terraform_cdk_python_v2#readme
@spareslant, ah okay, thanks for the details. So it's using several stacks / layers / states just like one would in vanilla Terraform. Just wrapped in one program / script.
I've done a little more research into what it might take to support this today. There are some significant architectural challenges to overcome before we could support dynamically associating a provider configuration with a resource, rather than treating that as a static concern before expression evaluation.
The key challenges I've identified in my initial (failed) prototyping spike are:
- Our state model tracks the provider configuration address as a per-resource value rather than a per-resource-instance value, which means that the assumption that all instances of a particular resource or data block belong to the same provider is baked into the state snapshot format. We'd need to revise the state snapshot format in a forward-incompatible way (older versions of Terraform would not be able to decode the new format) to change that. In principle we could make the format change only in the new situation where not all of the instances of a resource have the same provider configuration address, which would at least then mean the compatibility problem wouldn't arise until the first time a configuration creates that situation.
- The provider meta-argument in resource and data blocks currently statically declares which provider configuration to use, and because each provider configuration is statically associated with a particular provider it also implies which provider the resource belongs to. ("Provider" means e.g. hashicorp/aws, while "provider configuration" represents a particular provider "aws" block.) Although we could in principle support the provider configuration varying between instances of a resource, I believe it's still necessary to statically determine the provider itself so that e.g. terraform init can determine which providers need to be installed, various parts of Terraform can know which schemas they are supposed to use for static validation, etc. This means that if we change the provider meta-argument to take an arbitrary expression which returns a provider configuration then we will need to find some other way to separately declare the expected provider, so that we can determine the provider statically and then the specific configuration dynamically later. One possible strawman language design would be to support a new syntax in provider like provider = aws(var.example), where the aws portion is a static reference to a provider local name (one of the keys in the required_providers block) and the argument is an arbitrary expression that returns an object representing a provider configuration for that provider.
In my work so far I've been assuming that the goal would be for provider configurations to be a new kind of value in the Terraform language, with a new type kind so that each provider has its own provider configuration data type. That could then for example allow passing provider configurations in as part of a larger data structure, rather than them always having to enter a module through the providers sidecar channel:
terraform {
  required_providers {
    # References to "aws" below refer to this provider.
    aws = {
      source = "hashicorp/aws"
    }
  }
}

variable "example" {
  type = map(object({
    instance_type = string
    provider      = providerconfig(aws)
  }))
}

resource "aws_instance" "example" {
  for_each = var.example

  instance_type = each.value.instance_type
  provider      = aws(each.value.provider)
}
My investigation so far suggests that it's relatively straightforward to define a new data type representing a provider configuration, but it's unclear how to reconcile that with the existing provider-configuration-specific side-channels of either implicit provider inheritance or explicit provider passing via the providers meta-argument inside a module block. In particular, it isn't clear to me yet how it would look to have a configuration with a mixture of old-style and new-style modules, where some modules are still expecting the old mechanisms for passing providers while others want them to arrive via normal values in input variables.
Passing providers around as normal values comes with the nice benefit that our usual expression-analysis-based dependency graph building approach can "just work" without any special cases, as long as there's a new expression-level syntax for referring to a provider block defined in the current module.
However, Terraform's current special cases for dealing with providers statically seem likely to come into conflict with this new generalized form if we try to leave them both implemented together. There's various provider-configuration-specific logic for automatically assuming empty configuration blocks for providers that have no required configuration arguments, for inserting "proxies" to model the implicit or explicit passing of configurations from parent to child module, etc.
When I prototyped I just deleted all of that stuff because I had the luxury of not having to be backward-compatible, but I doubt that approach will succeed in a real implementation. We'll need to define exactly how the dependency graph ought to be built when a configuration contains a mixture of both traditional static and new-style dynamic provider references.
My focus here was on the problem of dynamically assigning existing provider configurations to individual resource instances. Although it's thematically connected, from a design and implementation standpoint that's actually pretty separate from the other desire to dynamically define provider configurations (e.g. #19932), and so I've not considered that here. That desire has a largely-unrelated set of challenges that involve the same problem that makes providers declared inside child modules not work well today: a provider configuration always needs to outlive all of the resources it's managing so that it can be around to destroy them, whereas declaring them dynamically makes it quite likely that both the provider configuration and the resource would be removed systematically together. If anyone would like to investigate that set of problems, I suggest discussing that over in #19932 instead so that we can keep the two research paths distinct from one another.
Are we saying that in Terraform, the "provider" identified for installation when running terraform init is treated as different when it has a different configuration? e.g. aws region1 vs aws region2?
What would be the implications of allowing the region to be defined/overridden at the module level?
What is the officially suggested way of doing multi-region deployments with the hashicorp/aws provider? (It's clear the workaround is one module per region, losing all of the dynamics in the Terraform configuration.)
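To illustrate what that per-region duplication looks like in practice (the module path, aliases, and regions here are hypothetical):

```hcl
provider "aws" {
  alias  = "use1"
  region = "us-east-1"
}

provider "aws" {
  alias  = "usw2"
  region = "us-west-2"
}

# Each region needs its own hand-written module block; for_each over a
# list of regions is not possible because providers is resolved statically.
module "stack_use1" {
  source    = "./modules/stack" # hypothetical module
  providers = { aws = aws.use1 }
}

module "stack_usw2" {
  source    = "./modules/stack"
  providers = { aws = aws.usw2 }
}
```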
As much as I love terraform, this issue pushes me to leverage CloudFormation.
One use case is a mono-Terraform deployment: multiple regions, an AKS cluster per region, and a Helm provider per AKS cluster.
## ! │ The module at ... is a legacy module which contains its own local
## ! │ provider configurations, and so calls to it may not use the count, for_each, or depends_on arguments.
## ! │
## ! │ If you also control the module "...", consider updating this module to
## ! │ instead expect provider configurations to be passed by its caller.
## ! ╵
# provider "helm" {
# kubernetes {
# host = var.kubernetes_cluster.host
# client_certificate = base64decode(var.kubernetes_cluster.client_certificate)
# client_key = base64decode(var.kubernetes_cluster.client_key)
# cluster_ca_certificate = base64decode(var.kubernetes_cluster.cluster_ca_certificate)
# }
# }
Are there any timelines for implementing this, @apparentlymart?
Same here: dynamic EKS configuration, so I need a dynamic provider to apply Helm charts.
Hello @apparentlymart , any update about this issue? we're facing it too, thanks in advance
Guess this will never happen :-(
It may not help much, but for Azure specifically, where the intent is to e.g. for_each through a module and pass in the target deployment scope (subscription, or subscription and resource group) per instance rather than duplicating code (which gets messy with hundreds of subscriptions), there is the option of using the Azure AzApi provider, as in the example I've provided here: https://github.com/kahawai-sre/azapi-nsgs-demo. In case it helps anyone in that context in the meantime. That said, using AzureRM and having the capability to pass the provider dynamically would of course be a far better solution for many reasons as things stand, so I'm still holding out for that ... please :-)
Seriously? Two years, and a feature which would be native to most DSLs and the most basic programming languages is not present? At this point I might as well consider Pulumi.
We are noticing that some providers allow the provider configuration to be overridden right at the resource level, which seems like a convenient option:
https://registry.terraform.io/providers/TelkomIndonesia/linux/latest/docs#provider-override
That would be a nice workaround!
My focus here was on the problem of dynamically assigning existing provider configurations to individual resource instances.
This would almost completely solve the issue, at least for many people. A lot of people already generate the provider configurations one way or another, and it wouldn't be much of a problem to generate that list. The crux of the problem is that you also need to generate the entire module block using the provider alias, which is far less appealing.
I completely understand the concerns around removing the provider before resources are destroyed, and I understand the design decisions around that. I've run into the issue on older versions of Terraform and it's definitely not fun to debug (if you don't know the root cause). But giving us some way to pass a provider alias without generating a module block and all of its configuration would at least give us many ways to work around the issue.
I'm going to try overrides.tf files and hope there isn't some restriction that prevents them from operating on provider aliases; that might be workable.
Any progress?
This issue has been open for quite some time. At least give us a provider override.
Wanted to add a use case here -- I'll be looking through the workarounds listed here to see if any of them will work.
We use the SAP BTP Terraform provider. It can create CloudFoundry environments, which we then use the CloudFoundry provider to work with.
The challenge: we don't have control over the CF API URL that SAP BTP will generate. The URL might be us10, or us10-001, for example. SAP BTP at least helpfully outputs the API URL it associates with the environment, so I can access it programmatically.
The challenge comes when wanting to now use that API URL along with my CloudFoundry provider block.
❌ I can't use a separate provider block in the module and pass the API in a variable, because that makes this a legacy module that can't use count, etc., which I need in my module.
❌ I can't override
❌ I can't create aliases and programmatically choose an alias (e.g. with a ternary operator in the providers that are passed in)
❌ I can't choose a provider within the module either.
So my only options (so far; still reading this thread) seem to be:
- Get the CloudFoundry module to support some overrides, e.g. for API URL.
- Split my Terraform into two configurations and two states, one for the stuff prior to CF and one for the stuff after. To me this reduces the value proposition of Terraform and "feels wrong".
- ...there doesn't seem to be an option 3.
The 3rd option is to write a preprocessor stage that generates the appropriate override.tf files on the fly before your code runs. This makes use of the little-advertised (for good reason) feature here: https://developer.hashicorp.com/terraform/language/files/override
The way we do it is like this: if there is a file named override.py or *_override.py, our tooling will execute the Python script, which generates the override.tf files to make it all work.
There are still a bunch of downsides. For example, depending on this feature (https://developer.hashicorp.com/terraform/language/files/override) is non-intuitive and feels brittle. I dislike preprocessor steps in general for this reason. Your IDE will have no idea these files will exist, and static analysis tools won't be aware of them either. You still can't loop over anything; you must define individual module calls and override count to enable/disable things as needed. There is some code duplication.
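As an illustration of what such a generated file might contain (all names and values here are hypothetical): Terraform merges *_override.tf files over the blocks of the same name in the main configuration, so the preprocessor can fill in values that cannot be expressed dynamically in HCL:

```hcl
# Hypothetical generated_override.tf written by a preprocessor script.
# Each block below merges over the identically-named block in main.tf.

# Enable a module call that is declared with count = 0 by default.
module "classroom_alpha" {
  count = 1
}

# Supply the real endpoint for a provider block declared with a placeholder.
provider "mysql" {
  endpoint = "mysql-alpha.internal:3306"
}
```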
I do prefer this to generating actual TF code via another language, because you can at least add some comments explaining what is happening, and it's not like more resources are popping into existence out of nowhere. But this mechanism is arbitrarily powerful and can solve a few unsolved problems related to limitations in Terraform. For example, we already use this to override module sources (which cannot be variablized) in restricted environments where the module source URL is different. It's a bit hacky, relying on a regex to directly parse the TF files, but it's been extremely reliable.
Of course, every home-grown implementation of terraform workarounds seems to eventually resemble Terragrunt, so that might be something you want to check out. My team was not familiar enough with Terraform itself to take that on ~2 years ago, so we're using vanilla terraform with separate scripts to generate backend configs, provider configs, as well as some other simple configuration scripts. I did try to make things relatively consistent with Terragrunt so if they want to migrate later they can, but it's probably worth looking into if you're on a smaller team / company or working by yourself.
Hello, provisioning a resource and then requiring a provider to configure said resource is not an uncommon pattern. For example, creating a Databricks workspace and then using its ID to configure the Databricks provider. I was surprised to learn today that this is not supported. This is a glaring oversight. I'm shocked that this is still unresolved after 4 years.
Looks like OpenTofu is working on this.
An example of a place you may want to do this is where you have a single, uncomplicated resource like aws_iam_account_password_policy and you want to set it across 20+ accounts. It's a pain in the butt defining each one individually.
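Concretely, without a provider for_each that currently means one aliased provider and one resource block per account, duplicated by hand (the alias, ARN, and policy settings below are hypothetical):

```hcl
provider "aws" {
  alias = "acct_1111"
  assume_role {
    role_arn = "arn:aws:iam::111111111111:role/admin" # hypothetical role
  }
}

resource "aws_iam_account_password_policy" "acct_1111" {
  provider                = aws.acct_1111
  minimum_password_length = 14
}

# ...repeated verbatim for every one of the 20+ accounts.
```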
I would love Terraform to support this without third-party tools.
This is going to be a critical feature for managing multiple child accounts automatically. Since we currently lack it, we are reconsidering our tooling choices and looking forward to using AWS StackSet to achieve this goal. We'll be happy to use Terraform for that, though.
OpenTofu 1.8 just got released with early evaluation of variables and locals. This enables dynamic module source expressions. Dynamic provider support is scheduled for 1.9. Let's see if they can do some magic.
Does this mean that if I have, say, a Kafka provider in a kafka-topics module, I could define an empty provider definition in the root and then, in the module, override the provider with the exact broker/cluster to add a topic to? Or would I still get an error when planning because the provider config was changed by a module? And if I only define it in the module, is resource cleanup a two-step issue?
I am dealing with these issues for providers that manage MySQL and Kafka users, in a chain of root -> mysql_cluster_module -> mysql_provider -> mysql_user_module -> mysql_user_resource.
I would love a pathway to let each module call connect to its own server and know how to clean itself up.
Right now I need one plan to remove the users, then another to remove the whole module call.
Current Terraform Version
Use-cases
It would be nice to be able to create dynamic providers. The main reason for my usage would be AWS assume_role. I use Terraform to create a number of AWS subaccounts, and then I want to configure those subaccounts in one apply, instead of breaking them up across multiple apply steps.
Currently this is done via modules, but with 0.12 I had to manually define copies of the modules for each subaccount.
As said by the 0.13 modules doc: https://github.com/hashicorp/terraform/blob/master/website/docs/configuration/modules.html.md#limitations-when-using-module-expansion
Attempted Solutions
Source for ./modules/organisation-group:
Source for ./modules/organisation-account-config:
Proposal
:Proposal
Module for_each:
Provider for_each:
This doesn't look as clean, but appeases the docs saying:
References