Open FernandoMiguel opened 2 years ago
Thanks for reporting this, @FernandoMiguel!
At first glance this seems like a correct error: your conditional expression is indeed defined to produce a set in one case but an empty object in another, and those types are not compatible.
Therefore I think our first step here would be to understand why this didn't return an error in v1.1.9. It suggests that there was some sort of bug in that previous version that is now fixed, so I think we will need to understand what that was to see what is a reasonable path forward here. We'll take a closer look at this soon.
Thanks again!
This one was a pretty interesting one to create a minimal reproduction for, because it seems that it's crucial that the set itself be unknown and of unknown type, leading to what this error message clumsily refers to as a "set of dynamic".
Therefore my reproduction case is pretty contrived and relies on some tricks to synthesize a similar result without relying on a Kubernetes server:
resource "null_resource" "example" {
}
locals {
list = null_resource.example.id[*]
set = toset(local.list)
pred = null_resource.example.id == "well well well"
maybe = local.pred ? local.set : {}
}
Some notes about what's going on here:
null_resource.example.id
is an unknown value of type string.null_resource.example.id == "well well well"
returns an unknown value of type bool, because we can see that the ID is a string but we cannot yet determine whether it equals "well well well"
.null_resource.example.id[*]
returns an unknown value of an unknown type, because the splat operator needs to know whether the ID will be null before it can determine what its result type will be.toset(local.list)
produces an unknown set of an unknown type due to being derived from an unknown value of an unknown type. (Terraform optimistically assumes that the type conversion will succeed during planning, under our usual rule that during apply time all values must either be compatible with the assumptions we made or return an error explaining why not).With all of that said then, this sufficiently recreates the relevant situation to produce an equivalent error on Terraform v1.2.0-rc2:
╷
│ Error: Inconsistent conditional result types
│
│ on conditional-set-and-object.tf line 9, in locals:
│ 9: maybe = local.pred ? local.set : {}
│ ├────────────────
│ │ local.pred is a bool, known only after apply
│ │ local.set is a set of dynamic, known only after apply
│
│ The true and false result expressions must have consistent types. The
│ 'true' value is set of dynamic, but the 'false' value is object.
╵
On Terraform v1.1.9 it succeeds during planning and during the subsequent apply step:
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# null_resource.example will be created
+ resource "null_resource" "example" {
+ id = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
null_resource.example: Creating...
null_resource.example: Creation complete after 0s [id=2603811441306929907]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
We do typically consider it an improvement if an error that was previously detected only during apply were now detected during plan instead due to Terraform's analysis being more precise, but this one is concerning because somehow Terraform v1.1.9 wasn't even detecting this problem during the apply step, at which point Terraform should be able to clearly see that the true and false arms of the conditional are of incompatible types.
I don't yet know why Terraform v1.1.9 wasn't detecting this error, but hopefully the simpler reproduction case above will make it easier to dig in and get to the bottom of it.
One further thing I noticed is that even though the first terraform apply
somehow ran to completion without errors with Terraform v1.1.9, if I immediately run terraform plan
with Terraform v1.1.9 afterwards (so that there's now a prior state with null_resource.example
in it) then it does fail as I would've expected:
null_resource.example: Refreshing state... [id=9049847589891101533]
╷
│ Error: Inconsistent conditional result types
│
│ on conditional-set-and-object.tf line 9, in locals:
│ 9: maybe = local.pred ? local.set : {}
│ ├────────────────
│ │ local.pred is false
│ │ local.set is set of string with 1 element
│
│ The true and false result expressions must have consistent types. The
│ given expressions are set of string and object, respectively.
╵
(Notice that this is the old-style "Inconsistent conditional result types" message, where says "are set of string and object, respectively" instead of "The 'true' value is set of string, but the false value is object" as v1.2 would've produced.)
It is not clear to me at the moment why terraform apply
failed to catch this error even though the subsequent plan caught it, given that the evaluation of the local value during apply ought to have been working with the same final state of null_resource.example
as the subsequent plan is. :thinking:
(The "Refreshing state..." message did cause me to wonder if the refresh step was the key here, but sadly there's no difference in behavior when I do the subsequent plan with -refresh=false
.)
The change which resulted in this new behaviour is in #30879, where we corrected the conversion functions to cope with values of unknown type. This means that calling toset(…)
with arguments of unknown type will now correctly return a set(dynamic)
value, rather than the more general dynamic
.
I've verified this by cherry-picking the commit from #30879 onto v1.1, which results in the same behaviour as the 1.2 branch.
To achieve the goal I think you're aiming for, the for_each
expression here should be updated:
for_each = (
var.fargate_create == true && var.otel_fargate_container_insights_enabled == true ?
toset(data.kubectl_path_documents.otel_fargate_container_insights[0].documents) :
toset([])
)
That is, changing {}
to toset([])
. This should work as expected on both the 1.1 and 1.2 releases.
Based on our reproduction, we don't currently believe this is breaking previously-valid behaviour, as while the first apply
succeeds, a subsequent plan
will (correctly) error. If this is not the case, and you are relying on the previously buggy behaviour, please let us know more and we'll think further about how to move forward.
Thanks again for testing the release candidate and reporting the issue!
From looking closely at the TRACE logs for the apply step, I can see why terraform apply
didn't catch this in v1.1.9:
2022-05-12T11:51:32.980-0700 [DEBUG] pruneUnusedNodes: local.maybe (expand) is no longer needed, removing
2022-05-12T11:51:32.980-0700 [DEBUG] pruneUnusedNodes: local.set (expand) is no longer needed, removing
2022-05-12T11:51:32.980-0700 [DEBUG] pruneUnusedNodes: local.pred (expand) is no longer needed, removing
2022-05-12T11:51:32.980-0700 [DEBUG] pruneUnusedNodes: local.list (expand) is no longer needed, removing
Since I used local values for my reproduction, that made them subject to pruning during the apply step and therefore Terraform didn't evaluate them at all. I expect the same would not apply to a resource for_each
argument, because those lead to a side-effect and so cannot be safely pruned in this way.
(It's interesting to note that this pruning does appear to allow invalid local values to go undetected as long as they remain unknown until the apply step, which is concerning now that I see it, but is a different concern from what this issue is discussing. An unknown value that never gets used is an edge case anyway, so it's not super concerning but still something we might want to revisit and consider.)
I did try to reproduce this with for_each
instead of a local value but ended up blocked by the usual error that for_each
itself cannot be unknown:
╷
│ Error: Invalid for_each argument
│
│ on conditional-set-and-object.tf line 14, in resource "null_resource" "another":
│ 14: for_each = local.maybe
│ ├────────────────
│ │ local.maybe will be known only after apply
│
│ The "for_each" value depends on resource attributes that cannot be
│ determined until apply, so Terraform cannot predict how many instances
│ will be created. To work around this, use the -target argument to first
│ apply only the resources that the for_each depends on.
╵
I find myself unsure about how to write a reproduction that involves for_each
here. Even in the original report we can see that the conditional predicate was based on unknown values:
│ on .terraform/modules/base_system/otel-fargate-container-insights.tf line 15, in resource "kubectl_manifest" "otel_fargate_container_insights":
│ 15: for_each = var.fargate_create == true && var.otel_fargate_container_insights_enabled == true ? toset(data.kubectl_path_documents.otel_fargate_container_insights[0].documents) : {}
│ ├────────────────
│ │ data.kubectl_path_documents.otel_fargate_container_insights[0].documents will be known only after apply
│ │ var.fargate_create is a bool, known only after apply
│ │ var.otel_fargate_container_insights_enabled is a bool, known only after apply
...and so it seems like if the condition hadn't failed type checking then the for_each
ought to have failed immediately anyway, because the result of a conditional with an unknown predicate is an unknown value of the conditional result type. (This is, incidentally, why Terraform requires the true and false result types to be the same.)
@FernandoMiguel, when you tried this with Terraform v1.1.9, did you do so in the same context as you tried v1.2.0-rc2? In particular, I'm wondering if you tried v1.2.0-rc2 in a situation where there was no prior state but you tried Terraform v1.1.9 in a situation where there was already a prior state, and therefore Terraform was able to decide a known answer for var.fargate_create == true && var.otel_fargate_container_insights_enabled == true
in that case.
@FernandoMiguel, when you tried this with Terraform v1.1.9, did you do so in the same context as you tried v1.2.0-rc2? In particular, I'm wondering if you tried v1.2.0-rc2 in a situation where there was no prior state but you tried Terraform v1.1.9 in a situation where there was already a prior state, and therefore Terraform was able to decide a known answer for
var.fargate_create == true && var.otel_fargate_container_insights_enabled == true
in that case.
all of my plans/applies were of real code with already existing states, and just by switching tfenv
version to try to debug the issue between versions
After talking with @apparentlymart in slack , I ended up going with
for_each = toset(var.karpenter-enabled == true ? toset(data.kubectl_path_documents.karpenter_provisioners[0].documents) : [])
which feels far more readable than doing
: toset([])
Still, I expect this change of behaviour to impact in some manner a large set of codebases that did not cast type prior.
although looking at https://github.com/search?l=&p=3&q=for_each++%3D+var.+%3A+%7B%7D+language%3AHCL+-dynamic+-toset&ref=advsearch&type=Code , that may not prove true my statement... oh well
@FernandoMiguel do you have a sense of why var.fargate_create
was unknown when you tried with v1.2.0-rc2 but apparently known when you tried with v1.1.9? I'm assuming that must be the case because otherwise the plan would have failed with a different error under the older version. If you're not sure, it could perhaps help to see what expression you assigned to that variable in the calling module, so that I might at least try to write something comparable myself.
I'm trying to find a realistic reproduction of what you saw that I can run on my system (without access to your Kubernetes 😀) and it seems like exactly how you were running Terraform in each case may have made a difference here, but I'm not sure exactly what differed.
Thanks!
this is the code for it
variable "fargate_create" {
description = "Controls if fargate resources should be created"
type = bool
default = false
nullable = false
}
I dont recall now, but there is a slight chance it was missing in the downstream module, and was added in between runs. hard to say now, since all was done locally and without commits
also, using nullable
in case that has any impact
Hi @FernandoMiguel! Thanks for sharing that.
I'm sorry I wasn't clear about what I was asking; I was actually hoping to see the assignment of an expression to that variable in the module
block in the calling module -- the line like fargate_create = ...
where ...
is some expression to decide the value -- because I'd like to understand why exactly it was showing as unknown in the error message you shared back at the top of this discussion, so that I can try to recreate a comparable situation locally.
main.tf
variable "fargate_create" {
description = "Controls if fargate resources should be created"
type = bool
default = false
}
module "base_system" {
source = "/Users/fernando/work/TF-Modules/tf_modules_eks_base_system"
vpc_tag = var.vpc_tag
cluster_name = var.cluster_name
fargate_create = var.fargate_create
eks_managed_node_groups_create = var.eks_managed_node_groups_create
[...]
dev.tfvars
fargate_create = true
[...]
hth
Thanks for that extra context, @FernandoMiguel!
I'm not yet understanding how "var.fargate_create
is a bool, known only after apply" in the original error message if it's just set to a fixed value in .tfvars
like that. It seems like something else is going on here but I've not yet mustered enough imagination to know what to ask to find out what it is.
It does seem though like whatever is going on here is originating somewhere different than where the error message is appearing. Something elsewhere in your configuration seems to be causing Terraform to never evaluate this incorrect expression before, but it's now correctly validating it in v1.2.0. I don't think we yet have enough information to pursue a fix here, since it isn't clear where the problem is, but hopefully someone else will encounter a similar problem soon enough and maybe we'll have a second example to compare with and see what you both have in common, and thus hopefully narrow it down enough that we can have another attempt at a realistic reproduction case.
Thanks again!
to get around issues like this, you can re-write the expression to be a comprehension rather than a ternary
(Note: we can shorten var.fargate_create == true
to simply var.fargate_create
)
Ternary form:
for_each = var.fargate_create && var.otel_fargate_container_insights_enabled ? toset(data.kubectl_path_documents.otel_fargate_container_insights[0].documents) : {}
Comprehension form:
for_each = {for k,v in toset(data.kubectl_path_documents.otel_fargate_container_insights[0].documents : k => v if var.fargate_create && var.otel_fargate_container_insights_enabled}
Note: I did not test the comprehension form but it looks like you might encounter a secondary error that would raise a The "for_each" value depends on resource attributes that cannot be determined until apply
In my opinion this validation seems very much like an anti-feature. It is yet another thing in Terraform which makes handling conditional values harder, and conditional handling in Terraform really is terrible already.
For instance I have a scenario where I conditionally try to set up replication with the https://registry.terraform.io/modules/terraform-aws-modules/s3-bucket/aws/latest module.
If I have a value that looks like this:
replication_configuration = var.replication ? {
role = aws_iam_role.replication[0].arn
rules = [{ [...] }]
} : null
This only works if replication is true, if var.replication
is false, I get an error: Invalid value for "inputMap" parameter: argument must not be null.
.
So, let's have the false value be the same as the modules default value, which is: {}
. Well then it fails because apparently Terraform does a deep comparison of the keys, resulting in the error: The 'true' value includes object attribute "role", which is absent in the 'false' value.
What makes it even worse is that the check is only performed if the conditional returns true. If I set var.replication
to false it will not complain with false value set to {}
. I do find it a bit funny that the Consistent conditional enforcement is in itself inconsistent.
Terraform version:
Terraform v1.3.9
on darwin_amd64
+ provider registry.terraform.io/hashicorp/aws v4.59.0
@albgus
replication_configuration = {for k,v in {
role = aws_iam_role.replication[0].arn
rules = [{ [...] }]
} : k => v if var.replication }
We hit this issue regularly, and I saw the waiting for reproduction
label on this issue, so here's a simple way to reproduce this:
locals {
environment = "dev"
services_with_postgresql = local.environment == "dev" ? {
tools = {}
big-tools = {
shape = "PostgreSQL.VM.Standard.E4.Flex.4.64GB"
configuration = {
autovacuum = 0
effective_io_concurrency = 1
}
}
} : {}
}
resource "terraform_data" "cluster" {
for_each = local.services_with_postgresql
input = {
name = each.key
shape = lookup(each.value, "shape", "PostgreSQL.VM.Standard.E4.Flex.2.32GB")
}
}
Running terraform validate:
╷
│ Error: Inconsistent conditional result types
│
│ on test.tf line 3, in locals:
│ 3: services_with_postgresql = local.environment == "dev" ? {
│ 4: tools = {}
│ 5: big-tools = {
│ 6: shape = "PostgreSQL.VM.Standard.E4.Flex.4.64GB"
│ 7: configuration = {
│ 8: autovacuum = 0
│ 9: effective_io_concurrency = 1
│ 10: }
│ 11: }
│ 12: } : {}
│ ├────────────────
│ │ local.environment is "dev"
│
│ The true and false result expressions must have consistent types. The 'true' value includes object attribute "big-tools", which is absent in the 'false' value.
╵
I'm using:
Terraform v1.6.5
on darwin_arm64
Hope this helps.
Sorry in advance for jumping in the bandwagon, but I tend to agree that conditional handling experience in terraform is quite bad. I'd say maybe it's time to rethink some design decisions there. I think these type of problems arise because terraform blends two concepts that are antagonic:
Take the example below, without applying dirty hacks, problems like the below happen all the time:
Error: Inconsistent conditional result types
on ../../../main.tf line 27, in locals: 27: pe_subnet = bool_expr > 0 ? azurerm_subnet.subnet1 : azurerm_subnet.subnet2 ├──────────────── │ azurerm_subnet.subnet1 is object with 13 attributes │ azurerm_subnet.subnet2 is object with 14 attributes
The true and false result expressions must have consistent types. The 'true' value includes object attribute "delegation", which is absent in the 'false' value.
I understand that this can prevent certain errors (such as a reference to the missing field), but that's no the right way to prevent a problem. The paradigm mix is.
To make problem worse, there is no simple way to intersect these objects. You have to project one based on the other, just to make it run.
If we want to behave like javascript, then this needs to be a run-time/apply-time error, if applicable. In this case, wouldn't necessarily be so.
As soon as a boolean expression encounters a false evaluation, it should not continue evaluating the rest of the expression. I should be able to write this:
expr = local.value != null && local.value.attribute == "something"
#errors if local.value is null`
instead of this:
expr = local.value != null ? local.value.attribute == "something" ? true : false : false
If there was support for user defined functions, things such as custom map projections or other ways to handle incompatible types becomes a pain staking error prone process. Most of the time (in my experience), current handling of types are a hurdle that does not solve a real problem and introduces others.
I know the mileage may vary and type checking prevent some issues depending on who is writing the code, but the way it is designed needs rethinking as it pulls the carpet on one side to cover the other.
By no means I am unappreciative of all the work put in terraform ecosystem (to me the best IaC framework), but sometimes certain parts of a design don't fit model and I think this is one of those cases. This is the only point I have in favor of some other products that have been showing up and allow the user to use his language of choice for the IaC (which in my opinion is a mistake in itself, like trying to use C++ instead of HTML for the frontend).
@albgus
replication_configuration = {for k,v in { role = aws_iam_role.replication[0].arn rules = [{ [...] }] } : k => v if var.replication }
I tried this, and it doesn't work for me, because there are references to other resources in this conditional block that end up being evaluated and throwing errors. For example:
│ Error: Invalid index
│
│ on ../../main.tf line 129, in module "source_bucket":
│ 129: role = aws_iam_role.source_replication_role[0].arn
│ ├────────────────
│ │ aws_iam_role.source_replication_role is empty tuple
│
│ The given key does not identify an element in this collection value: the collection has no elements.
Terraform Version
Terraform Configuration Files
works fine with