Closed bjf-rga closed 1 year ago
In hindsight, the aws portions of this are entirely irrelevant, but this was the simplest code I had on hand that reproduced the issue.
Hi @bjf-rga! Thanks for reporting this.
This problem was caused by a bug in Terraform v1.5 where its state parsing logic wasn't fully forward-compatible with unknown check types. That bug was fixed in Terraform v1.5.7, and so if you first upgrade everything you have using v1.5.6 to v1.5.7 and then start upgrading to v1.6.0 you should not have this problem.
Unfortunately since prior releases are immutable we cannot retroactively fix the bug in Terraform v1.5.6, but v1.5.7 only has two small changes relative to v1.5.6 (one of which is the fix to this problem) and so upgrading from v1.5.6 to v1.5.7 should not require any changes to any other part of your system.
Hello @apparentlymart! Thanks for the quick response.
Could you clarify the upgrade path here? I'm not having any problems going from 1.x -> 1.6.0. The issue I'm seeing is that once I've created a statefile with 1.6.0, no earlier versions of Terraform can read the state file. So using the TF above, let's say I do an initial apply with 1.5.6, then apply with 1.5.7, then apply with 1.6.0. This all works great.
But as soon as I switch back to a version of TF < 1.6.0, I can no longer initialize because none of the previous versions can parse the data in the check_results section of the state file. I've not tested remote state reads yet, but what I'm seeing suggests that once I've done an apply with 1.6.0, there's no going back. And that would imply broken compatibility within 1.x.
Is this the expected behavior, or am I correct in thinking I should be able to use a version of TF < 1.6.0 to initialize against a stack that's been applied with >= 1.6.0?
Once you have a state snapshot that was created using 1.6.0 (assuming that you have custom validation rules for a variable, which is what causes the incompatibility), you cannot return to Terraform 1.5.6, but you should be able to downgrade to 1.5.7 instead.
As a follow up, I've done some remote state read testing, so I think I understand part of what you're saying. Using 1.5.7, I can do a remote state read of a statefile produced by 1.6.0. But no other version of Terraform can read the 1.6.0 remote state file.
This would mean that for me to update the Terraform version on any stack past 1.5.7, I would need every one of our partners (as defined by "folks reading the state files I'm producing") to be on 1.5.7+. That would prove a nearly insurmountable lift for my organization.
I can understand not being able to move backwards with the version of Terraform managing the stack, but shouldn't I have some guarantee that a statefile produced by a 1.x version of Terraform is remotely readable by other 1.x versions?
Indeed, if you intend to produce state files that can be read by Terraform v1.5.6 then you will need to either:
remove the validation blocks from your input variables in the meantime until all of your consumers are using a version of Terraform that doesn't have this bug (which is v1.5.7), since the recording of the outcomes of those rules is the new element that v1.5.6 cannot handle.
Upgrading from v1.5.6 to v1.5.7 is intended to be a trivial operation; other than this fix, it includes only a security fix for module installation that should not affect any non-malicious modules.
I should also note that although the forward-compatibility problem here was accidental and thus fixed in a patch release for the v1.5 series, our compatibility promises do not guarantee that it will always be possible to roll backwards; the compatibility promises are primarily about upgrading rather than downgrading.
You should be able to upgrade from any v1.x release to any later v1.x release. You might also be able to downgrade to an earlier v1.x release, but that isn't guaranteed: later releases may introduce new features that earlier versions cannot understand, including new storage formats for Terraform state snapshots.
In this case it was unintentional that v1.5.6 did not have sufficient forward-compatibility to read the newer version, and so that has been fixed in v1.5.7, but I'm pointing this out only because I think it's important to be aware of what is and is not promised for new versions in the v1.x series.
I appreciate the quick and thorough responses, although I can't say that I'm overly thrilled. This is going to prove a substantial barrier to being able to move Terraform in our organization beyond 1.5.7 (and, I suspect, for many others). We have stacks that produce statefiles that are read by several consumers, and I cannot exercise control over the versions of Terraform that they're using.
I can understand and appreciate the argument about upgrading and downgrading, but I think this is larger than that -- a rather substantial backwards-compatibility issue has been introduced. The statefile is a contract between producers and consumers. Beginning with 1.6.0, that contract is now broken for consumers using an insufficiently recent version, despite operating within the same major version.
Was a version 2.0.0 considered for this release given that state files are no longer backwards compatible?
Thanks for that context, @bjf-rga.
Since this new kind of check was introduced in support of the new module testing framework (which uses check results in the state as part of the definition of whether tests are passing), I'm going to label this as feedback related to that feature, even though I understand that's not what you are concerned about here, just because I want to make this visible to the folks who were working on that to consider if there are any alternative paths forward that I'm not thinking of, since I wasn't working directly on this change.
Thanks. Again, I appreciate the quick feedback and the attention to the community. I'll see what creative solutions we can conjure to work through and around this. Thanks for raising visibility of this issue.
Thanks for raising this issue. We unfortunately caught this state interoperability issue well after its release. While as Martin says it's not strictly a violation of the 1.x guarantees, I recognize that it is enormously inconvenient for those sharing data using remote state across multiple Terraform versions.
As a result we merged the fix for this in #33815, and we triggered exceptional bug fix releases of 1.3.10, 1.4.7, and 1.5.7. Earlier Terraform minor releases are not affected.
I understand that you have no control over the Terraform versions used by your state consumers, but in case you can exert any influence, these releases give an option for working around this issue with minimal impact:
If it's an option for you, using the tfe_outputs data source instead of the full remote state is generally more robust and unaffected by this bug. In our documentation, we also describe some alternative ways to publish data for external consumers instead of using remote state.
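As a minimal sketch of the tfe_outputs suggestion above, a consumer could read a single published output rather than the whole state snapshot. This assumes the producing workspace lives in HCP Terraform or Terraform Enterprise and that the tfe provider is already configured with credentials; the organization name, workspace name, and the vpc_id output are hypothetical placeholders, not values from this issue.

```hcl
terraform {
  required_providers {
    tfe = {
      source = "hashicorp/tfe"
    }
  }
}

# Hypothetical organization and workspace names, for illustration only.
data "tfe_outputs" "network" {
  organization = "example-org"
  workspace    = "network-prod"
}

locals {
  # "vpc_id" is an assumed output name. The provider treats the whole
  # "values" map as sensitive, so anything derived from it is sensitive too.
  vpc_id = data.tfe_outputs.network.values.vpc_id
}
```

Because only the declared outputs are fetched, the consumer never has to parse the check_results section of the producer's state.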
The only other suggestion I have for avoiding this problem is unfortunately to remove all uses of checks from your configurations: variable validations, preconditions, postconditions, and check blocks. After an apply, this will remove the check results from state.
I hope the patch releases mentioned above are viable for your organization. I'm sorry not to have better news here, and hope that if you find another workaround you'll let us know.
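For reference, the four kinds of checks listed above look roughly like this in a configuration. This is an illustrative sketch only: the variable, resource, and check names are made up, and it assumes Terraform v1.5 or later, since check blocks were introduced in that series.

```hcl
# 1. Variable validation (the kind reported as "var" in the error message).
variable "instance_count" {
  type = number

  validation {
    condition     = var.instance_count > 0
    error_message = "The instance count must be greater than zero."
  }
}

resource "terraform_data" "example" {
  input = var.instance_count

  lifecycle {
    # 2. Precondition on a resource.
    precondition {
      condition     = var.instance_count < 100
      error_message = "The instance count must be less than 100."
    }

    # 3. Postcondition on a resource.
    postcondition {
      condition     = self.output != null
      error_message = "The stored output must not be null."
    }
  }
}

# 4. Standalone check block.
check "instance_count_recorded" {
  assert {
    condition     = terraform_data.example.output > 0
    error_message = "The recorded instance count should be positive."
  }
}
```

Removing these blocks and applying again is what drops the corresponding entries from the state, at the cost of losing the validation itself.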
Just a quick update here, we're aiming to have a workaround in place for this in v1.6.2.
It is important to note that, as highlighted by Alisdair, this issue affects Terraform releases in the 1.3, 1.4 and 1.5 series and has been patched in the latest releases for each of them. We're committing to keeping a workaround in place for the 1.6 series to provide more time to update to the latest patch release for each of the affected releases but we can't maintain this workaround indefinitely, and this error will eventually resurface as new types of custom conditions are introduced to Terraform and the variable condition is reintroduced to the state in the 1.7 series.
I've just merged #34058 into the 1.6 branch, so we should see any interoperability issues fixed in v1.6.2.
I do want to highlight that this change is only temporary. The issue here is a bug in earlier versions of Terraform that has been fixed in the patch versions listed in this comment: https://github.com/hashicorp/terraform/issues/34014#issuecomment-1751683477. We will reintroduce the variable validations into the state file for the 1.7 minor series, so users of the affected minor series (1.3, 1.4, and 1.5) should upgrade to the latest patch release for the relevant series in order to maintain compatibility when 1.7.0 is released.
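One way a consuming configuration might guard against running an unpatched release is a required_version constraint. The sketch below is an assumption about how a team could apply the advice above, not something prescribed in this thread. Constraints in one string are combined with AND, so a single constraint cannot express "1.3.10 or later within 1.3, or 1.4.7 or later within 1.4, or 1.5.7 and later"; this example simply requires v1.5.7 or newer.

```hcl
terraform {
  # Assumption: this configuration reads state written by Terraform v1.6.0
  # or later, so it requires a release that includes the forward-compatibility
  # fix. Teams staying on the 1.3 or 1.4 series would need a different
  # constraint, such as "~> 1.3.10" or "~> 1.4.7".
  required_version = ">= 1.5.7"
}
```

With this in place, an older CLI stops with a version constraint error instead of the less obvious "unsupported checkable object kind" failure.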
Apologies for posting in a closed issue, but I'm trying to understand whether the behavior below is known, and this issue looks relevant from a compatibility point of view. Thanks.
terraform validate for this var definition:
variable "test" {
  type = number

  validation {
    condition     = var.test > 0
    error_message = "error"
  }
}
This is what I get with TF 0.14:
Validation error message must be at least one full English sentence starting
with an uppercase letter and ending with a period or question mark.
TF 0.15:
│ The validation error message must be at least one full sentence starting with an uppercase letter and ending with
│ a period or question mark.
│
│ Your given message will be included as part of a larger Terraform error message, written as English prose. For
│ broadly-shared modules we suggest using a similar writing style so that the overall result will be consistent.
While TF 1.6 does not output any error 😕 The last TF version I get a validation error with is 1.1.9. Other versions (>= 1.2.0) produce no validation error. Is this expected? Have the error_message validation rules been relaxed? If yes, where can I find an announcement on this?
Thank you.
Hi @yermulnik,
The behavior you've described here doesn't seem related to state snapshots and so isn't on topic for this issue.
If you'd like to discuss that behavior and what you have is a question rather than a specific bug report or enhancement request, please start a topic in the community forum about this question. Note that there is no version v0.16 of Terraform -- v0.15 was the last of the pre-v1.0 minor releases -- so I suggest also checking carefully which versions of Terraform you are running.
@apparentlymart Got you. Started a topic: https://discuss.hashicorp.com/t/variable-validation-error-message-requirements-relaxed/59951
Re TF versions, that was just me typing inaccurately. I meant 1.6, not 1.16.
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
Terraform Version
Terraform Configuration Files
Debug Output
n/a
Expected Behavior
State produced by 1.6.0 should be readable by previous versions
Actual Behavior
During init with TF < 1.6.0, the error
Error refreshing state: unsupported checkable object kind "var"
is produced.
Steps to Reproduce
Additional Context
The statefile produced by 1.6.0 has an additional check_results section that does not appear to be processable by versions before 1.6.0. In 1.5.6, this key exists but has a value of null.
References
No response