hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

.terraform.lock.hcl issues with cross-architecture workflows #31435

Open alexjurkiewicz opened 2 years ago

alexjurkiewicz commented 2 years ago

Terraform Version

1.2.4

Description

When running terraform with a lockfile generated on one arch (darwin_amd64), terraform cannot successfully init & plan on another arch (linux_amd64) without modifying that lockfile.

A summary of the expected working workflow:

  1. Write Terraform configuration on Mac
  2. Run terraform init to generate .terraform.lock.hcl
  3. Push code to github
  4. CI on Linux runs terraform init -lockfile=readonly and doesn't attempt to modify the file
  5. CI on Linux runs terraform plan successfully

The last two steps currently both fail. Full replication transcript:

Base environment has no global settings

```
~/temp ❯ env | egrep '^(TF|TERRAFORM)'
~/temp ❯ mv ~/.terraformrc ~/.terraform.bak
~/temp ❯ ls ~/.terraform.d
checkpoint_cache  checkpoint_signature  credentials.tfrc.json
~/temp ❯ ls -a
.  ..  main.tf
```

Steps 1 & 2 (write configuration & init on Mac)

```
~/temp ❯ cat main.tf
provider "aws" {
  region = "us-east-1"
}
~/temp ❯ terraform init

Initializing the backend...

Initializing provider plugins...
- Finding latest version of hashicorp/aws...
- Installing hashicorp/aws v4.22.0...
- Installed hashicorp/aws v4.22.0 (signed by HashiCorp)

Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
~/temp ❯ cat .terraform.lock.hcl
# This file is maintained automatically by "terraform init".
# Manual edits may be lost in future updates.

provider "registry.terraform.io/hashicorp/aws" {
  version = "4.22.0"
  hashes = [
    "h1:fmPkEDTodRW9XE0dqpTzBFUtfB3nYurbwzKy//8N93o=",
    "zh:299efb8ba733b7742f0ef1c5c5467819e0c7bf46264f5f36ba6b6674304a5244",
    "zh:4db198a41d248491204d4ca644662c32f748177d5cbe01f3c7adbb957d4d77f0",
    "zh:62ebc2b05b25eafecb1a75f19d6fc5551faf521ada9df9e5682440d927f642e1",
    "zh:636b590840095b4f817c176034cf649f543c0ce514dc051d6d0994f0a05c53ef",
    "zh:8594bd8d442288873eee56c0b4535cbdf02cacfcf8f6ddcf8cd5f45bb1d3bc80",
    "zh:8e18a370949799f20ba967eec07a84aaedf95b3ee5006fe5af6eae13fbf39dc3",
    "zh:9b12af85486a96aedd8d7984b0ff811a4b42e3d88dad1a3fb4c0b580d04fa425",
    "zh:aa968514231e404fb53311d8eae2e8b6bde1fdad1f4dd5a592ab93d9cbf11af4",
    "zh:af8e5c48bf36d4fff1a6fca760d5b85f14d657cbdf95e9cd5e898c68104bad31",
    "zh:d8a75ba36bf8b6f2e49be5682f48eccb6c667a4484afd676ae347213ae208622",
    "zh:dd7c419674a47e587dabe98b150a8f1f7e31c248c68e8bf5e9ca0a400b5e2c4e",
    "zh:fdeb6314a2ce97489bbbece59511f78306955e8a23b02cbd1485bd04185a3673",
  ]
}
~/temp ❯ rm -rf .terraform
```

Steps 4 & 5 (run configuration on Linux)

```
~/temp ❯ docker run -i -t -v $(pwd):/mnt -w /mnt hashicorp/terraform:1.2.4 init -lockfile=readonly

Initializing the backend...

Initializing provider plugins...
- Reusing previous version of hashicorp/aws from the dependency lock file
- Installing hashicorp/aws v4.22.0...
- Installed hashicorp/aws v4.22.0 (signed by HashiCorp)

╷
│ Warning: Provider lock file not updated
│
│ Changes to the provider selections were detected, but not saved in the .terraform.lock.hcl file. To record these selections,
│ run "terraform init" without the "-lockfile=readonly" flag.
╵

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
~/temp ❯ docker run -i -t -v $(pwd):/mnt -w /mnt hashicorp/terraform:1.2.4 plan
╷
│ Error: Required plugins are not installed
│
│ The installed provider plugins are not consistent with the packages selected in the dependency lock file:
│   - registry.terraform.io/hashicorp/aws: the cached package for registry.terraform.io/hashicorp/aws 4.22.0 (in .terraform/providers) does not match any of the checksums recorded in the dependency lock file
│
│ Terraform uses external plugins to integrate with a variety of different infrastructure services. To download the plugins
│ required for this configuration, run:
│   terraform init
╵
```

Changes Terraform wants to make to the lockfile on Linux

```
~/temp ❯ rm -rf .terraform
~/temp ❯ docker run -i -t -v $(pwd):/mnt -w /mnt hashicorp/terraform:1.2.4 init

Initializing the backend...

Initializing provider plugins...
- Reusing previous version of hashicorp/aws from the dependency lock file
- Installing hashicorp/aws v4.22.0...
- Installed hashicorp/aws v4.22.0 (signed by HashiCorp)

Terraform has made some changes to the provider dependency selections recorded
in the .terraform.lock.hcl file. Review those changes and commit them to your
version control system if they represent changes you intended to make.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
~/temp ❯ cat .terraform.lock.hcl
# This file is maintained automatically by "terraform init".
# Manual edits may be lost in future updates.

provider "registry.terraform.io/hashicorp/aws" {
  version = "4.22.0"
  hashes = [
    "h1:RxPzK6VFHz6qZMZUVhE03j9Cf5CvnLr14egtq5yxD1E=",
    "h1:fmPkEDTodRW9XE0dqpTzBFUtfB3nYurbwzKy//8N93o=",
    "zh:299efb8ba733b7742f0ef1c5c5467819e0c7bf46264f5f36ba6b6674304a5244",
    "zh:4db198a41d248491204d4ca644662c32f748177d5cbe01f3c7adbb957d4d77f0",
    "zh:62ebc2b05b25eafecb1a75f19d6fc5551faf521ada9df9e5682440d927f642e1",
    "zh:636b590840095b4f817c176034cf649f543c0ce514dc051d6d0994f0a05c53ef",
    "zh:8594bd8d442288873eee56c0b4535cbdf02cacfcf8f6ddcf8cd5f45bb1d3bc80",
    "zh:8e18a370949799f20ba967eec07a84aaedf95b3ee5006fe5af6eae13fbf39dc3",
    "zh:9b12af85486a96aedd8d7984b0ff811a4b42e3d88dad1a3fb4c0b580d04fa425",
    "zh:aa968514231e404fb53311d8eae2e8b6bde1fdad1f4dd5a592ab93d9cbf11af4",
    "zh:af8e5c48bf36d4fff1a6fca760d5b85f14d657cbdf95e9cd5e898c68104bad31",
    "zh:d8a75ba36bf8b6f2e49be5682f48eccb6c667a4484afd676ae347213ae208622",
    "zh:dd7c419674a47e587dabe98b150a8f1f7e31c248c68e8bf5e9ca0a400b5e2c4e",
    "zh:fdeb6314a2ce97489bbbece59511f78306955e8a23b02cbd1485bd04185a3673",
  ]
}
~/temp ❯ diff .terraform.lock.hcl .terraform.lock.hcl.orig
7d6
<     "h1:RxPzK6VFHz6qZMZUVhE03j9Cf5CvnLr14egtq5yxD1E=",
```

Additional Context

I shared this transcript initially with Martin Atkins on Slack, who said:

Can you share that in a GitHub issue so the team can debug it? If you do, it would be helpful to see what the lock file contained after each init, too. The second step where the read-only mode succeeded even though the lock file was insufficient is definitely a bug, like I was saying above. The checksums not being generated correctly in the first place also seems like a bug, but I'm not really sure what's going on there since I exercise that part (just a normal init, not read only) many times each day and haven't seen it do this, so I'm curious to see whether the second step is somehow the one messing it up (but I can't yet imagine how).

apparentlymart commented 2 years ago

Thanks for sharing this, @alexjurkiewicz!

At first look I can at least confirm what I was saying in the Slack conversation: Terraform did successfully record all of the signed checksums from the registry for this provider in the lock file on your first run, because otherwise those 12 zh: checksums would not be there.

Based only on the output and my mental model of the behavior, I think what's happened here is that when running inside the Docker container:

This annoyance of the registry protocol giving us legacy hashes that we can't use for anything except the registry is a well-known issue at this point, covered by #29958. This is a different way to get into that situation, but the requirement is the same: we need to somehow get the equivalent h1: hash into the lock file, which we can achieve either by letting terraform init update the lock file itself (run it without -lockfile=readonly) or by explicitly running terraform providers lock to precalculate all of the needed h1: checksums.
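A quick way to spot a lock file in this state is to count its `h1:` entries. In the single-provider transcript above, the file generated on the Mac contains one `h1:` hash, and the file updated on Linux contains two. The following is a minimal sketch that assumes the lock file layout shown in this issue; the function name `count_h1_hashes` is made up for illustration and is not part of Terraform.

```shell
# Hypothetical helper, not part of Terraform: count the "h1:" checksums
# recorded in a dependency lock file (default: .terraform.lock.hcl).
# Note that the count covers the whole file, so a configuration with
# several providers will report the sum across all of them.
count_h1_hashes() {
  lockfile="${1:-.terraform.lock.hcl}"
  # grep -c prints the number of matching lines; each hash sits on its own line.
  # grep exits non-zero when nothing matches, so "|| true" keeps the status at 0.
  grep -c '"h1:' "$lockfile" || true
}
```

For example, `count_h1_hashes .terraform.lock.hcl` run against the two lock files in the transcript would distinguish the Mac-only file from the one that also carries the Linux hash.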

However, I think this issue does represent a new bug, separate from #29958: terraform init -lockfile=readonly should not indicate success if updating the lock file is required in order to make the working directory functional. Instead, I think terraform init -lockfile=readonly should've treated this in a similar manner to what happens if the configuration needs an entirely new provider that wasn't previously locked: to generate an error announcing that the lock file is not complete enough and so Terraform cannot proceed without updating it.

Error: Provider dependency changes detected

Changes to the required provider dependencies were detected,
but the lock file is read-only. To use and record these requirements,
run "terraform init" without the "-lockfile=readonly" flag.

The current condition that decides whether this is an error or a warning is whether the set of required provider addresses is identical before and after:

https://github.com/hashicorp/terraform/blob/f30738d96569d54f476214d3a6f0a9bc0d00c3e8/internal/command/init.go#L856-L875

Perhaps a more robust definition is that terraform init should perform exactly the same logic that other commands use to verify that the lock file is consistent with the .terraform/providers cache directory, and if it detects an inconsistency then that means that it isn't possible to initialize with -lockfile=readonly and so we should report this as an initialization error.
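Until `terraform init` reports this case as an error itself, a CI job could approximate that behavior by treating the warning from the transcript above as fatal. The sketch below is a workaround under stated assumptions, not Terraform's own logic: the function name `init_readonly_strict` is hypothetical, and it relies on matching the warning text `Provider lock file not updated` as printed by Terraform 1.2.4, which may differ in other versions.

```shell
# Hypothetical CI wrapper, not part of Terraform: fail when
# "terraform init -lockfile=readonly" succeeds but warns that the
# lock file would have needed an update.
init_readonly_strict() {
  status=0
  # Capture stdout and stderr together; -no-color keeps the text easy to match.
  output="$(terraform init -lockfile=readonly -no-color 2>&1)" || status=$?
  printf '%s\n' "$output"
  if [ "$status" -ne 0 ]; then
    return "$status"
  fi
  # Assumption: this is the warning text shown in the transcript above.
  case "$output" in
    *'Provider lock file not updated'*)
      echo 'init succeeded, but the lock file is incomplete for this platform' >&2
      return 1
      ;;
  esac
  return 0
}
```

A CI step would then call `init_readonly_strict && terraform plan`, so that the job stops at init rather than failing later during plan.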

(I'm focusing only on the error handling behavior here because #29958 already covers the broader problem with the registry protocol. I'm guessing that we can change terraform init to give better feedback about this information relatively easily, whereas changing the registry protocol requires design coordination with various teams that own implementations of the protocol, including the public registry.)

veganbeef commented 1 month ago

A quick summary of the recommended workaround for anyone attempting to generate or update lock files on macOS and then init using -lockfile=readonly in a GitHub CI workflow:

On your Mac, after running terraform init, run terraform providers lock -platform=linux_amd64 -platform=linux_arm64 -platform=darwin_amd64 -platform=darwin_arm64 to update the .terraform.lock.hcl file to include hashes for GitHub CI and for your local machine.

Then the plan and apply commands will work on all of the listed non-Windows platforms, and your dependency lock file can serve its purpose of ensuring only developer-approved packages are ever installed, even if some remote server is compromised.
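
The workaround could be wrapped in a small script so that nobody has to remember the platform list. This is only a sketch for POSIX sh or bash: the function name `lock_all_platforms` and the `PLATFORMS` variable are invented here, and the platform list is the one from the comment above.

```shell
# Platforms from the comment above; adjust to match your team and CI runners.
PLATFORMS="linux_amd64 linux_arm64 darwin_amd64 darwin_arm64"

# Hypothetical helper, not part of Terraform: record hashes for every listed
# platform in .terraform.lock.hcl by calling "terraform providers lock".
lock_all_platforms() {
  set --                                # start with an empty argument list
  for platform in $PLATFORMS; do
    set -- "$@" "-platform=$platform"   # append one -platform flag per entry
  done
  terraform providers lock "$@"
}
```

Run `lock_all_platforms` after `terraform init` on the Mac and commit the updated lock file; CI can then use `terraform init -lockfile=readonly` without needing to add hashes.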