hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/
Other
42.55k stars 9.53k forks source link

module source in parent folder fails #28755

Closed schollii closed 3 years ago

schollii commented 3 years ago

Terraform Version

First seen in 0.14.x, and still present in 0.15.3

Terraform Configuration Files

Generate config files by running this script from a new folder:

mkdir a
mkdir b
mkdir c

cat > a/main.tf <<EOF1
data aws_region "current" {}
EOF1

cat > b/main.tf <<EOF2
module "a" {
source = "../a"
}
EOF2

cat > c/main.tf <<EOF3
module "b" {
  source = "$PWD/b"
}
EOF3

Debug Output

Crash Output

$ terraform init
Initializing modules...
Downloading /saved-deployment-configs/sandbox-vle/creation/demo/b for b...
- b in .terraform/modules/b
- b.a in 
╷
│ Error: Unreadable module directory
│ 
│ Unable to evaluate directory symlink: lstat .terraform/modules/a: no such file or directory
╵

╷
│ Error: Failed to read module directory
│ 
│ Module directory  does not exist or cannot be read.
╵

╷
│ Error: Unreadable module directory
│ 
│ Unable to evaluate directory symlink: lstat .terraform/modules/a: no such file or directory
╵

╷
│ Error: Failed to read module directory
│ 
│ Module directory  does not exist or cannot be read.

Expected Behavior

No crash. The b.a module should have resolved, OR there should be info about how to properly setup.

Actual Behavior

Crash, init failed

Steps to Reproduce

  1. In a new folder, create the bash script described in the config files section above, and run it (eg bash test-tf.sh)
  2. cd c
  3. terraform init

Additional Context

If you replace the module source absolute path in c/main.tf by relative path (source = "../b"), the terraform init is able to resolve b.a:

$ terraform init
Upgrading modules...
- b in ../b
- b.a in ../a
...

However, using a relative path in c/main.tf is not always an option, eg when c is a root module anywhere on a system, whereas the b and a are re-usable modules in some git repo that will be cloned somewhere, could be anywhere relative to c. Also, patching b/main.tf to use ./a instead, and creating a symlink from ./a to ../a is also not an option that is generally applicable, and makes b unusable from CI/CD etc.

Another workaround that is not a good solution is to make b/main.tf use absolute paths. This is not applicable because we usually do not know where modules a and b will be cloned onto a system, and source does not accept configuration via variables.

References

apparentlymart commented 3 years ago

Hi @schollii! Thanks for reporting this.

It seems like what you've countered here is an edge case of how Terraform thinks about "module packages", which is not really a mechanism we've explicitly documented because it's not normally very important, but is unfortunately quite important here.

This notion of a "module package" is a local filesystem tree that contains one or more modules distributed together. The most common example of a "module package" is a git repository, but this concept applies to any module source that doesn't start with ./ or ../ to indicate that it's part of the same package as its caller. The user-visible manifestation of this "module package" idea is Modules in Package Sub-directories, where the special double-slash delimiter marks the boundary between the containing package and the path within the package.

Our provider installer generally assumes (but, for historical reasons, does not enforce) that relative module sources with ./ and ../ will stay within the same "package", and thus Terraform only needs to preserve the directory structure within each package and not the directory relationships between packages.

This is relevant to the situation you've shown here because an absolute path starting with / is not a relative path within a package, and so Terraform is treating it as a separate external package just as if it were a remote Git repository. Just as for remote repositories, Terraform "downloads" (copies) the contents of that directory under .terraform/modules as a local cache of the package, and then expects any relative references inside to stay within the package. However, since your b module includes a relative address starting with ../, it actually leaves the "package" and thus violates Terraform's assumption.

Because Terraform is treating this absolute path in the same way as any other non-local reference, you can influence where Terraform draws the package boundary by using the // separator in the same way as with a Git module:

module "b" {
  source = "/tmp/example//b"
}

With this syntax, Terraform understands that you intend to treat everything in /tmp/example as a single package, and so it will "download" (copy) that entire directory into .terraform/modules as a single unit, and thus the ../a reference from b will successfully resolve.

For a while now we've been wanting to make this "package boundary" idea more explicit so that Terraform can give a better message in cases like these, but we've been unable to do it because we can't be sure that folks aren't relying on unintended results of this edge-case behavior. However, your situation here makes me think of a potential compromise: we could change the error handling in this case so that if installation was going to return an error anyway then we would check if the source address is a relative path which "escapes" its containing package, and if so return a special error message that reports that as the problem, rather than the current message which merely reports the end-effect of the problem.

Using absolute paths within the local filesystem is not a common situation and so the current Module Sources documentation doesn't really discuss it explicitly. We might consider also adding a note to that page to make it clearer that "Local Paths" specifically means relative paths within the same package, which is a distinct idea from an absolute path that happens to be in the local filesystem. I wouldn't generally recommend doing that, but I can see that it could be useful in some less common situations such as when your root module is generated programmatically on the fly by something which relies on the filesystem layout of the current system.

Since Terraform is working as designed here I'm going to relabel this issue as an enhancement so that we can use it to prompt thinking about improvements to the error handling and to the documentation. Changes to the actual behavior (such as redefining "local path" to also include absolute filesystem paths) aren't really on the table here because that would be a potential breaking change for existing users with working configurations.

Thanks again!

schollii commented 3 years ago

Thanks for the detailed explanation, I will give that a shot.

It would probably be sufficient to add a short section about local path in the module docs, explaining how // is interpreted in terms of package and how that plays with relative paths up the tree. I myself looked there and even tried // in a few places, but only in the path to a from b, I didn't think of trying on path to b from c. I guess I wasn't really understanding how it works but your explanation helps a lot.

Would it help if I submitted a PR for a small mod on that documentation page?

apparentlymart commented 3 years ago

Hi @schollii!

Unfortunately since I wrote my comment above I've not looked back at GitHub while I was working on #28781, and so I missed your offer to add some documentation yourself and I've made a small documentation mod as part of that PR, along with (the main point of the PR) adding a better error message for the situation you encountered here.

I didn't want to spend too many words talking about your specific situation in the docs because it doesn't seem to be something that folks encounter frequently, but hopefully this new error message along with the additional doc text will be good enough to hint to anyone who does encounter it that Terraform treats "module packages" in a special way and that an absolute path is a funny example of a module package.

Thanks again for reporting this!

schollii commented 3 years ago

Thanks for the code fix. However, I created a separate issue on the docs you fixed: I find the new docs are overly opinionated because they assume they know all context of use of paths, and they treat relative and absolute paths differently whereas a shell does not (they are just different ways of accessing a resource -- here a module).

apparentlymart commented 3 years ago

Given the existence of your other issue, I think this one is now redundant (fixed by that PR) and so I'm going to close it.

Note that what I documented in that PR was pre-existing behavior that was just not explicitly called out in the docs before. The behavior is long standing (going back at least to Terraform v0.12, and I think even prior to that although with some different details) and I share your concern that it's not a great behavior although clearly we disagree about what behavior would be better. We'll use your other issue to represent the use-case you're concerned about, but as I noted over there we will not be able to change the behavior of any existing syntax in order to meet that use-case, because that would break compatibility for existing deployments relying on external module packages always being installed into .terraform.

github-actions[bot] commented 3 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.