hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io

terraform get: can't use variable in module source parameter? #1439

Closed amaczuga closed 7 years ago

amaczuga commented 9 years ago

I'm trying to avoid hard-coding module sources; the simplest approach would be:

variable "foo_module_source" {
  default = "github.com/thisisme/terraform-foo-module"
}

module "foo" {
  source = "${var.foo_module_source}"
}

The result I get while attempting to run terraform get -update is

Error loading Terraform: Error downloading modules: error downloading module 'file:///home/thisisme/terraform-env/${var.foo_module_source}': source path error: stat /home/thisisme/terraform-env/${var.foo_module_source}: no such file or directory

radeksimko commented 9 years ago

I'm trying to avoid hard-coding module sources

Is there any particular reason behind that? Do you expect some modules to have the same interface, so you can swap these?

mitchellh commented 9 years ago

This is as intended. We should add validation that this isn't allowed.

The reason is simply that it breaks our compile -> semantic check -> execute loop. i.e. imagine if your C code could arbitrarily download new C files during compile/execution. Would be weird.

amaczuga commented 9 years ago

Do you expect some modules to have the same interface

Yes, that is exactly my point: being able to flexibly run plans against various versions/forks of identically interfaced modules, without refactoring the base Terraform code.

amaczuga commented 9 years ago

This is as intended.

Er. Forgive me - I'm lost here because of the labels: this is marked bug, yet your comment suggests a wontfix.

radeksimko commented 9 years ago

marked bug, yet your comment suggests a wontfix

The fix is to add the validation so you get something a bit more clear rather than "error downloading module" I guess.

clstokes commented 9 years ago

FWIW, this is something I wanted to do as well and found wasn't supported. In my case, I wanted to avoid duplicating git::ssh://git@github.com/... across tens or hundreds of files and do something like source = "${var.module_path}//modules/common-vpc".

clstokes commented 9 years ago

Also to set the branch/tag via a variable would be helpful...

radeksimko commented 9 years ago

Also to set the branch/tag via a variable would be helpful...

See https://github.com/hashicorp/terraform/issues/1145

clstokes commented 9 years ago

@radeksimko I'm familiar with ref as added in a recent version, but I'm suggesting something like source = "github.com/clstokes/terraform-modules//modules/common-vpc?ref=${var.module_branch}".

Are variables allowed at all in modules sources?

mitchellh commented 9 years ago

@clstokes They're not yet

ketzacoatl commented 9 years ago

+1 on this. I want admins and automated-ci to be able to specify the local path, allow flexibility to pull from git or filesystem, etc, but this is not possible without allowing interpolation in the source param.

mitchellh commented 9 years ago

This is not a bad idea but it is very hard to do with the current architecture of how modules work with Terraform. It also shifts a lot of potential errors away from a compile-time error to a runtime error, which we've wanted to avoid. I'm going to keep this tagged with "thinking"

ketzacoatl commented 9 years ago

@mitchellh, how are compile-time and runtime differentiated in Terraform? Are you referring to tf plan vs tf apply? If so, I would like to share that, as a user, I have never built confidence that tf apply will succeed just because tf plan succeeded.

Said another way, TF as it is right now gives me a lot of compile time and runtime errors. For example, you can easily tell TF to create an SSH key that seems fine with tf plan but errors out with tf apply. Either way, my vote for unblocking this capability (understanding it isn't simple, given current architecture) stems from wanting the ability (as a user) to choose whether or not a variable in the module source is a good decision for my code. Thanks for listening :)

pikeas commented 9 years ago

+1, I understand why this may be architecturally tricky to get right, but it would be great to have on the admin/DRY side of things.

kaelumania commented 9 years ago

+1 I also think that the gained flexibility would outweigh the disadvantages.

kokovoj commented 9 years ago

:+1:

The use case for this would be the flexibility to store the module source in a variable for:

a. module source pointing at a corporate source control behind a corporate VPN

variable "your_project_source" {
  default = "https://your_src_system/your_project//terraform"
}

OR b. use a local path on the dev box (after that src was already checked out locally, so don't need to be on the corporate VPN)

variable "your_project_source" {
  default = "/Users/joeshmoe/projects/your_project/terraform"
}

(and overriding one or the other in terraform.tfvars) and then

module "your_project" {
  source = "${var.your_project_source}"
  ...
}

apparentlymart commented 9 years ago

One very specific complexity with this is that currently modules need to be pre-fetched using terraform get prior to terraform plan, and currently that command does not take any arguments that would allow you to set variables. By the time plan is running, Terraform is just thinking about the module name and paying no attention to the module source, since the module is assumed to already be retrieved into the .terraform subdirectory.

Perhaps in some cases this could be worked around by breaking a configuration into two separate runs, with an initial run creating a remote state that can be consumed by the second run. Since terraform_remote_state is just a regular resource its configuration arguments can be interpolated, even by things that aren't known until apply time, as long as a dependency cycle doesn't result.

This is of course not as convenient as creating everything in one step using directly-referenced modules, but maybe it's a reasonable workaround for some situations in the meantime.
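
As a rough sketch of that two-run idea, using the resource form of terraform_remote_state from the Terraform releases of this period (newer versions expose it as a data source instead). The bucket, key, region, variable, and output names are all made up for illustration, and the sketch assumes the first run stores its state in S3 and publishes a vpc_id output.

```hcl
# Second run: read the state written by the first run.
# Hypothetical names throughout; assumes an S3 remote state location.
variable "state_bucket" {}

resource "terraform_remote_state" "network" {
  backend = "s3"

  config {
    # Interpolation is accepted here, unlike in a module's source argument.
    bucket = "${var.state_bucket}"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = "${terraform_remote_state.network.output.vpc_id}"
}
```

The second configuration depends only on the outputs of the first, not on its module source, so whichever variant was applied in the first run can be swapped without touching this file.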

apparentlymart commented 9 years ago

@kokovoj 's use-case, of switching to a different version in a development environment, got me thinking about how that gets solved in other languages.

When I have a problem like that in e.g. Go, NodeJS or Python I don't use any runtime features to solve it, but rather I just ignore the location/version of the module given in the dependency list and just install whatever one I want, exploiting the fact that (just like in Terraform) the "get" step is separated from the "compile" and "run" steps, and so we can do manual steps in between to arrange for the versions we want.

Terraform obscures this ability a little by storing the local modules in a directory named after the MD5 hash of the module name under the .terraform directory, so it's harder to recognize which one is which by eye... but you can, if you locate the right one, install it from a different source or modify it in-place. (I've done this several times while debugging, in fact.)

So with all of this said, perhaps Terraform could just be a little more transparent about where it looks for modules and embrace the idea that terraform get just installs the default module locations, but it's fine to manually install from other locations, or even to write your own separate tool to install from wherever you want. If we went this route, the only thing that would need to change in Terraform is to switch to a more user-friendly on-disk module representation and to commit not to change it in future versions of Terraform.

(It would also be nice to extend terraform get to be able to handle certain overrides itself, but that is made more complex by the fact that there can be nested modules that have their own dependencies, and so such syntax would probably end up quite complicated if it had to happen entirely on the command line.)

blalor commented 8 years ago

This is definitely something I'd like to see implemented. I'd be using it to solve the issue of where to pull a module's source when running in local dev or CI environments.

orclev commented 8 years ago

I'm also looking for a good solution to this. It seems like at the very least terraform get would need to be changed to support providing variables and to look in the conventional places for the variable settings.

Is the hash that modules get stored under a hash of the source attribute? If so it seems like so long as you hashed it after performing the interpolation everything would still be kosher. You would have had to run terraform get again if you modify any of the variables, so it isn't possible to self-modify included modules during a particular run of apply because the module won't exist yet. If you terraform apply with a different set of variables than you supplied for terraform get then it likewise would fail since the module wouldn't exist yet under the appropriate hash.

mmell commented 8 years ago

:+1: for @clstokes use case: https://github.com/hashicorp/terraform/issues/1439#issuecomment-93838774

esomore commented 8 years ago

:+1: for @kokovoj

jonapich commented 8 years ago

What threw me off here is the ${path.module} interpolation. I have a hierarchy like this:

|- ./dev  <--- this is my cwd
|- ./modules
|- ./modules/specific_module
|- ./modules/generic_module

I am using nested modules, so I thought I would use source = "${path.module}/../another_module" when loading the nested one and ran into the "source cannot use interpolation" error.

Turns out the interpolation is not necessary in this case! (wat?)

It is very surprising given the following documentation in the "How to create a module" "Paths and Embedded Files" section:

(...) since paths in Terraform are generally relative to the working directory that Terraform was executed from

In the above, we use ${path.module} to get a module-relative path. This is usually what you'll want in any case.

The section that follows, "Nested Modules", doesn't mention that loading a module from a module uses the module's path as the cwd as an exception.
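
To make that behaviour concrete, here is a minimal sketch based on the layout above. It assumes the nested call lives in modules/specific_module/main.tf and that generic_module ships a hypothetical file named templates/example.txt; the module and output names are illustrative only.

```hcl
# modules/specific_module/main.tf
module "generic" {
  # A relative source is resolved against the directory of the module that
  # contains this block, so no interpolation is needed (or accepted) here.
  source = "../generic_module"
}

# modules/generic_module/main.tf
output "example" {
  # File paths, by contrast, are relative to the working directory unless
  # path.module is used, which is what the quoted documentation describes.
  value = "${file("${path.module}/templates/example.txt")}"
}
```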

franklinwise commented 8 years ago

+1 - It would be very helpful to be able to use a variable for the source, particularly targeting different branches.

johnrengelman commented 8 years ago

Another use case is providing credentials for accessing a private repository over HTTPS. Each user could then set their own username and password.

robcoward commented 8 years ago

As @johnrengelman mentioned, the use of private sources requiring user credentials is just not realistic if we are forced to hard-code the credentials. We have to be allowed to pass in credentials to private github repos etc via some mechanism, variable interpolation or whatever, especially if you are trying to use Atlas with Terraform, pulling config from a GitHub repo in the first place.

ketzacoatl commented 8 years ago

The standard/best practice for accessing private git repos is (in my experience) SSH with public/private keys, not HTTPS with usernames/passwords.

cloudynetwork commented 8 years ago

@ketzacoatl For other use cases that is a reasonable way to access private repos. However, we are having to work around this lack of ability to authenticate securely via HTTPS by having external scripting download the private repos for us and then referencing the modules locally... all a bit messy. I don't quite understand the resistance to being able to specify the credentials as a variable... Clearly people are in need of this.

robcoward commented 8 years ago

And how would you propose using SSH and public/private keys to configure Terraform to access modules in a private GitHub repo when running under Atlas? Seriously, the only rational thing to do if running Terraform through Atlas is to either allow variables to be used in module sources, or provide an alternative format for authenticating against private repos. You can't expect people to hard-code user credentials into a file that is checked into any form of repository, public or private.

ocxo commented 8 years ago

We created an outside collaborator in github that we granted read only rights to the remote modules it needs. This seems to be the least horrible way we've found to do this which doesn't compromise security of our repos too much.

ketzacoatl commented 8 years ago

@robcoward, another option would be for Atlas to add support for this use case as well.

cloudynetwork commented 8 years ago

@fromonesrc However this still doesn't get around the need to actually pass a credential in. Whatever way we look at this, passing a credential (without having to hardcode it into the file) is a very basic and rather essential function.

Imagine if terraform didn't allow the passing of provider credentials such as AWS secret keys? It's logical to pass those ...so why is it not just as logical to be able to pass in the HTTPS credentials for github?

This feature needs to be added already; there is no reason not to.

ocxo commented 8 years ago

Yep. I'm providing a workaround, not defending the current design.

brannn commented 8 years ago

What's the current Hashicorp position on this one?

Being able to dynamically call modules would be a breath of fresh air.

Consider a nested Consul module that is used to manage keys in multiple environments. Passing a Jenkins parameter that determines the module source during a 'terraform apply' build step would be a common scenario.

Having the same kind of interpolation freedom we get with providers would be very powerful in our situation.

hmcgonig commented 8 years ago

@heavyliftco I have a similar scenario to the one you listed. Being able to pass the source in as a variable (ultimately coming from the jenkins parameter) would make this feature so much more robust. +1 for this idea.

tchemineau commented 8 years ago

+1 for this idea

proffalken commented 8 years ago

My use-case is subtly different and yet I hope will lend some more support to this idea.

I am writing a project, to be open-sourced in the near future, that will use Terraform to provision the infrastructure. However, I need to allow the end user to choose their cloud provider.

I've tried to do this:

variable "cloud_provider" {
  default = "aws"
}
module "cloud_provider" {
  source = "./${var.cloud_provider}"
}

However I get the error above.

The idea is that simply through running export TF_VAR_cloud_provider the appropriate module could then be included for AWS, Azure, DigitalOcean, GCE etc.

bam0382 commented 8 years ago

In my case, I'm looking to reference the file <module>/policies/role.json from a git-based module, but after doing a terraform get, the module path under .terraform/ is a bunch of numbers, so I can't reference it in the config.

module "iam" {
    source                        = "git::https://github.com/<org>/<iam_module>"
    ecs_role_file                 = "${path.module}/policies/role.json"

I get the following error:

Errors:

  * file: open <terraform_root>/policies/role.json: no such file or directory in:

${file(var.ecs_role_file)}

I want to be able to reference <terraform_root>/.terraform/<module_hash>/policies/role.json. Is there a variable that will do this?

johnrengelman commented 8 years ago

@bam0382 that's not really related to this issue, but have you tried passing:

ecs_role_file = "policies/role.json"

?
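
One way this is often handled, sketched here with hypothetical names: keep the path.module lookup inside the module itself, since ${path.module} written in the calling configuration refers to the caller's directory (hence the <terraform_root> in the error above) rather than to the downloaded module. The resource and default below are assumptions, not taken from the module in question.

```hcl
# Inside the iam module itself, e.g. its main.tf (hypothetical contents)
variable "ecs_role_file" {
  # Path relative to the module's own directory.
  default = "policies/role.json"
}

resource "aws_iam_role" "ecs" {
  name = "ecs-role"

  # path.module is evaluated inside the module, so it points at the module's
  # directory under .terraform/ regardless of how that directory is named.
  assume_role_policy = "${file("${path.module}/${var.ecs_role_file}")}"
}
```

With the module written this way, the caller passes only the relative path (or nothing at all) and never needs to know the hashed directory name.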

iroller commented 8 years ago

We also need to access private modules but do not want to hardcode password for the repo.

xuwang commented 8 years ago

There are use cases for dynamic module source and different auth methods. Before we have an "all-in-terraform" solution, this is what we do in a production ci/cd for now:

  1. pull terraform in build space
  2. pull modules from module repos with proper get and auth methods to a local module stage space.
  3. in terraform, module sources are always "hard coded" to the predefined module stage space, for example source="../../modules/foo/bar"
  4. terraform get and apply

Sure the modules are double stored during the build, but it gives us more control on how and where to get the modules based on the build jobs and leave terraform alone to do what it does the best.
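
In Terraform terms, step 3 of that workflow might look like the sketch below. The staging path is the one from the example above; the module name is hypothetical, and the fetching in steps 1 and 2 is assumed to be done by the CI job before Terraform runs.

```hcl
# main.tf
# The CI job has already placed the desired version of the module in
# ../../modules/foo/bar, using whatever fetch and auth method it prefers.
module "foo_bar" {
  # Fixed local path: no interpolation is needed in source.
  source = "../../modules/foo/bar"
}
```

Because the path never changes, which branch or fork ends up in the staging area is decided entirely outside Terraform.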

ketzacoatl commented 8 years ago

I've been doing more or less the same (relative paths with ../../tf-modules) for the last year+

This allows CI and developers to auth with git in the ways that are easy for them, and de-coupled from Terraform, as well as manage revisions/branches, and to hack on the modules as needed.. without disrupting TF or requiring complicated work-arounds.

benjah1 commented 8 years ago

+1 for dynamic module loading. This would bring huge flexibility to Terraform.

maxenglander commented 8 years ago

I am very interested in this feature as well.

The idea of parameterizing the terraform get stage, as articulated by @orclev, and requiring a re-run of terraform get whenever the module source field changes (i.e., the MD5 hash no longer matches the value of .terraform/modules/<md5>), seems to me like it would be adequate for all of the use cases discussed so far.

@apparentlymart I think your comparison to how this is handled in Go, NodeJS, etc. is useful. In NodeJS for example, I can reference a package in package.json from origin, a remote clone, or a local clone. I might even reference an API-compatible package with a very different implementation (e.g. React vs Preact). In this view, terraform get is analogous to npm install, both of which occur prior to runtime, the difference being that Terraform also does static analysis on the dependencies it downloads.

If Terraform were able to treat a change in the value of a module's source the same way it treats the presence of a newly defined module (i.e., requires user to re-run terraform get), then I believe the concern voiced by @mitchellh about shifting compile errors to the runtime would be neatly handled. If I'm understanding Mitchell's concern correctly (sorry if I'm not), this approach would treat source variables similar to C macros evaluated with #ifdef in which the interpolation and evaluation occurs prior to compilation during a pre-processing stage. Granted, in Terraform, there would have to be limits on what could be interpolated: probably only user defined variables could be allowed; resource attributes would have to be forbidden (I believe this is the same as the situation with the count parameter for resources).

I believe this solution could also be adequate for the credentials use case, articulated by @johnrengelman and others. I think the best way for Terraform to handle this would be to exclude the user's credentials from the module path, so, given a source = "${var.credentials}:${var.uri}", the credentials would be validated only during terraform get, and only the value of ${var.uri} would be evaluated when forming the MD5 hash of the module downloaded to .terraform/modules.

I believe this solution could also be adequate for the use case articulated by @kokovoj where a user wants to refer to a local module that may have drifted from origin because it is under development. I think in this case, requiring the user to re-run terraform get in order to fetch and store the local version isn't a huge hassle and, in theory, it only needs to be done once (unless the user changes the path of the local module, likely a rare situation). It might be a minor waste of disk space to have one locally downloaded module in .terraform/modules/<md5> that represents the origin module, and a nearly identical one representing the local dev version, but otherwise I don't see a drawback to allowing this duplication.

Finally, I think this solution would be adequate to the "driver" use case articulated by @proffalken. I don't think there's any difference from the viewpoint of Terraform between this use case and the local-vs-remote path use case.

Would be interested in hearing from others whether the approach of "parameterizing the terraform get stage" would address their use case, and from the HashiCorp team about whether this sounds like a feasible/reasonable approach. As a side note, I'd be very interested in taking a stab at this :).

esword commented 8 years ago

Adding a slightly different use case, similar to the credentials use case - we mirror our shared modules from a private GHE server into an external git repo. Initial testing and even plans are run from machines that have access to the internal GHE server, but the actual "apply" is run from systems that do not have access to our internal network and must access the modules from the external git repo.

blaltarriba commented 8 years ago

+1 for this feature.

I'm starting to use Terraform, and once I have dozens of projects using modules, if at some point I decide to move a module to a new source, I will have to change the source field in each project.

It would be great to be able to set the source field with interpolation syntax.

igormoochnick commented 8 years ago

+100

I'm at exactly the point that @blaltarriba mentioned: our Terraform structure has reached a massive size, and I've started to move modules around into different locations and have to support versions for the nested modules.

I need a way to split the module source path into 2 or 3 parts:

  1. "modules root" location - this is optional but I wanted to use it to control the location of the local or git-based modules
  2. "module id" - name of the module. This may be combined with #1
  3. "module version" - this is a MUST. Nested module versions have to be different for different terraform runs

ketzacoatl commented 8 years ago

FWIW, @igormoochnick, I recommend simplifying. As an example: I have multiple environments in multiple accounts, all of which progress at their own pace, but still source from the same code base. I've found it's often easier to maintain multiple checkouts of the repos on disk, instead of a) using the git syntax for sourcing modules, and b) having some sort of versioning or other use of variables.

brikis98 commented 7 years ago

@xuwang's workaround is awesome. The fact that it works means there is nothing inherently wrong with dynamically specifying the path to a module, and since it can be quite useful for a variety of use cases (e.g. fast iteration while developing a module locally, controlling module version numbers in a single, central place, etc), there should be a way to do this within Terraform itself.

Perhaps Terraform should allow interpolation in the source parameter in a "limited" way, similar to the count parameter's limitations. Currently, you can use interpolation in the count parameter, but that interpolation is limited to data that can be resolved locally. For example, you could set count = "${var.some_variable}" but you could not set count = "${data.some_dynamic_data_source.value}". It seems like local variables could be processed during terraform get and could handle most of the use cases people are asking for in this issue.
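
For reference, the count behaviour described above looks roughly like this. The resource type, AMI ID, and variable name are placeholders chosen for illustration.

```hcl
variable "web_count" {
  default = 2
}

resource "aws_instance" "web" {
  # Accepted: the value is known from variables alone, before anything is created.
  count = "${var.web_count}"

  # Rejected at this point: a value that is only known after reading remote data,
  # e.g. count = "${data.some_dynamic_data_source.value}"

  ami           = "ami-00000000" # placeholder
  instance_type = "t2.micro"
}
```

A source parameter restricted in the same way would accept only values of the first kind.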

xsellier commented 7 years ago

I'm working on a micro-services architecture. Each service has its own Terraform recipe. Since I'm using the standard git-flow, I use the develop branch to deploy to develop, and master for production. Each service has its own git repository, and because it's a micro-service architecture, I have another git repository that specifies which services to deploy. This repository is generic, meaning it aggregates the Terraform recipes and deploys them. But it has to deal with branches (due to git-flow), and because I don't want to hard-code everything, I need this feature!

module "gitlabrunner" {
  source = "git::ssh://git@gitlab.hibernum.net/service.git//deployment?ref=${var.branch}"

  environment = "${var.environment}"
  availability_zones = "${var.availability_zones}"
  cluster = "${var.cluster}"
}

I have a workaround:

# Terraform file
module "gitlabrunner" {
  source = "git::https://gitlab-ci-token:{CI_BUILD_TOKEN}@gitlab.hibernum.net/service.git//deployment?ref={BRANCH}"

  environment = "${var.environment}"
  availability_zones = "${var.availability_zones}"
  cluster = "${var.cluster}"
}

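# Shell script: fill in the placeholders. "|" is used as the sed delimiter
# so that branch names containing "/" (e.g. feature/foo) still work.
sed -e "s|{CI_BUILD_TOKEN}|${CI_BUILD_TOKEN}|g" -e "s|{BRANCH}|${BRANCH}|g" main.tmpl > main.tf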