Configurable installation strategy for external modules, similar to what we have for providers

jamengual commented 2 years ago

Current Terraform Version

All

Use-cases

Make registry.terraform.io a configurable parameter instead of a constant to be able to use a module/submodule internally hosted registry.

When using a module like so :

module "alb" {
  source = "cloudposse/alb/aws"
}

the source URL basically translates to :

source = "https://registry.terraform.io/cloudposse/alb/aws"

If the constant defined at L24 of internal/addrs/provider.go (see References) were configurable, it would be possible to serve the .well-known/terraform.json with the URL of the module registry and index pointing to an internal repo.

Right now the registry URL is configurable, BUT there is a problem when registry modules use the short notation, i.e. source = "cloudposse/alb/aws", and that root module calls other submodules that also use the short notation. The root module will be pulled from the internally configured registry URL by doing something like source = "pepe.myrepo.com/cloudposse/alb/aws", but the submodule will still use the short notation pointing to the public registry, so the internally hosted index will not be used.
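
A minimal sketch of the situation, assuming the hostname pepe.myrepo.com from the description above. The nested module call is illustrative only and is not copied from the real cloudposse/alb/aws source:

```hcl
# Root configuration: the operator points at the internal registry explicitly.
module "alb" {
  source = "pepe.myrepo.com/cloudposse/alb/aws"
}

# Inside the unmodified upstream module (a different file, shown here for
# illustration): a nested call written in short notation. Terraform reads
# this as registry.terraform.io/cloudposse/label/null, so the request never
# reaches the internal registry.
module "label" {
  source = "cloudposse/label/null"
}
```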

This is a very widely used pattern in many languages, where the repository for package dependencies can be configured and pointed to a hosted version on products like JFrog Artifactory, Nexus IQ, S3 and so on.

Attempted Solutions

It is not possible to configure this at the moment. The only way to make it work is to hack SSL CAs and hosts tables, which is definitely not a good solution.

Proposal

Make the default registry URL https://registry.terraform.io configurable via the .terraformrc config file or an environment variable.

References

https://github.com/hashicorp/terraform/blob/main/internal/addrs/provider.go#L24
https://github.com/apparentlymart/terraform-aws-tf-registry

kmoe commented 2 years ago

Thanks for the report. Although it isn't documented (cc @laurapacilio), you can put a host block in the .terraformrc config file like the following:

host "registry.terraform.io" {
  services = {
    "providers.v1" = "https://host.example.com/v1/providers/",
  }
}

To confirm, is this what you meant by "the registry URL is configurable" - did you try this and see it not working for submodules?

apparentlymart commented 2 years ago

FWIW that host block is there primarily for us to do development and is potentially risky to use for other purposes because it will entirely override the services for a particular host, thereby defeating the assumption we rely on that we can change the set of services and associated URLs at any time and have all existing Terraform installations immediately respond.

Entirely overriding a particular host with your own services could work, but you'd need to keep referring to the upstream discovery document to see if the host has started supporting any new protocols or protocol versions in case you want to update your local copy; it might make later versions of Terraform fail in strange ways if they are relying on new host services not present today. The example above will already prevent installation of modules from the public registry, for example, because it would make Terraform think that there is no module registry on that host.

I think perhaps a more appropriate solution for this particular use-case would be to tell Terraform to install providers using the network_mirror installation method, and then publish copies of the providers you intend to use on your mirror server. That will then allow you to locally serve providers from any hostname, rather than just subverting Terraform's attempts to directly install from one particular origin registry.

See Provider Network Mirror Protocol for more information.
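
As a rough sketch of that approach, the CLI configuration might look like the following. The mirror URL is a placeholder, and the include/exclude patterns are only one possible way to split traffic between the mirror and the origin registry:

```hcl
# CLI configuration file (.terraformrc on Unix, terraform.rc on Windows)
provider_installation {
  # Serve every provider in the public registry namespace from the mirror.
  network_mirror {
    url     = "https://mirror.example.com/terraform-providers/" # placeholder URL
    include = ["registry.terraform.io/*/*"]
  }

  # Everything else is still installed directly from its origin registry.
  direct {
    exclude = ["registry.terraform.io/*/*"]
  }
}
```

Note that the provider addresses in the configuration stay unchanged; only the installation location differs.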

I'll leave this open so we can debate whether we want to prominently document the host block in spite of the risks of using it, but I don't expect we will allow any way to change the meaning of a source address with an implied origin registry, because that would change the identifier of a provider rather than just its installation location.

jamengual commented 2 years ago

I just want to make it clear that providers are not the issue here; modules are, since modules can call other modules using the registry notation in the source argument.

Since the registry constant is not configurable, it forces the user to fork ALL module dependencies and change the URL in each module, which is far from ideal: it makes upstream updates difficult, requires multiple pipelines, etc. @kmoe

jamengual commented 2 years ago

> Thanks for the report. Although it isn't documented (cc @laurapacilio), you can put a host block in the .terraformrc config file like the following:
>
> host "registry.terraform.io" {
>   services = {
>     "providers.v1" = "https://host.example.com/v1/providers/",
>   }
> }
>
> To confirm, is this what you meant by "the registry URL is configurable" - did you try this and see it not working for submodules?

Yes, that is what I meant by configurable, but as pointed out it is not documented.

laurapacilio commented 2 years ago

I'm not opposed to adding it to the documentation if:

we think many others will want it as a workaround
there is no better alternative

We would need to include a warning though explaining any potential side effects of use that folks could run into. @apparentlymart and @kmoe If you think this is the best way forward, let me know and I can help open a PR to add this to our docs. Thanks @jamengual!

jamengual commented 2 years ago

I know I will sound like a broken record and I'm sorry but documenting this does not fulfill the use-case.

The host {} block will work only for the root module (the first module declared), but not for any submodules (child modules declared inside the root module's code) that use the registry notation.

apparentlymart commented 2 years ago

Sorry for not reading clearly and assuming you were talking about providers. As some context for others reading, there was a parallel discussion about this with @jamengual in the HangOps Slack which had started with a question about addrs.DefaultProviderRegistryHost and so I extrapolated that the rest of the discussion and this issue were about providers, but on closer read I guess this was actually about addrs.DefaultModuleRegistryHost instead.

I'm going to try to elaborate here on some of the comments I made in the HangOps thread, both to try to make the points more clearly and also to record them here for posterity since this will be a location easier to find than a random thread in a Slack workspace.

I think it's important here to notice that the string registry.terraform.io is serving two distinct purposes in these compact address syntaxes:

Primarily, it serves as part of the unique identifier for the object it's referring to, so that different registries can safely have namespaces in which the same name means something different. We use DNS hostnames as part of our identifier scheme, following the lead of other designs such as Java's package naming convention, because it allows us to delegate the management of that namespace to the shared domain name system, rather than having to administer our own record of which name is claimed by which organization or individual.

registry.terraform.io therefore represents one of the two namespaces that we do directly administer, with the other being app.terraform.io for Terraform Cloud's private registry. Because the public registry happens to be a big rallying place for publicly-shared resources about Terraform, we made a special case that a shorthand address with no hostname means registry.terraform.io, but internally Terraform treats that as equivalent to explicitly writing out the registry.terraform.io/ prefix.
In order to provide a good default installation behavior, we additionally treat the hostname as the "default" installation source for the object, so that in the common case we can always have some way to automatically install the object in question.
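
To illustrate the shorthand rule, the two module blocks below are intended to refer to the same module. This is a minimal sketch; the module's input variables are omitted, so it is not a complete configuration:

```hcl
# Shorthand: no hostname, so Terraform assumes registry.terraform.io.
module "subnets_short" {
  source = "hashicorp/subnets/cidr"
  # (input variables omitted)
}

# Fully-qualified: same identity, with the hostname written out.
module "subnets_explicit" {
  source = "registry.terraform.io/hashicorp/subnets/cidr"
  # (input variables omitted)
}
```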

For providers in particular, we designed a number of mechanisms for customizing the installation strategy to use different installation methods, including installation from a local directory in the filesystem or installation from a separate network service that Terraform treats as a "mirror" of providers from an origin registry. That then allows separating the identity use-case of the hostnames from the installation source use-case. Terraform assumes, but cannot completely enforce, that anyone using these strategies will ensure that the alternative installation methods will return identical packages as the origin registry would've for the same address. (There is some enforcement of this if you let Terraform install from the origin registry at first and let it record checksums in the dependency lock file, but if you exclusively use a mirror at all times then Terraform will essentially treat that mirror as authoritative.)
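
For example, a filesystem mirror for providers might be configured roughly like this. The directory path and the namespace pattern are assumptions chosen for illustration:

```hcl
provider_installation {
  # Install hashicorp/* providers from a local directory instead of the network.
  filesystem_mirror {
    path    = "/usr/share/terraform/providers" # illustrative path
    include = ["registry.terraform.io/hashicorp/*"]
  }

  # All other providers are fetched from their origin registry.
  direct {
    exclude = ["registry.terraform.io/hashicorp/*"]
  }
}
```

The provider is still identified as registry.terraform.io/hashicorp/..., which is the separation of identity from installation source described above.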

Unfortunately, due to some long-standing technical debt there is no corresponding mechanism for customizing the installation methods for modules. We did originally intend to support similar mechanisms -- the dependency lock file, and custom installation methods/strategies -- for modules too. Unfortunately, whereas Terraform's model for provider sources is a very strictly-specified address syntax with explicit meaning, Terraform's model for module sources is: write some sort of string into this argument and we'll use a bunch of heuristics to guess what you meant and always try to install something. Modules therefore don't have a reliable canonical identity for us to use in dependency lock files or in custom installation methods specified in the CLI configuration.

Simply allowing Terraform's module address parser to assume a different default hostname when one isn't specified is not a sufficient solution to this impasse, because:

It would not account for situations where a module author explicitly writes out registry.terraform.io/ on the front of the address, which is valid and supposed to be exactly equivalent to omitting it.
It would cause modules to have a different identity depending on where they are evaluated, which would undermine our ability to use module identity to address this use-case fully in future by allowing configurable module installation strategies.
It would not work for modules published in any other registry, or for modules specified using a physical source location instead of a registry address.

I suggest that we treat this issue as representing the well-known use-case of Terraform not supporting customizable installation methods for modules as we do for providers. I have a feeling we do already have some issue open for this somewhere, but I wasn't readily able to find it right now. Perhaps we'll find it later and can close this one as a duplicate once we do.

I agree with @jamengual that overriding the service discovery for registry.terraform.io doesn't seem like a valid solution to this problem, and I also hesitate to document it for the reasons I stated earlier. Instead, I think we should use this to prioritize tackling the aforementioned technical debt so that we can have a consistent model for thinking about module installation sources, similar but not necessarily identical to what we achieved for providers in Terraform v0.13, and then offering a real solution to this problem that tackles it at its root rather than introducing even more technical debt (making the implied module registry hostname vary depending on context) that will likely make it even harder for us to address this problem properly.

As with the provider address design, my proposed initial technical design requirements (subject to negotiation, of course) would be:

It's possible to select a different installation method for any module in any registry, and ideally for entire registries and registry namespaces at once as we allow for providers.
It's also possible to override installation methods for non-registry module sources, although it's likely that we'd need to accept the need to specify those addresses exactly when configuring, because those legacy syntaxes don't have a dependable structure that Terraform can dissect in the same way as its own address syntax. We should make a best effort to normalize these non-registry addresses in cases where there are multiple ways to write the same address, but I expect we won't be able to do so 100% reliably due to the ambiguous nature of the resolution heuristics.
Terraform still considers the modules to belong to the same registry namespace, but might install them from somewhere else. As an analogy, consider that configuring an HTTP proxy in a browser changes how the browser resolves URLs but does not change the meaning of any of the information in the URL for purposes such as origin-based access control, relative URL resolution, etc.
Assuming that there is something analogous to the provider network mirror protocol for modules, it's similarly designed in a way where it's relatively easy to deploy it just as a bunch of static files on a generic HTTP server, without the need to develop any custom server-side application code.

It will take some research and design work to get there, and we will probably need to allow ourselves some exceptions/oddities for the various bizarre non-registry source syntaxes Terraform has allowed since very early versions, but I believe it is solvable and that we should plan to solve it.

laurapacilio commented 2 years ago

Hello! First thank you, Martin (as always) for your very thorough and thoughtful explanation. Based on everything I'm seeing here, it does not seem like a docs quick-fix is the right way to go. I'm going to leave this issue open (of course!) so folks can find it and we can have a record of this conversation and the proposed work. But I'm going to remove the documentation label, as I think we've seen that this issue goes far beyond just being a documentation gap.

Thank you all for the discussion! Please let me know if anyone disagrees. Thank you!

archoversight commented 2 years ago

I created a new issue because I didn't think it fit this one directly, but one of the requirements I have is simpler. The environment I am using Terraform in does not have access to the internet and cannot download content from online sources, so I don't want to replace the URL with another one that has to hit a web server of some sort. I would like to point at a folder on disk.

I want to be able to use modules that are developed by the community and have an easy way to mirror them + sub-modules they refer to, and have them in place on disk much like you can use terraform providers mirror and use those providers while running terraform init instead of pulling from the internet.

The only solution I've got so far is to pull them, rewrite any source = "namespace/module" with a path on disk, and store a copy. This makes it harder to keep them up to date.
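
As a sketch of that workaround, with a hypothetical vendor directory layout (the path below is an assumption, not an established convention):

```hcl
# Upstream module call, as published:
#   module "alb" {
#     source = "cloudposse/alb/aws"
#   }

# Rewritten copy for the air-gapped environment, pointing at a vendored
# directory on disk. The same rewrite has to be repeated for every nested
# module call inside the vendored code.
module "alb" {
  source = "../vendor/modules/terraform-aws-alb" # hypothetical local path
}
```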

jlforester commented 2 years ago

I'm trying to find a solution to this exact same issue for more or less the same use case as @archoversight. I've spent the last couple of days going through the source code and built a custom version that lets me override the DefaultModuleRegistryHost with a different hostname from an environment variable. I'm still testing this. It feels like a hack, however.

I think this issue is part of a broader one: we would like to see better support in Terraform for air-gapped environments.

apparentlymart commented 1 year ago

I wrote a long comment above with various different concerns in it but I just want to reiterate the main tension in designing this:

In situations where the module author and the Terraform operator (the person running terraform commands) are the same person, it's true that "just" allowing changing the meaning of an address that lacks a hostname would be a relatively easy hack for achieving alternative installation sources for modules.

However, the design here must also accommodate the situation where those two are different. For example, we need to consider what happens for a publicly-shared module that refers to a hostname-free address with the assumption that (as documented) it's a shorthand for registry.terraform.io.

A successful design to address this issue must, I think, allow both the module author to unambiguously express what they intend their module to depend on, and allow the operator to configure how to fetch those dependencies.

If a module author writes source = "foo/bar/baz" then their intention is to depend on registry.terraform.io/foo/bar/baz, because that's how the source address syntax is defined. From the module author's perspective, registry.terraform.io/foo/bar/baz is the identity of the module, which also handily implies a default location to install it from so things "just work" for users in the common case.

However, an operator should be able to tell Terraform that they've mirrored registry.terraform.io somewhere else, so that any modules which depend on other modules on that hostname will be installed from the mirror instead of the origin registry:

The module would still be known to Terraform as registry.terraform.io/foo/bar/baz for purposes such as deciding whether two addresses refer to the same source location, or (in future) tracking the module in the dependency lock file.
Terraform would install modules from the mirror regardless of whether the author wrote source = "foo/bar/baz" or source = "registry.terraform.io/foo/bar/baz", since both are equivalent. The operator always has the final decision on where a particular module gets installed from, without any need to change a module's identifier.

The provider installation method settings in CLI configuration offer a clear pattern for us to follow here if the mechanism is focused only on registry-based source addresses. The CLI configuration could include a block like this:

module_installation {
  network_mirror {
    url = "https://example.com/terraform-modules/"
  }
}

...which would then use that mirror for all registry modules, regardless of hostname.

Or, to specify it more finely, it could instead specify:

module_installation {
  network_mirror {
    url     = "https://example.com/terraform-modules/"
    include = ["registry.terraform.io/*/*/*"]
  }
  direct {
    exclude = ["registry.terraform.io/*/*/*"]
  }
}

This would solve the problem for all module registries, rather than just registry.terraform.io. It would also -- unlike the original proposal of simply changing how Terraform interprets the shorthand address syntax -- preserve the intention of the module author to depend on registry.terraform.io/foo/bar/baz even when on a particular computer that module has been mirrored in a different location.

However, there are two significant missing pieces here that also need to be solved:

For provider installation we use the CLI configuration's provider_installation block in conjunction with the dependency lock file to help operators ensure that their mirrors are returning something that is actually a mirror of upstream, and not a malicious replacement.

(This is a "trust on first use" situation, so it's not bulletproof: an entirely-new provider has no dependency lock entry to refer to. This is part of why terraform init produces extra messaging in its output whenever adding something new to the dependency lock file.)

We don't have a comparable second local source of checksum/integrity information for modules today, although we would like to. #17110 is representing that part of the problem already. (That issue actually predates the addition of the dependency lock file, because we had originally hoped to include both providers and modules in there at first release but ran into design challenges for modules and didn't want to delay adding version locking and extra integrity checking for providers.)
Not all modules are in module registries. Unlike providers, where registry-based addresses are the only supported address type, Terraform supports numerous different address types for module installation and their addresses are not regular enough nor unambiguous enough for it to be clear how to design filesystem and network mirroring strategies for those.

One potential answer is to just consider that a valid limitation and restrict custom installation strategies only to registry-shaped addresses. I could imagine justifying that as follows: the module registry protocol already exists to create an abstraction between module identifiers and their physical source locations and so custom installation sources for those is a logical extension of that mission. However, if someone specifies a physical source address directly then we can assume they are opting out of that abstraction and just want Terraform to use the address exactly as given.

This problem of how to support non-registry addresses was also the most significant challenge in specifying dependency lock file support for modules, so it would also be interesting to see if we could use the same justification to motivate supporting dependency locks only for registry-based modules. While I would certainly rather support it for module sources of all types, it could be pragmatic to say that those who want to benefit from both of these features should use registry-style addresses to do so, since registry-style addresses have the characteristics needed for these features to work while raw physical source addresses do not.

If we can convince ourselves that it's acceptable to limit both dependency lock file tracking and custom installation methods only to registry-shaped module addresses then I think we'd have a pretty clear path forward here. I've not yet done any research to see if that compromise is plausible. I'd be interested in feedback either way from those who are interested in this issue.

jamengual commented 1 year ago

I'm totally okay with being forced to use a registry-style address to support this, as long as the short version points to the long version address, which is configurable (via one of your samples above). As you said, the self-hosted or HashiCorp registry will basically be the "valid/preferred" registry to pull providers/modules from, just as Artifactory/Nexus/Snyk is for Java dependencies when hosted internally.

displague commented 1 year ago

Coming over from https://github.com/hashicorp/terraform/issues/29362#issuecomment-1162459990, it sounds like this proposal will not address replacing the source with a local path override. Are there any creative alternatives that come out of the current thinking? File URLs?

The use-case I have is that published example modules, or modules included in a provider's examples/ directory, have to include user directions to change the source address depending on how the module is being consumed (clone, registry, e2e testing).

jamengual commented 1 year ago

This issue is more about the fact that you can't have module dependencies pulled from a hosted registry. From that point of view, an example folder for your integration test module with a source of ../ should not be changed by this "feature/improvement", in my opinion.

If you think of it from the software development point of view, the tests of an app usually live in the same repo as the app, and sometimes integration tests live in another repo and the pipeline triggers those steps independently and continues with the SDLC of the app. Your integration test module in the example folder is basically the same as the example above, so I would argue that in that case you should push your test module to the registry (internal in this case) and pull it from there to run it. That is why I think not modifying the behavior of source = "../" is important.

displague commented 1 year ago

I appreciate your take, @jamengual.

I can see benefits to publishing the modules that are consumed by provider E2E tests. (terraform-provider-foo/examples run by terragrunt).

For E2E tests of standalone modules I suppose the recommendation would be to track separate examples/ and tests/, where the module when referenced in examples/ (terraform-foo-moda/examples/moda-ex1/) would depend on the registry and tests/ use local source paths.

apparentlymart commented 1 year ago

I showed network_mirror in my earlier example just because we'd been talking about the use-case of making requests to registry.terraform.io go to a different network host instead, which is what we call "network mirror" in the corresponding provider installation configuration.

I don't see any reason why we couldn't also support filesystem_mirror in a similar way that we do for provider installation, although as formulated it would be a directory containing local mirrors of potentially many different module packages following a prescribed directory structure (the directory structure is how Terraform will know which module address each directory is intended to represent) rather than for just a single module package.
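
Purely as a hypothetical sketch, following the module_installation idea from my earlier comment: neither this block nor its arguments exist in Terraform today, and the path is a placeholder.

```hcl
# HYPOTHETICAL syntax: module_installation is a proposal, not a real setting.
module_installation {
  filesystem_mirror {
    # Directory holding local copies of many module packages, arranged in
    # a prescribed (not yet designed) directory structure.
    path    = "/usr/share/terraform/modules" # placeholder path
    include = ["registry.terraform.io/*/*/*"]
  }
  direct {
    exclude = ["registry.terraform.io/*/*/*"]
  }
}
```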

There is a separate question of what we might call "development overrides", which we support for providers today in a special way that just tells commands like terraform plan to ignore whatever terraform init selected and installed and to just directly consult a local directory for a particular provider.
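
For reference, the existing provider mechanism looks roughly like this in the CLI configuration. The local directory is an illustrative path on a developer's machine:

```hcl
provider_installation {
  # Use a locally-built provider binary from this directory, ignoring
  # whatever version terraform init selected for this provider.
  dev_overrides {
    "hashicorp/null" = "/home/developer/tmp/terraform-null" # illustrative path
  }

  # All other providers are installed normally.
  direct {}
}
```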

A nice thing about only supporting module-registry-style addresses is that all of these design ideas for providers can in theory be copied over relatively unchanged, aside from the simplification that modules are treated by Terraform as platform-agnostic and so we don't have to worry about multiple "builds" of the same module as we do for providers.

However, I'd prefer to focus only on the "mirroring" use-case for this issue, and then we can think about a story for "development overrides" separately later, which could just copy what we did for providers or we could use that opportunity to design something a little more holistic, like Rust's Cargo Workspaces or go.work files in Go. Let's figure out what the plan is for "mirroring" first, and then we can make a separate issue for making local development across multiple codebases more ergonomic once the immediate problem is solved.

dtscssap commented 1 year ago

+1 For what it's worth, I think that limiting both dependency lock file tracking and custom installation methods only to registry-shaped module addresses sounds like a fair compromise.

Would this restriction affect sourcing upstream modules from a local "network mirror" when those modules use relative paths to source nested submodules?

e.g. https://github.com/apparentlymart/terraform-aws-tf-registry/blob/v0.0.1/store.tf#L2

apparentlymart commented 1 year ago

One detail that makes this a little tricky is the existing distinction between "module packages" and "module sources", which is something that is largely hidden in the details today but would probably end up more exposed if we implemented support for mirrors.

The easiest way to see the difference between a module source and a module package is to consider a source address like git::https://github.com/example/example.git//foo/bar. In this case, git::https://github.com/example/example.git is the module package -- a filesystem subtree that Terraform can request as a single unit -- and foo/bar specifies a subdirectory within that package.
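
Written out as a module block, that example looks like the following. The repository is the placeholder from the text above, and the ref value is an assumed tag name added only to show where a version selector goes:

```hcl
module "bar" {
  # Package:  git::https://github.com/example/example.git
  # Sub-path: foo/bar (everything after the double slash)
  # Query arguments such as ref come after the sub-path.
  source = "git::https://github.com/example/example.git//foo/bar?ref=v1.0.0"
}
```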

Unfortunately this package vs. source distinction has an extra wrinkle for module registry addresses. A module registry is really just an extra indirection over physical source package addresses: if I ask the public Terraform Registry about hashicorp/subnets/cidr then it will tell me that a bunch of versions are available, and then after I choose a single version it will tell me to retrieve it using a source address like git::https://github.com/hashicorp/terraform-cidr-subnets.git?ref=v1.0.0.

The result of the registry protocol is another source address, and so although the above example doesn't do this it's valid in principle for a module registry to indicate that the underlying source is git::https://github.com/example/example.git//foo/bar, in which case the module package is git::https://github.com/example/example.git but the registry-style module source address actually refers to the foo/bar sub-path behind the scenes.

With all of that in mind, part of what we'll need to design here is what exactly a network mirror is returning. If we design the network mirror protocol by the same principles as the main registry protocol then the mirror will really just be an index of physical source addresses, in which case Terraform can treat them just the same way as the ones returned by the registry itself. I expect that's the most likely design for network mirrors.

We will also need to design the structure of a filesystem mirror, which makes things a little more tricky because I expect most would want a filesystem mirror to contain literally the source code of the module, rather than just a source address for Terraform to retrieve from elsewhere. For any module registry that would return a sub-path of a package as the location of a module, we'd need some way for the filesystem mirror to contain that same metadata. I expect it's doable, but still requires some consideration. A filesystem mirror for a registry module might require a small amount of additional metadata that isn't needed for a provider mirror where we can assume that "provider package" is an indivisible unit always referred to as a whole.

My point in mentioning all of this is that this source vs. package deal is also how Terraform deals with relative sources like ../foo: modules that coexist in the same package are allowed to refer to each other in that way, and so to successfully mirror a package containing modules that do that will require having a copy of the entire package rather than just the specific module in question. But as long as we can design this correctly to preserve the existing idea of module packages -- so that the unit of mirroring is an entire package rather than an individual source address -- the handling of relative paths should "just work", as Terraform already deals with those by just hunting for a matching directory in the same package as the caller.
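
A small sketch of such a relative reference, assuming a hypothetical package that contains sibling directories modules/network and modules/subnets:

```hcl
# File: modules/network/main.tf (hypothetical layout)
# This call only resolves if modules/subnets is present in the same
# package, so a mirror must copy the whole package, not just modules/network.
module "subnets" {
  source = "../subnets"
}
```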

dsmithbauer commented 7 months ago

So, there's a lot of discussion on this, but I'm just curious if any progress has been made here? Our organization would also benefit greatly from being able to manage modules and module mirrors more like providers.
