gruntwork-io / terragrunt

Terragrunt is a flexible orchestration tool that allows Infrastructure as Code written in OpenTofu/Terraform to scale.
https://terragrunt.gruntwork.io/
MIT License

Terragrunt IAC Engine Plugin System #3103

Open yhakbar opened 2 months ago

yhakbar commented 2 months ago

Summary

Introduce the ability to integrate with plugins to drive custom behavior in the underlying IAC tool orchestrated by Terragrunt (like OpenTofu or Terraform).

Motivation

Users have been lacking two significant capabilities that are addressed by this RFC:

  1. The ability to customize the usage of tofu and terraform when called by Terragrunt.

    Users have been relying on ensuring that a particular version of each tool is installed before executing terragrunt, or on a shim that alters the execution of the underlying IAC tool.

  2. The ability to alter the context of the IAC execution separate from the Terragrunt execution.

    So far, there has been no way to isolate the IAM access that the underlying IAC tool has from the access that Terragrunt has. The IAC tool has also had to run in the same compute environment and on the same filesystem as Terragrunt.

    Terragrunt super users would like to be able to isolate the compute resources allocated to Terragrunt from the compute allocated to the underlying IAC tool so that they can fan out IAC updates across multiple instances/containers/pods.

Proposal

Allow users to optionally specify an IAC engine, which will control how the underlying IAC operations (plans, applies, etc.) are carried out, instead of Terragrunt directly calling the tofu or terraform binaries.

Users will be able to use a configuration block that looks like the following to configure their engine in the relevant terragrunt.hcl:

engine {
  source  = "github.com/acme/terragrunt-plugin-custom-opentofu"
  version = "v0.0.1" # Optionally specify version
  type    = "rpc"    # Optionally specify the type of plugin

  # Optional JSON serializable metadata that is engine-specific
  metadata = {
    tofu_version = "v1.7.2"
  }
}

The source field would be either the path to a local binary (signified by a value starting with . or /) or a URL pointing to a GitHub repository whose releases page contains a usable asset. For remote sources, the appropriate architecture and platform would be guessed from values detected on the host machine, and could be set explicitly via environment variables.
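To make the platform guessing concrete, here is a minimal sketch in Go of how detected host values could be combined with explicit overrides. The override environment variable names (EXAMPLE_ENGINE_OS, EXAMPLE_ENGINE_ARCH) and the asset naming scheme are assumptions for illustration only; the RFC does not define either.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// targetPlatform returns the OS and architecture used to pick a release asset.
// Values detected from the host are the default; non-empty overrides win.
func targetPlatform(osOverride, archOverride string) (string, string) {
	goos, goarch := runtime.GOOS, runtime.GOARCH
	if osOverride != "" {
		goos = osOverride
	}
	if archOverride != "" {
		goarch = archOverride
	}
	return goos, goarch
}

// assetName builds a hypothetical asset file name of the form base_os_arch.
// The real naming convention for engine release assets is not specified here.
func assetName(base, goos, goarch string) string {
	return fmt.Sprintf("%s_%s_%s", base, goos, goarch)
}

func main() {
	// Hypothetical environment variable names, not part of the RFC.
	goos, goarch := targetPlatform(
		os.Getenv("EXAMPLE_ENGINE_OS"),
		os.Getenv("EXAMPLE_ENGINE_ARCH"),
	)
	fmt.Println(assetName("engine", goos, goarch))
}
```

Passing the overrides in as arguments, rather than reading the environment inside targetPlatform, keeps the selection logic easy to test.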

The optional version field would indicate the git tag of the release to download when the source is not a local binary. Setting it for a local path would throw an error; for remote sources, it would default to the latest release.
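The source and version rules could be checked along these lines. This is only a sketch: the EngineConfig type, the function names, and the "latest" placeholder string are assumptions made for illustration, not Terragrunt's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// EngineConfig mirrors the two fields of the engine block discussed here.
// The type itself is hypothetical.
type EngineConfig struct {
	Source  string
	Version string
}

var errVersionOnLocal = errors.New("version must not be set when source is a local path")

// isLocalSource reports whether the source points at a local binary,
// which is signified by a leading "." or "/".
func isLocalSource(source string) bool {
	return strings.HasPrefix(source, ".") || strings.HasPrefix(source, "/")
}

// resolveVersion returns the release tag to download for remote sources.
// Local sources have no version, so an empty string is returned for them.
// "latest" is a placeholder meaning "use the most recent release".
func resolveVersion(cfg EngineConfig) (string, error) {
	if isLocalSource(cfg.Source) {
		if cfg.Version != "" {
			return "", errVersionOnLocal
		}
		return "", nil
	}
	if cfg.Version == "" {
		return "latest", nil
	}
	return cfg.Version, nil
}

func main() {
	configs := []EngineConfig{
		{Source: "github.com/acme/terragrunt-plugin-custom-opentofu", Version: "v0.0.1"},
		{Source: "github.com/acme/terragrunt-plugin-custom-opentofu"},
		{Source: "./bin/engine", Version: "v0.0.1"},
	}
	for _, cfg := range configs {
		version, err := resolveVersion(cfg)
		fmt.Printf("source=%q version=%q err=%v\n", cfg.Source, version, err)
	}
}
```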

The optional type field would indicate the type of plugin used by the engine. The default rpc value would indicate that the plugin uses HashiCorp's go-plugin, with Terragrunt communicating with the plugin in a client/server relationship via RPC. For simplicity in authoring plugins, this will be the first type of plugin supported by this RFC. It's possible that in the future, a second type of plugin leveraging the Golang plugin package would be supported with type shared.

Technical Details

This proposal impacts how and if tofu and terraform get called by Terragrunt.

To support this change, the following will have to be done:

To ensure that this functionality can be developed smoothly with minimal risk of regression, it should be introduced under a feature flag that is enabled by setting the environment variable TG_EXPERIMENTAL_ENGINE=1. While the functionality is being battle tested, additional warning logging should make users aware that leveraging it in production is risky.
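A feature flag gate of this kind could look roughly like the following sketch. Only the variable name TG_EXPERIMENTAL_ENGINE and the value 1 come from this RFC; the function name, the log messages, and the choice to ignore the engine block when the flag is unset are assumptions.

```go
package main

import (
	"log"
	"os"
)

// experimentalEngineEnv is the opt-in variable named in this RFC.
const experimentalEngineEnv = "TG_EXPERIMENTAL_ENGINE"

// engineEnabled reports whether the experimental engine feature is switched on.
// The lookup function is injected so the check can be tested without touching
// the real process environment.
func engineEnabled(getenv func(string) string) bool {
	return getenv(experimentalEngineEnv) == "1"
}

func main() {
	if !engineEnabled(os.Getenv) {
		// Assumed behavior: without the flag, fall back to calling the IAC binary directly.
		log.Printf("engine support is disabled; set %s=1 to opt in", experimentalEngineEnv)
		return
	}
	log.Println("WARNING: IAC engines are experimental and risky to use in production")
}
```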

Documentation will need to be authored that demonstrates how to write an IAC Engine plugin and provides guidance on testing it.

In addition, Gruntwork will host two plugins that will demonstrate how to author plugins following best practices:

They will execute tofu and terraform in the same way Terragrunt currently does. Users will be able to use the repositories as springboards for their custom implementation of the same functionality.

Press Release

A new engine configuration block has been released, allowing you to customize and configure how your IAC updates are orchestrated by Terragrunt!

To try it out, all you need to do is include the following in your terragrunt.hcl:

engine {
  source = "github.com/gruntwork-io/terragrunt-iac-engine-opentofu"
}

Because this functionality is still experimental and not recommended for general production usage, set the following environment variable to opt in:

export TG_EXPERIMENTAL_ENGINE=1

The next time you call Terragrunt, it will dynamically fetch and load the Gruntwork OpenTofu IAC Engine plugin for Terragrunt to use instead of calling OpenTofu directly.

You can find the plugin here. <-- This link is intentionally broken as this is a mock press release.

If you'd like to customize how OpenTofu is used when orchestrated by Terragrunt, feel free to fork the repository and call your own version of the plugin!

Drawbacks

Alternatives

Migration Strategy

This shouldn't require any adjustments on behalf of customers for their existing code bases to remain compatible.

IAC Engines should remain an optional feature of Terragrunt for the foreseeable future.

Unresolved Questions

References

Proof of Concept Pull Request

Changes

yhakbar commented 2 months ago

Some feedback has been shared offline regarding the performance implications of introducing this plugin system. The concerns stemmed from a lack of clarity in this RFC regarding the difference between the shared and rpc plugin types.

Shared

The shared type of plugin leverages the built-in Golang plugin package. This kind of plugin is a shared library (typically having the extension .so) that would be dynamically loaded by the running Terragrunt process, and have its exported functions called directly from the Terragrunt process.

There would be no Inter-Process Communication (IPC) between Terragrunt and a second process running along with Terragrunt, and it would be largely equivalent to calling the functions from directly within the Terragrunt binary from a performance and usage perspective.

The downsides of this approach are that the plugin must be written in Golang and must be compatible with the version of Golang used to compile Terragrunt (see the warnings here). It also prevents most fault isolation (of panics in the plugin, etc.), as the plugin would be running in the same process as Terragrunt.
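For reference, loading a shared plugin with the standard library plugin package looks roughly like the sketch below. The exported symbol name Run and its signature are hypothetical, since this RFC does not define an engine interface for the shared type. The plugin.Open and Lookup calls are the real standard library API.

```go
package main

import (
	"fmt"
	"os"
	"plugin"
)

// runFunc is a hypothetical signature for the function an engine would export.
type runFunc = func(command string, args []string) (int, error)

// loadEngine opens a shared library and looks up its exported Run function.
// The returned function is called in-process, with no IPC involved.
func loadEngine(path string) (runFunc, error) {
	p, err := plugin.Open(path)
	if err != nil {
		return nil, fmt.Errorf("opening engine %q: %w", path, err)
	}
	sym, err := p.Lookup("Run")
	if err != nil {
		return nil, fmt.Errorf("engine %q does not export Run: %w", path, err)
	}
	// Lookup returns the function value itself for exported functions.
	run, ok := sym.(func(string, []string) (int, error))
	if !ok {
		return nil, fmt.Errorf("engine %q: Run has unexpected type %T", path, sym)
	}
	return run, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: pass the path to an engine .so file")
		return
	}
	run, err := loadEngine(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	code, err := run("plan", nil)
	fmt.Println("exit code:", code, "err:", err)
}
```

A matching plugin would be built with go build -buildmode=plugin from a package main that exports Run. A panic inside Run would take down the calling process as well, which is the fault isolation concern described above.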

RPC

The rpc type of plugin leverages the HashiCorp go-plugin package. It is how provider plugins work in OpenTofu and Terraform. This type of plugin is spun up as a secondary process, and Terragrunt would establish a client/server connection with the plugin.

The advantages of this approach are that the plugin can be written in languages other than Golang, as long as they have good support for the protocol used by the plugin system (e.g. gRPC), and it allows for panics to happen in a secondary process, making it easier to prevent blow-ups in the engine from impacting the Terragrunt process. You can see a number of other advantages here.

The downside of this approach is a detectable impact on performance. There can be significant overhead in spinning up one or more Engine plugins, which then spin up one or more Provider plugins, with IPC happening between all of them.
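The client/server shape can be illustrated with a small sketch. Note that this deliberately uses the standard library net/rpc package over an in-memory pipe as a stand-in, not go-plugin, and that everything runs in one process for the sake of a self-contained example. The Engine service, its Run method, and the argument types are hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
	"strings"
)

// RunArgs and RunReply are hypothetical request and response types.
type RunArgs struct {
	Command string
	Args    []string
}

type RunReply struct {
	Output string
}

// Engine is the server side. A real engine would execute the IAC tool here.
type Engine struct{}

// Run only describes what it would execute instead of running anything.
func (e *Engine) Run(args RunArgs, reply *RunReply) error {
	parts := append([]string{args.Command}, args.Args...)
	reply.Output = "would run: " + strings.Join(parts, " ")
	return nil
}

// callEngine wires a client to a server over an in-memory connection and
// performs a single RPC. With go-plugin the server would be a separate process.
func callEngine(args RunArgs) (RunReply, error) {
	clientConn, serverConn := net.Pipe()

	server := rpc.NewServer()
	if err := server.RegisterName("Engine", &Engine{}); err != nil {
		return RunReply{}, err
	}
	go server.ServeConn(serverConn)

	client := rpc.NewClient(clientConn)
	defer client.Close()

	var reply RunReply
	err := client.Call("Engine.Run", args, &reply)
	return reply, err
}

func main() {
	reply, err := callEngine(RunArgs{Command: "plan", Args: []string{"-out=tfplan"}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(reply.Output)
}
```

Every call crosses a serialization boundary, which is where the overhead mentioned above comes from. In exchange, a crash on the server side surfaces to the client as an error rather than a panic in its own process.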

yhakbar commented 1 month ago

Please take note that the default plugin type we will be exploring as part of this RFC is the RPC type.

This is due to the exploratory work done by @denis256 to make sure that the plugin system we build here will be maximally adoptable by the community. If you would prefer that we adjust this direction, please make your voice heard!