hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Replace Function Accepts Map of Replacements #31367

Open GabrielEisenberg opened 2 years ago

GabrielEisenberg commented 2 years ago

Current Terraform Version

1.2.1

Use-cases

Multiple substrings need to be replaced or removed. For example, the name of a script needs to be used in the name of an AWS Glue Job without the file extension and without certain characters. The script may be of various types, e.g. .py or .scala.

Attempted Solutions

To solve for the use case we need to chain replace statements:

locals {
  script = "my-database/my-script.py" # Could also be "my-database/my-script.scala"
}

resource "resource" "resource" {
  name = replace(replace(replace(local.script, ".py", ""), "/", "-"), ".scala", "")
}

Proposal

The replace function should also accept a map (or some equivalent) that lists every substring needing replacement. Several substrings that should be replaced with the same value could be grouped in a list. In essence, an iterable should be used. Here, the key is the replacement substring and the value is the existing substring, or list of existing substrings, to replace. The above would look like:

resource "resource" "resource" {
  name = replace(local.script,
                {
                  "" = [".py", ".scala"],
                  "-" = "/"
                }
              )
}

References

Haven't found any.

apparentlymart commented 2 years ago

Hi @GabrielEisenberg! Thanks for sharing this use-case.

I think this proposal raises some interesting questions about order of operations: multiple patterns can potentially overlap one another in ways that would lead to a different outcome depending on what order you process the patterns in and whether you apply each pattern to the entire string in turn or scan the string only once and perform a replacement each time one pattern matches. I'm sure there are other permutations too, such as choosing the shortest or longest match when multiple patterns match at the same time.

Calling replace multiple times with each consuming the previous as its input creates one particular order of operations: apply each pattern to the entire string, and do that in an inside-out order, where only the innermost call can "see" the original string.
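A minimal sketch of how the nesting order changes the result, using only the existing replace function. The input "ab" and the local names are made up for illustration:

```hcl
locals {
  input = "ab" # hypothetical input, chosen so the two patterns interact

  # Innermost call runs first: "a" -> "b" gives "bb",
  # then the outer call turns every "b" into "c".
  a_then_b = replace(replace(local.input, "a", "b"), "b", "c") # "cc"

  # Swapped nesting: "b" -> "c" gives "ac",
  # then the outer call turns "a" into "b".
  b_then_a = replace(replace(local.input, "b", "c"), "a", "b") # "bc"
}

output "a_then_b" { value = local.a_then_b }
output "b_then_a" { value = local.b_then_a }
```

A single-pass scan of the original string would presumably also give "bc" here, since each character would be replaced at most once, which is one reason the choice of semantics matters.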

I think it'll be important to collect some more example use-cases for this before we could start designing it, because we'll want to try to learn which order of operations is the most useful to support as many realistic examples as possible, if we're going to include this as a built-in function.

GabrielEisenberg commented 2 years ago

It's a pleasure and thank you for the very interesting feedback @apparentlymart!

After speaking to a colleague on this, our thoughts are that one could simply apply the replacements in the logical order that they are specified. In the example above, one would first replace .py with "", then .scala with "" and then "/" with "-". Exactly as the chaining does. However, if the order of replacements is done poorly, the onus is on the user and they will need to conform to the way in which the function is intended to be used.

The intention of the feature request above is to make the code cleaner and more readable.

apparentlymart commented 2 years ago

Hi @GabrielEisenberg! Thanks for the additional context.

Since a mapping is not an ordered data type in the Terraform language, the exact syntax you proposed here would not give any information about what order to perform the operations in. However, if the function simply behaved as if it were multiple calls, each feeding its output into the next input, then something like that could be a viable design, using either a variable number of individual arguments (because arguments are inherently ordered) or a single argument taking a list of some type.

For something to be included in the core language it must both meet a common use-case (or, alternatively, clearly meet many use-cases that might not themselves all be "common") and have a "legible" design where a reader can ideally understand what it does without having to study the documentation in detail.

I think right now this idea hasn't yet met either of those thresholds, but we have a start in that direction and can continue the discussion here to refine it.

Another possible path here is to implement the desired behavior yourself as part of a provider plugin. We do not yet have a final design and implementation for provider plugins directly contributing functions to the Terraform language, but from a functional standpoint a data source that doesn't access any external services is equivalent to a function -- just with an admittedly far less convenient syntax.

I see a provider in the public registry whose documentation suggests it does something a lot like what you proposed here, although I haven't tested it so I cannot vouch for it:

https://registry.terraform.io/providers/poseidon/util/latest/docs/data-sources/replace

This would be my suggestion if you need something to meet this need right now and you don't find the nested replace calls sufficient. Possibly one day this provider will also be able to offer a function equivalent of the data source, which will thereby allay the typical concerns about the implications of including something in the core Terraform language, since there can instead be a number of different third-party implementations that make different interpretations of the requirements. (If one did emerge as the popular answer everyone reaches for, that would be a potential way to show that it's a good and intuitive design for inclusion in the core language, too.)

JohnLBevan commented 7 months ago

Ordered Replacement

Regarding the issue with maps not being ordered: lists are ordered, so a simple fix for the example would be:

resource "resource" "resource" {
    name = replace(local.script,
        [
            ["", ".py", ".scala"],
            ["-", "/"]
        ]
    )
}

Here the value to be replaced is each item in each list's tail, whilst the head is the replacement value.

Gotcha

Of course, as @apparentlymart describes, this still hides potential issues that are slightly more obvious when chaining multiple replace calls, e.g. where the result of one replacement causes a later replacement to match something that wasn't in the original string:

resource "resource" "resource" {
    name = replace("this is a test",
        [
            ["x", "test", "hello"],
            ["awkward", "a x"]
        ]
    )
}
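For comparison, here is the same gotcha written as chained calls to the existing replace function, assuming the proposed list form would be applied in list order. The local name is only for illustration:

```hcl
locals {
  # Step 1: "test"  -> "x"  gives "this is a x"
  # Step 2: "hello" -> "x"  matches nothing
  # Step 3: "a x"   -> "awkward" matches text that only exists because of step 1
  gotcha = replace(
    replace(
      replace("this is a test", "test", "x"),
      "hello", "x"
    ),
    "a x", "awkward"
  ) # "this is awkward"
}

output "gotcha" { value = local.gotcha }
```

With the nesting spelled out, it is at least visible that the last call operates on the output of the first.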

My Use Case

FYI: I found this issue because I wanted something similar. In my use case, I want to generate resource names based on various variables. Those variables are used elsewhere, so they may contain characters that are invalid for my resource name but valid in the other places they're used. Additionally, I may wish to keep truncated names unique by appending a hash (base64sha256) of the original values, but the hash is likely to contain illegal characters (e.g. /). I could replace all illegal characters via regex: name = replace(local.nastyname, "/[/%]/", "X"). However, that makes the result less likely to be unique, because multiple characters are mapped to a single character, which reduces entropy. Here something like XSLT's translate would be great; e.g. name = translate(local.nastyname, "/%", "XY"). The suggestion in this issue is less elegant than translate, but more flexible, so either would be good.
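Until something like translate exists, a per-character mapping can be approximated with existing functions. This is only a sketch: the sample value of nastyname and the names collapsed, translate_map and translated are hypothetical, and it assumes plain ASCII input:

```hcl
locals {
  nastyname = "team/app%v1" # hypothetical input value

  # Regex form from above: both "/" and "%" collapse to "X".
  collapsed = replace(local.nastyname, "/[/%]/", "X") # "teamXappXv1"

  # One-to-one mapping, similar in spirit to translate(..., "/%", "XY").
  translate_map = { "/" = "X", "%" = "Y" }

  # Walk the string one character at a time; characters without an
  # entry in translate_map are passed through unchanged.
  translated = join("", [
    for i in range(length(local.nastyname)) :
    lookup(
      local.translate_map,
      substr(local.nastyname, i, 1),
      substr(local.nastyname, i, 1)
    )
  ]) # "teamXappYv1"
}
```

Because every character is looked up against the original string exactly once, this avoids the ordering question entirely, at the cost of only supporting single-character targets.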