hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

shuffle method for list #20716

Open scottwinkler opened 5 years ago

scottwinkler commented 5 years ago

With the new expression support in 0.12, functions are more important than ever before. A shuffle() function that accepts a list as an argument and returns a randomized list would be extremely helpful. Instead of having to declare a random_shuffle resource using the random provider, I could simply call shuffle() to achieve the same result. This works better in for loops or dynamic blocks, where I may not know how many times I need to call the shuffle function and where I still want a different random ordering each time.

apparentlymart commented 5 years ago

Hi @scottwinkler! Thanks for sharing this use-case.

The random_shuffle resource is a resource for an important reason: Terraform expects that the result of a plan/apply will be "stable", so it being implemented as a resource allows Terraform to remember the result of the random shuffle in the Terraform state and ensure that the same result is returned again on the next apply, unless you tell Terraform to regenerate it by changing the keepers.
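For readers unfamiliar with the resource, here is a minimal sketch of the behavior described above. The variable names and list values are illustrative assumptions, not taken from this thread; `input`, `keepers`, and `result` are arguments and attributes of `random_shuffle` in the hashicorp/random provider.

```hcl
# Illustrative input list (hypothetical values).
variable "zones" {
  type    = list(string)
  default = ["zone-a", "zone-b", "zone-c"]
}

# Hypothetical knob: change this value to ask for a fresh shuffle.
variable "shuffle_generation" {
  type    = string
  default = "1"
}

resource "random_shuffle" "zones" {
  input = var.zones

  # The shuffled order is stored in state and reused on later applies.
  # It is only regenerated when a value in keepers (or the input) changes.
  keepers = {
    generation = var.shuffle_generation
  }
}

output "shuffled_zones" {
  value = random_shuffle.zones.result
}
```

Running apply twice without touching `shuffle_generation` should report no changes, which is the stability property discussed here.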

Since functions are stateless, it wouldn't be possible for a shuffle function to remember its result between runs, and so the configuration could never stabilize.

I can see that using resources is inconvenient in situations where you don't know ahead of time how many lists need to be shuffled, as you said. I expect we will address this in future by making Terraform better support choosing the number of instances of a resource based on dynamic data, so that we can keep the ability to retain the result of the random generation in the state while allowing the resource type to be used in a more flexible way.
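As a hedged illustration of what that could look like: resource-level `for_each` was added in Terraform 0.12.6, after this comment was written, and it allows one `random_shuffle` instance per entry in a map. Whether this is exactly the mechanism anticipated here is an assumption; the variable name and values below are hypothetical, and the map keys still need to be known at plan time.

```hcl
# Hypothetical map of lists to shuffle; keys must be known during planning.
variable "lists" {
  type = map(list(string))
  default = {
    frontend = ["a", "b", "c"]
    backend  = ["x", "y", "z"]
  }
}

# One random_shuffle instance per map entry, each remembered in state.
resource "random_shuffle" "each" {
  for_each = var.lists
  input    = each.value
}

# Collect the shuffled results, keyed the same way as the input map.
output "shuffled" {
  value = { for name, r in random_shuffle.each : name => r.result }
}
```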

scottwinkler commented 5 years ago

Hi @apparentlymart, thank you for clarifying, but I still have some questions.

I understand it will create an unstable state for Terraform, but what if I am okay with this? I cannot imagine that this would cause an error from Terraform's perspective, because uuid() and timestamp() work fine even though they return different values on each Terraform apply. Having the resource option is nice, but it feels clunky, especially when I don't know how many lists I need.

Edit: if you gave shuffle() an optional seed parameter, it could return the same result each time when that is what you want.

apparentlymart commented 5 years ago

uuid and timestamp are both considered to be bad features that are preserved only for compatibility. They cause Terraform various problems that need to be worked around, and they were particularly problematic during v0.12 development because they violate the assumption that functions behave in a "pure" way: when we re-evaluate the configuration during the apply phase, we expect to get identical results.

You'll see that both of these now work slightly differently in v0.12 in order to partially work around those problems, but it is still hard to use them without causing yourself other issues. It is possible they will be deprecated and then removed in later releases if the workarounds added in v0.12 still don't fully address the issues, and so we don't intend to add any more impure functions like this that are likely to lead to similar problems. This particular function is likely to be even more problematic due to it working with lists rather than simple strings and thus potentially interacting with more complex features like for expressions, indexing, etc.
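A small sketch of the kind of instability being described, assuming the hashicorp/null provider is available. The resource name is hypothetical. To my understanding, in v0.12 and later `timestamp()` evaluates to an unknown value during planning, so anything that depends on it shows a change on every run.

```hcl
# timestamp() yields a new value on every run, so this trigger never settles
# and Terraform plans to replace the resource on each apply.
resource "null_resource" "never_stable" {
  triggers = {
    now = timestamp()
  }
}
```

A shuffle() function without a seed would presumably behave the same way wherever its result fed into a resource argument.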

scottwinkler commented 5 years ago

@apparentlymart yeah, that makes more sense when you put it like that, but it would still be nice if there was another solution. For example, there could be a shorthand way of creating local resources whose sole purpose in life is to be consumed by another resource. A general solution would be a function that can generate a resource and store it in the state file like normal. An example might be:

r = resource("random_shuffle", "name", {input = "${var.list}"})

Then I could get the output attribute with r.result. This would enable me to work with resources in for loops more easily.

crw commented 6 months ago

Thank you for your continued interest in this issue.

Terraform version 1.8 launches with support for provider-defined functions. It is now possible to implement your own functions! We would love to see this implemented as a provider-defined function.

Please see the provider-defined functions documentation to learn how to implement functions in your providers. If you are new to provider development, learn how to create a new provider with the Terraform Plugin Framework. If you have any questions, please visit the Terraform Plugin Development category in our official forum.

We hope this feature unblocks future function development and provides more flexibility for the Terraform community. Thank you for your continued support of Terraform!

apparentlymart commented 4 months ago

In this particular case I don't think it would be possible for a provider to offer a function exactly as requested, because Terraform requires that provider-contributed functions behave as pure functions and so they cannot rely on randomly-selected numbers generated inside the function.

The closest that a provider-contributed function could get to achieving this would be to take an explicit seed as a second argument and then use that seed to populate the random number generator, in a similar way to how random_shuffle treats its own seed argument, except that for the function it would be required rather than optional.
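To make the seed idea concrete, here is a sketch using the existing `random_shuffle` resource, whose `seed` argument is an optional string. The list values and the seed string are arbitrary assumptions. The commented-out call at the end only shows what a seeded provider-defined function might look like with the `provider::<name>::<function>` call syntax from Terraform 1.8; the `example` provider and its `shuffle` function are hypothetical and do not exist.

```hcl
# With a fixed seed, the resulting order is determined by the input and the
# seed rather than by fresh randomness on each creation.
resource "random_shuffle" "seeded" {
  input = ["a", "b", "c", "d"]
  seed  = "release-42" # arbitrary illustrative seed
}

output "seeded_order" {
  value = random_shuffle.seeded.result
}

# Hypothetical equivalent as a pure, seeded provider-defined function.
# This is not a real provider or function, so it is left as a comment:
#
#   locals {
#     shuffled = provider::example::shuffle(["a", "b", "c", "d"], "release-42")
#   }
```

Because the seed is an explicit argument, the same inputs always give the same output, which is what lets such a function satisfy the purity requirement.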

If a future version of Terraform includes a concept like the "ephemeral values" idea I've been prototyping in https://github.com/hashicorp/terraform/pull/35078 then it might allow a compromise: a provider developer could signal in the provider schema that a particular function ought to be treated as ephemeral, which would then remove the requirement for the function to behave as "pure" in return for the result being usable only in contexts where ephemeral values are allowed. (There are some forward-compatibility questions there around how we could avoid an older Terraform version calling an ephemeral function on a new provider without understanding what it means for it to be ephemeral, so this would require some deeper consideration.)