hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Extending terraform with custom functions #27696

Open ghost opened 3 years ago

ghost commented 3 years ago

Current Terraform Version

0.14

Use-cases

Instead of writing a provider, there is some functionality that is best suited for a custom function.

Attempted Solutions

An object({}) module: too complicated — at this point it's better to just extend Terraform with Go.

A provider: also too complicated. A function that takes one or two inputs and returns a single value is too simple to justify a provider.

Proposal

I cannot find any documentation on extending terraform with custom functions; is it possible to do this?

skyzyx commented 3 years ago

I've worked around this with submodules. Pass inputs to the submodule (parameters), do some processing (with locals), then provide an output (return). This works as long as the logic is something Terraform already supports.
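A minimal sketch of this pattern, using hypothetical names (a small "function" module that normalizes a string):

```hcl
# modules/normalize_name/main.tf: a "function" expressed as a module
variable "input" {
  type = string
}

locals {
  # the "function body": any expressions Terraform already supports
  result = trimsuffix(replace(var.input, "_", "-"), "-")
}

output "result" {
  value = local.result
}

# Caller (the "invocation"):
# module "normalize" {
#   source = "./modules/normalize_name"
#   input  = var.raw_name
# }
# ...then read module.normalize.result
```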

ghost commented 3 years ago

Yeah, I found a way around this using yamlencode with a submodule that provides an output. But it's really too simple to be a module, IMO.

schollii commented 3 years ago

Defining a whole module just so you can re-use code is way more work than should be necessary. It should be possible to define a function in locals and call it just like any other built-in function. The body of the function would just be a set of expressions like you find in locals, except that the function inputs would automatically be variables local to the function, and the return value would be a subset of those locals. Something like this:

locals {
  func myfunc(var1: string, var2: list(string)) => map(string) {
     var3 = ...expression involving var1, var2, and any local.*...
     var4 = ...expression involving previous vars...
     return var3
  }
}

This basically just means that while in the function, the locals is extended with a few vars that get reset every time the function is called. Doesn't seem too crazy.

troisdiz commented 3 years ago

The function should not be able to have any side effects; it should only return a value.

AdamCoulterOz commented 3 years ago

Functions should also be definable in providers, so that using a specific provider makes its custom, provider-specific functions available to call directly.

AubreySLavigne commented 2 years ago

A use-case where I find myself wanting custom functions: when defining multiple resources with count or for_each, locals don't work well for transforming derived properties. Inline logic works acceptably, but I would like to see whether something better is possible.

Functions should also be definable in providers, so that using a specific provider makes its custom, provider-specific functions available to call directly.

Or a module. Perhaps a new resource type "library"?

skyzyx commented 2 years ago

Firstly, I think user-defined functions are a good idea for users, so I'm not arguing with anyone about that. Even more than that, I love the idea of Provider-provided custom functions.

But the language in this thread suggests that there's a belief that creating a module is a bunch of extra work/overhead, which is quite simply false.


@withernet said:

Yeah, I found a way around this using yamlencode with a submodule that provides an output. But it's really too simple to be a module, IMO.

I don't view a submodule as anything more complicated than just some Terraform in a subdirectory. You're already going to write the logic; does it matter where it lives?

Modules aren't complicated. IMO, the majority of .tf files should be written as modules because there are a lot of benefits in exchange for mild overhead.


@schollii said:

Defining a whole module just so you can re-use code is way more work than should be necessary. It should be possible to define a function in locals and call it, just like any other builtin function.

The issue I take is with your statement: "Defining a whole module […]". Defining a module is not complicated; it's not a whole big thing — it's a tiny, little thing.

Unless HashiCorp were to implement user-defined functions in an unexpected way that completely blows my mind, I can't imagine the code you'd write to implement and execute user-defined functions would be very different from exposing parameters (as variables) and return values (as outputs) from a *.tf file in a subdirectory. Many of the primitives are already there (see other issues I've filed for a wishlist of new primitives I'd love to see added to Terraform).


@AdamCoulterOz said:

Functions should also be definable in providers, so that using a specific provider makes its custom, provider-specific functions available to call directly.

Empowering Providers to include context-specific functions would be amazing. Then again, this is also addressed with a vendor providing both a Provider as well as a Module.

Importing a "function" that someone else wrote?

module "imported_function" {
    source = "…"

    var1 = "abc"
    var2 = 123
}

# Return values are always a map/object (in the programming sense) on the module.
my_result = module.imported_function.my_return

When writing a "function", you use the variable, locals, and output primitives, which are the same as parameters, function body, and return concepts.

Instead of importing with go.mod/pip/npm/Composer/Bundler/NuGet/maven/whatever, you import via URL and git tag. You pass parameters, then access the result.

Could the syntax be simpler? Maybe/Probably. But probably not much simpler.
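To make the comparison concrete, here is a sketch of what the imported "function" above might look like on the module side. The file contents are hypothetical; var1, var2, and my_return match the module "imported_function" invocation:

```hcl
# Hypothetical main.tf for the imported_function module
variable "var1" {
  type = string
}

variable "var2" {
  type = number
}

locals {
  # the "function body"
  combined = "${var.var1}-${var.var2}"
}

output "my_return" {
  value = local.combined
}
```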

schollii commented 2 years ago

@skyzyx I don't think anyone would argue that importing a function that someone else wrote is easy. In my experience that is a rare need; rather, I often find I would like to refactor expressions (not 20 or 100 lines, just part of a line) that I use in a few places as part of transformations. For that, a module is way more work, and it just does not scale well. If something is a lot of work, people won't use it, and you end up with repetition.

if HCL supported user-defined functions:

  1. I wish I could refactor this expression into a function, because I use it in several places in this loop
  2. move cursor up and write this:
    locals {
        myfunc(a type, b type) -> type = { 
          ...code that uses a, b...
          return something
       }
    }
  3. move cursor back down to where you want to use the function and replace the expression with local.myfunc(a, b)
  4. DONE

With modules:

  1. I wish I could refactor this expression into a function, because I use it in several places in this loop
  2. Navigate to your filesystem in IDE or shell and create folder
  3. create main.tf in that folder
  4. define one variable entry per parameter
  5. write the same code as ...code that uses a, b... in the previous workflow
  6. define one output
  7. Navigate back to the window from step 1
  8. add a module "myfunc" block with values for params, which BTW is N+3 LOC for every "invocation"
  9. if you want to "call" the module in a loop, you need to add a for_each line in the module invocation, and in many cases you will have to rewrite your loop entirely so that the values to be computed can be used in a for_each (you will be duplicating the loop logic)
  10. move cursor back down to where you want to use the "function" and replace the expression with module.myfunc.something
  11. DONE

schollii commented 2 years ago

Here is another syntax, exploiting the similarities between a function body and a locals block:

myfunc {
   args = object({a=string, b=string}) // "args" is reserved keyword in anything but "locals"
   var1 = ...use attributes of args (eg myfunc.args.a), local.whatever...
   var2 = ...use attributes of args, local.whatever...
   return = myfunc.var2 // "return" is reserved keyword in anything but "locals"
} -> type

so defining and using a one-liner function could look like this:

myfunc {
  args = object({a=string, b=string})
  return = "${myfunc.args.a} + ${myfunc.args.b}"
} -> string

locals {
  var1 = {
      for k, v in var.map1: k => myfunc(k, v)
  }
}

The new syntax required is minimal.

To export a function from a module you could use

output "myfunc" {value = myfunc}

although for a module that acts as a library of functions this is onerous; it would be better to have a convention like in Go (hidden by default, capitalize to export) or Python (exported by default, prepend an underscore to hide).

ghost commented 2 years ago

I think we're forgetting the reason why it's bad to use a module as a function: modules are not robust (compared to, say, Go), so any logic that must be maintained for longer than a one-off function becomes unmaintainable. For example, an unbearable dumpster fire of yamlencode is an unacceptable solution. Yes, it works, but the reality is that it's a hack to accommodate a lack of functionality.

The initial reason I opened this issue is that I wanted to export a function that was multi-cloud: one that allocates resources across AWS, GCP, and AliCloud for ZooKeeper, Pulsar, and BookKeeper clusters. Can any of you attest, with a straight face, that a module would be a solution for this?

While it's cool to suggest interim methods to deal with this problem, I'd rather the focus be on why this is essential functionality instead of on ways to just "work with how it is now".

aidan-mundy commented 2 years ago

Before making a fool of myself, please correct me if I am missing something. I am very new to Terraform and I may be using an antipattern or missing some other intended functionality.

That said, here is another example for why this is needed: I want to do some more complex validation on inputs to one of my modules. Validation blocks may not call other modules. The input is an object with some values that are optional(number) and some that are number. If the optional values are not null (and always for the required values), they must be integers, and they must be within a range that is different for each field. Currently, I have to copy-paste the same validation code multiple times. Here is a small excerpt of the condition: (Note: the "block" terminology in this code snippet is not related to Terraform blocks)

    condition = (
      (
        can(parseint(var.block_numbers.private, 10)) &&
        var.block_numbers.private >= 0 &&
        var.block_numbers.private <= 15
      ) && (
        can(parseint(var.block_numbers.public, 10)) &&
        var.block_numbers.public >= 64 &&
        var.block_numbers.public <= 79
      ) && (
        var.block_numbers.kubernetes == null || (
          can(parseint(var.block_numbers.kubernetes, 10)) &&
          var.block_numbers.kubernetes >= 96 &&
          var.block_numbers.kubernetes <= 111
        )
      )
    )

This code seems absurd, to the point that I considered using regex to handle this, but that just seemed even worse and much less explicit.

It may make sense to give each value its own variable instead of being in an object, but that does not get rid of the code duplication.

skyzyx commented 2 years ago

@schollii: I was thinking about things in terms of how Terraform functions at its core today. What are the small adjustments to how Terraform works under the covers that could expose the function-y functionality?

I don't work for HashiCorp and can't speak for them, but I've spent plenty of time poking around at the internals and have written a fair amount of code using their hclsyntax library and I feel like I have a decent understanding where Terraform exists at this point in history.


@aidan-mundy: While it seems like using modules for this is frowned upon by others, if you're looking for a solution in current Terraform, I literally use modules for this problem of "shared validation".

Essentially, I don't use the built-in validation block for this. Instead, I collect the variables into a list/map in locals, then pass each of them through a module using a module-level for_each. The module's one job is to (a) do nothing (if OK), or (b) fail with an error. After I pass the variables through the module, if I'm still alive, then everything passed validation.

Now, I'm supporting a lot of user-provided values, and I need to be able to access more than one variable at a time in order to validate.
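A sketch of this "shared validation" pattern, with hypothetical names. One possible failure mechanism is a precondition on a terraform_data resource (Terraform 1.4+); the module's only job is to abort the plan on bad input:

```hcl
# modules/validate_range/main.tf: the module's one job is to fail on bad input
variable "value" {
  type = number
}

variable "min" {
  type = number
}

variable "max" {
  type = number
}

resource "terraform_data" "check" {
  lifecycle {
    precondition {
      condition     = var.value >= var.min && var.value <= var.max
      error_message = "value ${var.value} is not between ${var.min} and ${var.max}."
    }
  }
}

# Caller: validate several values at once with a module-level for_each
# module "checks" {
#   source   = "./modules/validate_range"
#   for_each = {
#     private = { value = var.block_numbers.private, min = 0,  max = 15 }
#     public  = { value = var.block_numbers.public,  min = 64, max = 79 }
#   }
#   value = each.value.value
#   min   = each.value.min
#   max   = each.value.max
# }
```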


@withernet: I'm going to boldly say that multi-cloud isn't a real thing. Now let me explain what I mean.

The APIs for each of these cloud providers are different, and even in equivalent services, the functionality of those services is often different or incomplete when comparing one against the other. I believe this is why those "fog" libraries from 10–15 years ago all failed. For example, Google Cloud Storage straight-up copied the Amazon S3 API at the time, but since then the features of those services have diverged.

Because of the way the providers expose resources, I don't expect that a single set of code (whether a re-usable module or any other Terraform code) supporting multiple clouds at once based on something like cloud = "aws" is possible as Terraform exists today. In reality, you'd need to write three sets of HCL to support three different cloud providers, assuming you want to stand up a stack in one cloud and then turn around and stand up the same stack in a different cloud. (Standing up a single stack with resources spread across different clouds, by contrast, poses no issue.)

But having a single set of code where the public interfaces (variables + outputs) are identical across clouds? I am extremely skeptical about someone being able to pull that off in a meaningful, production-ready way.


All: Lastly, there's a question of "maybe this isn't the right tool for the job?" By which I don't mean Terraform, but rather HCL. Back when I started working on some of my modules, it was with Terraform 0.9 and Terraform 0.10. It wasn't powerful enough at the time to do the things I needed it to do, so I switched to using a programming language to generate the HCL I needed.

Even now, there is a case in one of my modules (for New Relic monitoring) where the module itself can't deduce certain information on its own. So I wrote a small Go program to hit some APIs, look up the data I need, generate a map of that data, use hclwrite to generate an HCL tree from that data, and write it to disk inside the module directory. Whenever I go to run the test suite or do a release build of the module (which other teams consume), I re-run the script to pick up the latest data and add it to the module's repo.

“When all you have is a hammer, everything looks like a nail.”

Perhaps looking elsewhere in your toolbox will allow you to find a better tool for the job so that you can get your work done, rather than being bothered that Terraform isn't the tool you want it to be, all by itself.

¯\_(ツ)_/¯

ghost commented 2 years ago

@aidan-mundy there is actually a bug (or feature request) I've reported to make it more useful. Basically, in its current state it's more of a problem than it's worth: #28344.

@skyzyx yes, the cloud APIs have different implementations. However, I'm not talking about making a single "universal" provider with a single universal function that supports all cloud provider APIs. We have provisioning requirements that sit between the provisioning and configuration layers, so there are no cloud API requirements. A function that can be used across all of them, which I can integrate into our configuration, would be "meta" and usable between cloud platforms. You can think of this as additional "vendor" support. It's unrealistic to think vendors will only extend Terraform as providers.

aidan-mundy commented 2 years ago

@withernet Unfortunately, this does not appear to be related to my problem. I want the default to be null, and that ticket appears to only be for nested objects, which I do not have in this usecase.

@skyzyx I appreciate the suggestion, and may look into it in the future if this becomes a bigger pain point. That said, it seems horribly clunky and is not explicit about intent. To have to use a totally different functionality in the language to validate input variables when there is a "validation" block specifically for input variables doesn't make too much sense.

prologic commented 2 years ago

Looks like the underlying DSL used here, HCL (HashiCorp Configuration Language), actually supports user-defined functions -- so it might just be a matter of integrating this with Terraform itself (as the calling application).
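For reference, HCL's ext/userfunc extension (available to applications embedding HCL, but not wired into Terraform) lets the host application decode function blocks along these lines; the function name and body here are purely illustrative:

```hcl
function "trim_domain_name" {
  params = [name]
  result = trimsuffix(substr(name, 0, 62), "-")
}
```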

aidan-mundy commented 2 years ago

@prologic hmmm, that looks deceptively easy...

prologic commented 2 years ago

Just wanted to say that without custom user-defined functions in Terraform, this is the kind of butt-ugly thing I've had to do to drive the input of one resource from the output of another:

resource "swarm_cluster" "cluster" {
  dynamic "nodes" {
    for_each = concat(
      digitalocean_droplet.manager,
      digitalocean_droplet.worker,
      digitalocean_droplet.storage,
    )
    content {
        hostname = nodes.value.name
        tags = {
          "role" = contains(nodes.value.tags, "role:manager") ? "manager" : "worker",
          "labels" = join(
            "&",compact(
              [
                for tag in nodes.value.tags : (
                  contains(split(":", tag), "label") ? format("%s=%s", split(":", tag)[1], split(":", tag)[2]) : ""
                )
              ]
            )
          )
        }
        public_address  = nodes.value.ipv4_address
        private_address = nodes.value.ipv4_address_private
    }
  }
  lifecycle {
    prevent_destroy = false
  }
}

I hope this highlights the importance of this feature, which AFAICT is already baked into HCL itself.

redbaron commented 1 year ago

For any kind of table-driven configuration, this is an absolute must-have.

speller commented 1 year ago

Any updates on this? We have repeated routines that really belong in a custom function to keep the code clean and easily maintainable:

  subdomain_name = trimsuffix(substr(replace(var.infra_name, "_", "-"), 0, 64 - length(var.tld) - 2), "-") # FQDN can not be longer than 64 chars because of SSL cert https://docs.aws.amazon.com/acm/latest/APIReference/API_RequestCertificate.html
  srv1_subdomain_name = trimsuffix(substr(replace("srv1-${var.infra_name}", "_", "-"), 0, 64 - length(var.tld) - 2), "-")   
  srv2_subdomain_name = trimsuffix(substr(replace("${var.srv2_name}-${var.infra_name}", "_", "-"), 0, 64 - length(var.tld) - 2), "-")
  srv3_subdomain_name = trimsuffix(substr(replace("${var.srv3_name}-${var.infra_name}", "_", "-"), 0, 64 - length(var.tld) - 2), "-")

Ideally, this should look like this:

  subdomain_name = custom_function_trim_domain_name(var.infra_name)
  srv1_subdomain_name = custom_function_trim_domain_name("srv1-${var.infra_name}")   
  srv2_subdomain_name = custom_function_trim_domain_name("${var.srv2_name}-${var.infra_name}")
  srv3_subdomain_name = custom_function_trim_domain_name("${var.srv3_name}-${var.infra_name}")
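Until something like this exists, the closest approximation today is the submodule workaround discussed earlier in this thread. A sketch with hypothetical names, lifting the repeated expression into one place:

```hcl
# modules/trim_domain_name/main.tf
variable "name" {
  type = string
}

variable "tld" {
  type = string
}

output "result" {
  # FQDN cannot exceed 64 chars because of the SSL cert limit
  value = trimsuffix(substr(replace(var.name, "_", "-"), 0, 64 - length(var.tld) - 2), "-")
}

# Caller: one module block per "invocation" (or a for_each over a map of names)
# module "subdomain" {
#   source = "./modules/trim_domain_name"
#   name   = var.infra_name
#   tld    = var.tld
# }
# ...then use module.subdomain.result
```
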

couling commented 1 year ago

@skyzyx

But the language in this thread suggests that there's a belief that creating a module is a bunch of extra work/overhead, which is quite simply false.

Sorry for the late response, but this is worth discussing. I think you're being disingenuous; frankly, it is more work and overhead, by way of making the code less maintainable.

The reasons are relatively well documented in the software world with regard to languages generally.

Simply put: functions and objects have two completely different use cases, and trying to express one as the other leads to code that works but becomes unmaintainable too easily.

I'm not passing comment on your own code, @skyzyx. YMMV.

My own experience is that adding excessive modules and defining everything in expressions instead of step-by-step functions leads to complexity² (complexity squared).


In an attempt to give a glimpse into what I mean:

foo(bar(baz(bob(bonno))))

The fact that this is concise isn't the main advantage. This code shows a clear execution order at a glance; the full wiring is easy to understand. Obviously you can go too far the other way, and single expressions also rapidly become too hard to read.

Putting the same thing in modules and maintaining it for a while:

module "c" {
  source = "../functions/bar"
  c = module.f.y
}

module "a" {
  source = "../functions/bob"
  a = bonno
}

module "d" {
  source = "../foo"
  d = module.c.z
}

module "b" {
  source = "../functions/baz"
  f = module.c.z
}

module "f" {
  source = "../functions/baz"
  f = module.a.x
}

At first glance you have next to zero knowledge of what this does or what the execution order is.

It's hard to even notice that a "spare" module got left behind by previous maintenance. Could you even find the "spare" in only a couple of seconds, now that I've told you?

schollii commented 1 year ago

Not sure why I didn't link #28339 to this, here it is now.

It pushes my comment https://github.com/hashicorp/terraform/issues/27696#issuecomment-808201319 a little further, and that comment got a lot of upvotes, which suggests that #28339 may be a good way to advance the discussion.

celik0311 commented 1 year ago

Looks like the underlying DSL used here, HCL (HashiCorp Configuration Language), actually supports user-defined functions -- so it might just be a matter of integrating this with Terraform itself (as the calling application).

Are there any examples of using this method?

esdccs1 commented 1 year ago

so is this article making stuff up https://www.devopsschool.com/blog/detailed-guide-for-how-to-write-a-custom-function-in-terraform/

Or is it now possible to define custom functions in our terraform code?

schollii commented 1 year ago

I think it is rubbish. It seems like the author based it on a ChatGPT hallucination.

Plus, the mechanism doesn't actually make sense as described: it says to put the code in a .tf file and that any language can be used, yet there is no mention of which language is used in the example .tf file shown. Also see https://developer.hashicorp.com/terraform/language/functions, which currently states: "The Terraform language does not support user-defined functions". So the only way the article could be true is as yet-to-be-announced functionality.

I posted a comment, I'll see if it gets rejected.

apparentlymart commented 1 year ago

Indeed, that article is describing language features that do not exist and never have existed.

couling commented 1 year ago

I reached out to the above website, and the author agreed to take down the page, claiming it was "half cooked content and work was in progress". IMHO it looked like AI-generated content. For future readers curious about the article's content, it can still be found via the Wayback Machine (Internet Archive).

Satak commented 8 months ago

Is this now finally coming?

(screenshot omitted)

crw commented 8 months ago

@Satak It is being explored, but there is not a final design for it yet. Thanks!

jbardin commented 3 months ago

Closed via #34394

martinrohrbach commented 3 months ago

@jbardin Correct me if I'm wrong, but that "only" adds provider-defined functions to Terraform. It sounds great, but that's not really what was discussed here, is it? The original intent, it seems to me, was to define functions in the Terraform code itself.

I would love to use functions as well; that's why I subscribed. But sadly I am not able to write a Terraform provider myself.

g13013 commented 3 months ago

@jbardin Even though that solution is great, this issue should be reopened, as it does not address what was asked for here!

jbardin commented 3 months ago

Hello all,

We can re-open the issue, but it would be good to make sure the issue matches expectations. HCL is not a general-purpose programming language, so if there were an ability to define and call functions within the language, it would likely be quite limited. There is still value in partial application and function currying, however, but that might be the extent of what we could expect natively. The RPC layer is where we've chosen to extend the function interface, and will be what's available for the foreseeable future.
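For readers landing here later: provider-defined functions shipped through that RPC mechanism in Terraform 1.8. A call looks roughly like the sketch below; the corefunc provider and its str_kebab function are one third-party example of such a function provider, used here purely for illustration:

```hcl
terraform {
  required_providers {
    corefunc = {
      source = "northwood-labs/corefunc"
    }
  }
}

output "kebab" {
  # provider-defined function call syntax: provider::<name>::<function>(args)
  value = provider::corefunc::str_kebab("Hello World")
}
```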

markus-ap commented 1 month ago

What I most often miss when writing Terraform is custom functions for maintaining my naming conventions. Almost every resource has a name, and most people probably have some convention for how they name them.

Custom functions for handling strings, ints, and other primitives that are often used as inputs to resource arguments seem like a nice limit. They cannot have any side effects; they simply return something like a string or int for the user to use as an argument.

resource "azurerm_user_assigned_identity" "identity" {
  name                = follow_naming_convention(local.system, var.environment, local.subsystem, "id")
  ...
}

schollii commented 1 month ago

resource "azurerm_user_assigned_identity" "identity" {
  name                = follow_naming_convention(local.system, var.environment, local.subsystem, "id")
  ...
}

Great example: format() provides a nice way to combine these, BUT if you use the same format call in many places, you don't have DRY code: if you want to change the format, you have to find every place it is used and change each one identically. And clearly, creating a custom function in a provider is not practical for this. You need a custom function you can define in the Terraform code itself.

apgmckay commented 1 month ago

In the case given above referring to DRY code, you can just assign your format call to a local and use the local.

schollii commented 1 month ago

@apgmckay no you can't, because different values may be passed to the function while adhering to the same format. You would need one local per resource that you want to name according to the desired convention. Not practical.

apgmckay commented 1 month ago

@schollii indeed, apologies, I misunderstood your given example.

The way I have handled this before is to use a data source and take a hash of all or part of the input contents.

But yes, some form of custom function would be nice, especially if you could form them as composites of the existing Terraform functions. I think that would guarantee they are pure functions 🤔

swordfish444 commented 1 month ago

@jbardin Are there any new updates regarding this feature? This feels like a huge opportunity for HashiCorp. I'm considering opening an MR in OpenTofu to introduce this.

crw commented 4 weeks ago

@swordfish444 per https://github.com/hashicorp/terraform/issues/27696#issuecomment-1986479588

The RPC layer is where we've chosen to extend the function interface, and will be what's available for the foreseeable future.

This is unlikely to change in the near-to-medium future.