hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/
Other
42.58k stars 9.54k forks source link

Validate each list item in a Custom Validation Rule condition #26205

Open piersf opened 4 years ago

piersf commented 4 years ago

Current Terraform Version

Terraform v0.13.1

Use-cases

With the release of Terraform v0.13.1 we started using a lot Custom Validation Rules to check whether an input value for a variable has the expected value/format.

We've noticed that with Custom Validation Rules there is no way(from what I've seen) to validate that all items of a list variable have an expected value or format.

Attempted Solutions

I have tried the below solutions to validate that each item of a list holding CIDR blocks conforms to the CIDR block format of x.x.x.x/x.

The below does not work as the variable var.private_subnets is a list and it would require to loop over each list item. But i can't find a way to do that in a validation block. If, however, I specifically specify the index of an item like var.private_subnets[0] in the condition attribute, it works fine but obviously that will check only one of the items. I could of course specify the validation block multiple times, one time for each list item, but that is not dynamic at all.

variable "private_subnets" {
  type = list
  default = ["10.10.108.0/25", "10.10.108.128/26", "10.10.108.192/26"]

  validation {
    condition     = can(regex("^([0-9]{1,3}\\.){3}[0-9]{1,3}(\\/([0-9]|[1-2][0-9]|3[0-2]))?$", var.private_subnets))
    error_message = "Each item of the 'private_subnets' List must be in a CIDR block format. Example: [\"10.106.108.0/25\"]."
  }
}

Is there currently a way to achieve what I am trying to do?

Proposal

If the above is not possible currently, how about adding support and make it possible to have a validation block be able to check somehow all list items? One way I thought this could be done is to make a validation block be dynamic so we can work then with the for_each blocks. An idea would be something like the below, which does not work when I tried it(throws errors about unsupported blocks):

variable "private_subnets" {
  type = list
  default = ["10.10.108.0/25", "10.10.108.128/26", "10.10.108.192/26"]

  dynamic "validation" {
    for_each = var.private_subnets
    content {
      condition     = can(regex("^([0-9]{1,3}\\.){3}[0-9]{1,3}(\\/([0-9]|[1-2][0-9]|3[0-2]))?$", private_subnets.value))
      error_message = "Each item of the 'private_subnets' List must be in a CIDR block format. Example: [\"10.106.108.0/25\"]."
    }
  }
}

Thank you!

apparentlymart commented 4 years ago

Hi @piersf! Thanks for sharing this use-case.

A way to get this done with current Terraform is to write a for expression in the condition argument, like this:

  condition = can([for s in var.private_subnets : regex("^([0-9]{1,3}\\.){3}[0-9]{1,3}(\\/([0-9]|[1-2][0-9]|3[0-2]))?$", s)]

Because the for expression as a whole will fail if any of its repeated evaluations fail, can will still return false in that case.

What this does not do is allow the error message to give details about which of the elements is invalid, which is non-ideal. Perhaps something like what you proposed here would allow Terraform to give better feedback about which element was invalid, though it would still be limited only to reporting errors in the first level of the variable's data structure, rather than at arbitrary points in the data structure, so we may wish to investigate a more complete solution if we can find one that doesn't result in an explosion of complexity.

piersf commented 4 years ago

@apparentlymart thank you for the example! It works quite well. For the time being, we are good if the error message does not say which element is invalid. That is a minor thing.

kiddom-kq commented 3 years ago

I am a +1 for this as well. My use case is very similar to @piersf:


variable "security_groups" {
  type        = list(string)
  description = "(optional) list of security groups to apply to spun up nodes"
  default     = []

  # Ideally, each string in the list would match the pattern "sg-[alphnumeric]". Since we don't know if AWS will
  # every increase the number of digits in an SG-id, using length() may not age well. Regex() is likely to be good enough
  # if the expression contains a minimum length requirement...
}

I think that the for inside of can() with regex() doing the heavy lifting will work for my simple use-case but I have a few other variables that take in a list of objects where each object is expected to have a few fields:

dynamic "some_block" {
    for_each = var.some_list_of_maps
    content {
      property_1      = lookup(some_block.value, "property_1", null)
      <...etc...>
    }
}

We have decent automated documentation that gives the user a clue as to which keys need to go in the map and what the value should be (int, string...etc) but there's no way to require that each map in the list have only the keys property_1 and property_2 and property_3. Ideally, for each map object in the list: property_1 would map to a string and property_2 would map to a positive integer and property_3 would be a positive integer at least 2x bigger than property_2.

grahamhar commented 3 years ago

I am a +1 for this as well. My use case is very similar to @piersf:


variable "security_groups" {
  type        = list(string)
  description = "(optional) list of security groups to apply to spun up nodes"
  default     = []

  # Ideally, each string in the list would match the pattern "sg-[alphnumeric]". Since we don't know if AWS will
  # every increase the number of digits in an SG-id, using length() may not age well. Regex() is likely to be good enough
  # if the expression contains a minimum length requirement...
}

I think that the for inside of can() with regex() doing the heavy lifting will work for my simple use-case but I have a few other variables that take in a list of objects where each object is expected to have a few fields:

dynamic "some_block" {
    for_each = var.some_list_of_maps
    content {
      property_1      = lookup(some_block.value, "property_1", null)
      <...etc...>
    }
}

We have decent automated documentation that gives the user a clue as to which keys need to go in the map and what the value should be (int, string...etc) but there's no way to require that each map in the list have only the keys property_1 and property_2 and property_3. Ideally, for each map object in the list: property_1 would map to a string and property_2 would map to a positive integer and property_3 would be a positive integer at least 2x bigger than property_2.

@kiddom-kq Would this work for you (tested in terraform 1.0.1 only) using an object you can define which fields have to passed and the type required. then there is only a need to validate the numeric constraint?

variable "example" {
  type = object({
    property_1 = string
    property_2 = number
    property_3 = number
  })
  description = "A thing"
  validation {
    condition = var.example.property_2 >= 0 && tostring(floor(var.example.property_2)) == tostring(var.example.property_2)
    error_message = "You must provide a property_2 as a positive int."
  }
  validation {
    condition = var.example.property_3 >= 0 && floor(var.example.property_3) == var.example.property_3 && var.example.property_3 / var.example.property_2 >= 2
    error_message = "You must provide a property_3 key and it must be at least 2x the value of property_2."
  }
}
patpicos commented 1 year ago

Hi @piersf! Thanks for sharing this use-case.

A way to get this done with current Terraform is to write a for expression in the condition argument, like this:

  condition = can([for s in var.private_subnets : regex("^([0-9]{1,3}\\.){3}[0-9]{1,3}(\\/([0-9]|[1-2][0-9]|3[0-2]))?$", s)]

Because the for expression as a whole will fail if any of its repeated evaluations fail, can will still return false in that case.

What this does not do is allow the error message to give details about which of the elements is invalid, which is non-ideal. Perhaps something like what you proposed here would allow Terraform to give better feedback about which element was invalid, though it would still be limited only to reporting errors in the first level of the variable's data structure, rather than at arbitrary points in the data structure, so we may wish to investigate a more complete solution if we can find one that doesn't result in an explosion of complexity.

Another approach is to use the alltrue function. Example from my module

  validation {
    condition = alltrue(flatten([
      for subnet in var.subnets : [
        for rule in subnet.nsg_security_rules :
        rule.priority >= 200 && rule.priority <= 4000
      ]
    ]))
    error_message = "NSG Security rule priorities should be between 200 and 4000 (inclusive). Ranges outside this have been reserved"
  }
apparentlymart commented 1 year ago

Thanks for sharing that @patpicos! I think most of the discussion above happened before the addition of alltrue, but indeed that function is a good way to aggregate a dynamic number of boolean results.

apparentlymart commented 1 year ago

I have a feeling that we might have a duplicate or overlapping issue for this somewhere because I remember sharing another possible future design idea somewhere else, which I'm going to repeat here just so we can find it again if we revisit this issue.

Since the original implementation of validation we've gradually loosened what is allowed in its arguments, including the possibility of more complex expressions in error_message which means that in principle you can write an expression to generate a more detailed error message.

However, folks rarely do that in practice because it tends to involve duplicating large parts of the condition expression but modified slightly to produce a collection of invalid values rather than a collection of boolean results to use with alltrue.

One way to address that annoyance would be to allow for more tightly-scoped local values, including specifically inside a validation block in which case the value expressions could refer to anything that the condition and error_message arguments would normally be allowed to refer to, along with other local values that are defined inside the same validation block:

  validation {
    # INVALID: This is a hypothetical future language feature,
    # not supported in today's Terraform.
    locals {
      subnet_rules = flatten([
        for subnet in var.subnets : [
          for rule in subnet.nsg_security_rules : {
            subnet_id = subnet.id
            rule      = rule
          }
        ]
      ])
      invalid_priority = toset([
        for rule in local.subnet_rules : rule.subnet_id
        if rule.priority < 200 || rule.priority > 4000
      ])
    }
    condition = length(local.invalid_priority) == 0
    error_message = [
      for subnet_id in var.invalid_priority :
      "Subnet ${subnet_id} has a rule with an invalid priority. All rule priorities must be between 200 and 4000 inclusive."
    ]
  }

The above also introduces the idea of error_message being allowed to be a sequence of error messages, which perhaps Terraform would then emit as multiple separate diagnostics rather than just one diagnostic. But that's not really the main point here; the main thing is to introduce the concept of multiple local value scopes and treat local.foo references differently depending on where they appear, binding to the closest appropriate scope.

That is a non-trivial feature to implement -- currently local values are just a single flat table per module -- but back in Terraform v0.12 we did reserve the name locals inside resource and data blocks in anticipation of possibly supporting per-instance local values there too, so it might be worth the effort to introduce this more general building block to help solve a number of different problems using a single concept.

As far as I know nobody has actually attempted a feasibility prototype of the above idea yet, so this isn't a promise that something like the above is planned or definitely possible, but I personally think it's worth exploring.