Can't index from schema.Set

jeremy-asher commented 8 years ago

Terraform Version

Terraform v0.7.7

Affected Resource(s)

Any resources with list or map attributes containing list or map values. This is particularly bad for the aws_ami data source's block_device_mappings attribute.

Terraform Configuration Files

provider "aws" {
  region = "us-west-2"
}

data "aws_ami" "test" {
  name_regex  = "ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-.*"
  owners      = ["099720109477"]
  most_recent = true
}

# Use square brackets to get the 2nd element of block_device_mappings then use
# lookup to get the value of "device_name" from the returned map
output "test1" {
  value = "${lookup(data.aws_ami.test.block_device_mappings[1], "device_name")}"
}

# As test1, but using element instead of square brackets
output "test2" {
  value = "${lookup(element(data.aws_ami.test.block_device_mappings, 1), "device_name")}"
}

# Use square brackets to get the 2nd element of block_device_mappings then use
# lookup to get the value of "ebs" from the returned map (itself a map), then
# use lookup again to get the value of "snapshot_id"
output "test3" {
  value = "${lookup(lookup(data.aws_ami.test.block_device_mappings[1], "ebs"), "snapshot_id")}"
}

Debug Output

https://gist.github.com/jeremy-asher/ff97ae0c053cc3b6134b8a8485717595

Expected Behavior

The block_device_mappings attribute is processed as a nested list of maps with values being a mixture of strings and maps. The outputs are returned as follows:

Outputs:

test1 = /dev/sdb
test2 = /dev/sdb
test3 = snap-da82e79f

Actual Behavior

The interpolation of the attribute produces an empty list and no outputs are produced.

Steps to Reproduce

terraform apply

b-dean commented 8 years ago

It appears that you can get at them with the hashes for the block_device_mappings:

output "test4" {
  value = "${data.aws_ami.test.block_device_mappings.2547816212.ebs.snapshot_id}"
}

That hash is calculated here: https://github.com/hashicorp/terraform/blob/v0.7.8/builtin/providers/aws/data_source_aws_ami.go#L453-L478

But this doesn't really seem like an acceptable way to access this stuff. I want to have the data.aws_ami so I can find an ami and get some information about it. If I have to already know the hash of this block device information, then what's the point? I might as well just put the snapshot id in a variable.

There must be some correct way in HCL to access items in Sets.

jeremy-asher commented 8 years ago

Yes, it's interesting that the hash lookup succeeds, but not particularly useful.

mitchellh commented 8 years ago

So I think the issue here is that block_device_mappings is a set and not a list. A set by definition has no ordering so accessing it with [1] is an uncertain operation: what is the "1st" element in a set?

I think what would be better is if our output was perhaps a computed map and the key was the device name. Unfortunately we don't support maps with complex values at the moment, though. :(

b-dean commented 8 years ago

What would be the consequences of having amiBlockDeviceMappingHash just return m["device_name"]?

mitchellh commented 8 years ago

That is saying that the uniqueness of the entire value is determined by only the device name. That's a bit out of my understanding since I think the uniqueness is determined by a bit more but I'll lean on the AWS provider folks for that.

jeremy-asher commented 8 years ago

So I think the issue here is that block_device_mappings is a set and not a list. A set by definition has no ordering so accessing it with [1] is an uncertain operation: what is the "1st" element in a set?

It seems like sets get interpolated like lists. I just noticed a list_of_map attribute was added to test_resource so here's another example (using 0.7.10):

Config

resource "test_resource" "test" {
  required = "string"
  required_map = {
    foo = "bar"
  }

  set = ["foo", "bar"]
  list_of_map = [
    {
      foo = "bar"
    },
    {
      bar = "baz"
    }
  ]
}

output "test1" {
  value = "${test_resource.test.set[0]}"
}
output "test2" {
  value = "${test_resource.test.list_of_map[0]}"
}

Apply run

test_resource.test: Creating...
  computed_list.#:              "" => "<computed>"
  computed_map.%:               "" => "<computed>"
  computed_read_only:           "" => "<computed>"
  computed_read_only_force_new: "" => "<computed>"
  computed_set.#:               "" => "<computed>"
  list_of_map.#:                "" => "2"
  list_of_map.0.%:              "" => "1"
  list_of_map.0.foo:            "" => "bar"
  list_of_map.1.%:              "" => "1"
  list_of_map.1.bar:            "" => "baz"
  optional_computed_force_new:  "" => "<computed>"
  optional_computed_map.%:      "" => "<computed>"
  required:                     "" => "string"
  required_map.%:               "" => "1"
  required_map.foo:             "" => "bar"
  set.#:                        "" => "2"
  set.1996459178:               "" => "bar"
  set.2356372769:               "" => "foo"
test_resource.test: Creation complete

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

The state of your infrastructure has been saved to the path
below. This state is required to modify and destroy your
infrastructure, so keep it safe. To inspect the complete state
use the `terraform show` command.

State path: terraform.tfstate

Outputs:

test1 = bar

Log

2016/11/11 08:44:00 [WARN] Output interpolation "test2" failed: At column 3, line 1: list "test_resource.test.list_of_map" does not have any elements so cannot determine type. in:

${test_resource.test.list_of_map[0]}

mitchellh commented 8 years ago

@jeremy-asher I believe that's only if you manually set the set since then we can assume the ordering you want is what you explicitly provided. I am actually surprised this worked though.

jeremy-asher commented 8 years ago

@mitchellh Is resolving the list of maps type case in the works at all? I'd be willing to look into it, but it's a reasonably deep issue from what I've seen looking through the interpolation code. In the short term, do you know of any workarounds for this that would allow #9891 to work?

jeremy-asher commented 7 years ago

@jbardin thanks for the fix, though it doesn't seem to fully cover the original case in the description. I did some testing on master and it looks like the fact that block_device_mappings is a set causes part of the problem. I copied the schema into test_resource and changed it to a List. That fixed the interpolation, but it wasn't possible to use lookup or square bracket indexing to dig ebs out of the structure, let alone any of it's sub keys. I'm guessing this will also require a change to the interpolation functions or to the HIL definition to allow sequential square bracket lookups.

jbardin commented 7 years ago

@jeremy-asher,

Sorry, this auto-closed but I agree there's a couple of related issues. The first being block_device_mappings is a schema.Set which is unordered and not meant to be indexed. Since a Set's order is deterministic, we may be able to properly support indexing, and I'll reopen this for that specific issue. (Another option is to change the data structures to lists and maps now that we can nest them, but in some cases a Set may still be more appropriate. This might even need to be whatever type we derive for the "nested resources" problem.)

Being able to chain multiple index operations is a separate issue for HIL.

arhea commented 7 years ago

I ran into this issue today and +1 for making the block_device_mappings more accessible. A little description of my use case:

Packer creates a set of AMI's that I dynamically query with Terraform
I would like to the keep the configuration of the EBS volume sizes, types, names within Packer and just consume those values within Terraform.

Additional Request:

The ability to loop through all the block devices on an AMI and dynamically create a launch configuration.

~ terraform version
Terraform v0.8.1

jeremy-asher commented 7 years ago

FYI, @jbardin #11042 does lead to a bit of progress here. I've got an additional test case that now works which probably indicates things will work once some HIL changes are made.

With this output added to the code in the description:

output "test4" {
  value = "${data.aws_ami.test.block_device_mappings[1]}"
}

On 0.8.2 results in:

Outputs:

test4 =

On 0.8.3 results in:

Outputs:

test1 = /dev/sda1
test4 = {
  device_name = /dev/sda1
  ebs = map[encrypted:0 delete_on_termination:1 volume_type:gp2 iops:0 snapshot_id:snap-0013c62804ea13d4f volume_size:8]
  no_device =
  virtual_name =
}

This means the string value in a set of map can now be interpolated. The map value probably only fails due to the flat map restriction. Adding support for multiple square bracket lookups in series would probably also work.

jeremy-asher commented 7 years ago

Thanks for your work on hashicorp/hil#42 @apparentlymart. That looks like the HIL side of the fix for this issue.

jeremy-asher commented 7 years ago

Looks like the above HIL PR was not merged. @apparentlymart have there been any other changes to allow the index operator to work for this kind of case?

apparentlymart commented 7 years ago

Hi @jeremy-asher,

That PR did not directly address the issue described here as far as I can tell, because the problem still remains that a set element has no "index" or "key" with which to look it up.

I don't have a quick short-term fix for this one, but we've been working on some early design/discovery for various improvements to the configuration language, and I think some of the stuff we have in there would get to what you need here, by allowing filtering down to a single-element set based on some condition (e.g device_name == "/dev/sda1") and then converting the result to a single-element list which could then be indexed with [0]. I don't have concrete details on this to share right now, but I'll make a note of this use-case so we can remember to consider it as we get more into specifics.

jeremy-asher commented 7 years ago

@apparentlymart,

I'm not sure if it's intended, but I have been able to index into the set like a array using the square bracket notation (see my "test4" example above). This attribute may or may not make sense to be implemented as a set, but even if it wasn't a set, there's no supported language feature to get 3 levels deep to the data I want (see "test3") as far as I know which was why I originally opened this issue.

apparentlymart commented 7 years ago

Hi again, @jeremy-asher!

The fact that there's no predictable ordering for sets makes that risky even if it is working, but I suppose as long as you only have a one-element set it's fine... I worry a little that this might get broken by future language changes, since that operation is only really working by accident now. :confounded:

Being able to index multiple times, like foo[0][1][2] etc is indeed what hashicorp/hil#42 was about, though it turned out that this change had some unintended consequences and so instead of continuing to fight the existing system incrementally we'd decided to start looking at the more holistic design work I mentioned before. Indexing the result of an index is definitely one of the big things on the list, since it's the use-case that got us moving down this path in the first place.

Unfortunately I have to just ask for some patience while we work through this, since the existing interpolation language has some assumptions deep in its design that make this tricky, and so we need to do more foundational work before we can get there.

jeremy-asher commented 7 years ago

@apparentlymart okay thanks for the update. I appreciate that it's a pretty big change, but I wanted to check whether it was still on the agenda. Agreed that the set indexing is probably a bad idea regardless. Is there any reason why these various attributes are sets in the first place? I suspect AWS treats them as having ordering, but I could be mistaken.

apparentlymart commented 7 years ago

Indeed it's likely a historical error that the ebs volumes on the data sources are sets, originating from the fact that the data source schemata were based on the resource schemata, and AWS apparently doesn't apply ordering to these things. (There is an implied ordering by the device_name.)

One thing we probably could do here is to make the data sources sort by something dependable -- in this case, the device_name -- and make it a list. Since we never have to diff data sources against config, it's less important get the ordered-ness right for data sources. I think we could probably do that without a breaking change too, since again we don't diff data sources against config so we can easily change their state representation.

At this point it feels like this issue is representing a few different things. We should probably split it out into a few different issues since they are all legitimate but different levels of effort. If the answer is to change the data sources to use lists then that's something we'd do over in the aws provider repo.

I'm not sure which one of these to consider this issue to be. If you have a preference then let me know and we can make the other two. :grinning:

jeremy-asher commented 7 years ago

Yes, that was a bit off topic, but it could help a bit. I'm more concerned about the core language changes which I'm sure will take longer to implement. It sounds like that's not quite at the stage of opening a GitHub issue, though. Was there another component to this issue you were referring to?

Regarding the attribute types, there are a number of other AMI related (non-data) resources with similar attributes. I initially encountered this issue with the aws_ami_copy, though I used the data source in the example. It's also worth noting that it's unlikely or possibly impossible that anyone is using these attributes due to this issue anyway.

evansj commented 6 years ago

Terraform v0.11.7
+ provider.aws v1.9.0
+ provider.null v1.0.0

I found this issue while trying to work out how to parse aws_ami block_device_mappings. My intention was to workaround another "bug", whereby my ec2 instance gets recreated every time because I haven't specified a snapshot_id in my ebs_block_device block. I thought that I'd just be able to query the correct snapshot_id from the aws_ami data I'm already using.

The closest I've come so far is by returning "${jsonencode(data.aws_ami.ecs_ami.block_device_mappings)}", which looked useful. This is what I got, cleaned up and formatted:

[{
  "device_name": "/dev/xvda",
  "ebs": {
    "delete_on_termination": "1",
    "encrypted": "0",
    "iops": "0",
    "snapshot_id": "snap-05195bed51380c7b1",
    "volume_size": "8",
    "volume_type": "gp2"
  },
  "no_device": "",
  "virtual_name": ""
}, {
  "device_name": "/dev/xvdcz",
  "ebs": {
    "delete_on_termination": "1",
    "encrypted": "0",
    "iops": "0",
    "snapshot_id": "snap-0046522873778f9e0",
    "volume_size": "1024",
    "volume_type": "gp2"
  },
  "no_device": "",
  "virtual_name": ""
}]

This proves that the data is there. The value I want is the snapshot_id for the /dev/xvdcz device. Everything I've tried to actually index into this data structure seems to fail though.

Firstly, I can't tell terraform to ignore the change in snapshot_id because this syntax doesn't work:

resource "aws_instance" "foo" {
  ebs_block_device {
    device_name = "/dev/xvdcz`
   [... other properties ...]
  }
  [...]
  lifecycle {
    ignore_changes        = ["ebs_block_device.*.snapshot_id"]
  }
}

I can't just use ignore_changes = ["ebs_block_device"] because this is a module, and I do want to recreate the resource if for example I change the size of the device.
I can't just return the json and then extract the value I need from that because there's no jsondecode() function.
I can't output the whole Map (Set?) from my module, and then index into it with element():

* module.ecs_cluster.var.docker_volume_snapshot_id: element: element() may only be used with flat lists, this list contains elements of type map in:

${element(module.ecs_ami.block_device_mappings, 0)}

It looks like it might work to index with [0] but then the value I get back seems to be useless:

locals {
  first_block_device_mapping = "${module.ecs_ami.block_device_mappings[0]}"
}

[...]

  docker_volume_snapshot_id = "${local.first_block_device_mapping["device_id"]}"

Error: Error refreshing state: 1 error(s) occurred:

* module.ecs_cluster.var.docker_volume_snapshot_id: At column 35, line 1: map "local.first_block_device_mapping" does not have homogenous types. found TypeMap and then TypeString in:

${local.first_block_device_mapping["device_id"]}

Is there any way for me to get the snapshot_id for the /dev/xvdcz device?

b-dean commented 6 years ago

All this information is easily obtained from the AWS cli, so if you have that available you could have a script run aws cli to look up your AMI, get the block device mappings and format it with jq. Then just use that in your terraform with a data.external

#!/usr/bin/env bash
set -euo pipefail

aws ec2 describe-images --image-id "$1" | jq -r '[.Images[0].BlockDeviceMappings[] | {(.DeviceName) : .Ebs.SnapshotId}] | add'

and then in your terraform:

data "aws_ami" "foo" {
  # .. some stuff to find the right AMI
}

data "external" "ami_snapshots" {
  program = ["${path.module}/files/ami_snapshots.sh", "${data.aws_ami.foo.id}"]
}

resource "aws_instance" "bar" {
  ami = "${data.aws_ami.foo.id}"

  ebs_block_device {
    device_name = "/dev/xvdb"
    snapshot_id = "${data.external.cache_snapshots.result["/dev/xvdb"]}"
  }
}

Or something along those lines. It's a hacky workaround until such time as data.aws_ami has the block device mappings available in a way that's useful.

evansj commented 6 years ago

@b-dean that is ingenious! I'm now using something based on your solution. I wish I knew about the external data source earlier. Thanks for your help.

jordan-evans commented 6 years ago

Trying to do the exact same thing as @evansj External Datasource is a good workaround though, but really just want to do this all inside Terraform :(

apparentlymart commented 6 years ago

Hi all!

There are some changes coming that will eventually address this in a few different ways.

The change that will arrive first is the introduction of for expressions in the forthcoming Terraform 0.12, which where mainly intended for projecting lists and maps, but due to automatic type conversions can be used to work around the fact that indexes don't have keys, albeit in a way that is not the most readable:

  # details may change before release
  snapshot_id = [for m in aws_ami.block_device_mappings: m if m.device_name == "/dev/xvdb"][0]

The above iterates over all of the set elements, filters to just the one that has the given device name and producing a list, and then taking the first (and only) element of that list. This above will at least make it possible to access a particular element in a set, though I'll be the first to admit that it's pretty convoluted.

For the longer term, we're also introducing some changes to the provider API in 0.12 which will allow providers to support maps with complex element types, as Mitchell hinted at earlier in the thread. This will not have any immediate effect for 0.12 because the plugin SDK and the providers themselves must first also be updated to make use of this new capability, but in some subsequent release of the AWS provider (determined by the AWS provider team, once the SDK is updated) we could see these exposed as a map with the device name as a key.

The pattern of returning "computed" values (values defermined by the provider) in complex-typed sets will be deprecated and eventually phased out across all of the providers, since sets were introduced as a construct for representing values from configuration, and are not well-suited to exporting values to be referenced elsewhere for the reasons discussed above. As I'm sure you can imagine, this will be a gradual process as each provider team figures out the best way to update their resource schemas, so the pattern of filtering a set using a for expression will be our workaround until we reach that end state across all resource types.

jbardin commented 3 years ago

Hello All!

The aforementioned for expressions have been implemented in Terraform for some time now, and one can now filter set values in this way. The set type as implemented in Terraform itself is an unordered data structure, there is no possibility of directly indexing the value.

Since the original issues here related to implementation details of the legacy SDK, which is no longer used in Terraform core itself, there is nothing else to be done here and we can close the issue.

Thanks!

ghost commented 3 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

hashicorp / terraform