hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Data sources should allow empty results without failing #16380

Closed: clstokes closed this issue 5 years ago

clstokes commented 6 years ago

This is an enhancement request. Data sources should allow empty results without failing the plan or apply. This would allow for fallback or fallthrough scenarios that are needed in certain situations. An optional allow_empty_result attribute on the data source would support many use cases where a data source value is not explicitly required. The specific attribute name isn't important to me.

As an example, the configuration below currently fails because the aws_ami data source will not find any results. However, I still want the apply to continue using a fallback AMI.

data "aws_ami" "main" {
  most_recent = true

  filter {
    name   = "name"
    values = ["blah-blah"]
  }
}

output "ami" {
  value = "${coalesce(data.aws_ami.main.id,"ami-abc-123")}"
}

(This is just an example so please don't get hung up on this specific use case.)

dohoangkhiem commented 6 years ago

We have desperately needed this for a long time. We couldn't find an effective way to deal with resources that can be created only once (for example, a security group in AWS) and are then used in other configurations (while creating a separate configuration for such shared resources is not reasonable in this case).

rwaffen commented 6 years ago

I am sitting on exactly the same problem. It would really be helpful if empty returns were possible! (And yes, I am right at that "search for an AMI, don't find one, but want a backup value" problem 😏)

sclausson commented 6 years ago

Agree that this would be useful. I would like to use the "aws_ebs_snapshot" data source to look for a snapshot that matches some filters and create a volume from it. But if none is found, I would like to create a volume from a default snapshot ID.
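
For instance, a sketch of how that could read under the proposal above; allow_empty_result, the tag value, and var.default_snapshot_id are all hypothetical here:

data "aws_ebs_snapshot" "latest" {
  most_recent        = true
  allow_empty_result = true # hypothetical attribute from this proposal

  filter {
    name   = "tag:Name"
    values = ["nightly-backup"] # hypothetical tag value
  }
}

resource "aws_ebs_volume" "restored" {
  availability_zone = "us-west-2a"

  # Fall back to a known default snapshot when the lookup finds nothing.
  snapshot_id = "${coalesce(data.aws_ebs_snapshot.latest.id, var.default_snapshot_id)}"
}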

shazChaudhry commented 6 years ago

Could this be a valid solution: #11782 ?

ryanking commented 6 years ago

@shazChaudhry that appears to only work on maps, yeah?

practicalint commented 6 years ago

I have hit this a few times, but my latest use case is in trying to get around the way that plan cannot tell that a resource will be created in the apply when the filter values are computed, in particular when using modules.

For example, I am trying to build a security group list from a map of names, using a security group data source that filters on name to find the corresponding IDs. If a security group is one that will be created in the same apply, the plan fails with "no matching SecurityGroup found". Even though I am using some maps, the possible solution above does not work in my case, unless I were to play games with some default security group it would find in place of the to-be-created resource.

I have tried several alternative approaches to get around the limitations I described, but as always, each one lands on another limitation. For example, I came up with an elegant way to filter the to-be-created SGs out of the list so the lookup would not be attempted in the plan (and then I could merge the to-be-created IDs with the IDs found by the data resource), but that makes the resulting list computed, which won't allow me to use its length as the count variable.

I have a fair understanding of the Terraform model and that the plan needs to create the graph to know what will really happen in the apply, but overall it continues to be difficult to write dynamic and DRY Terraform "code" for more complex/sophisticated provisioning scenarios.

steverukuts commented 6 years ago

I have the same problem:

  1. My ECS [0] services and task definitions are provisioned using Terraform
  2. A script runs as part of our CI process to update the task definition to another version

This sequence of events happens:

  1. I create the services and task definitions using Terraform. The task definition revision is 1
  2. The CI process executes and updates the task definition to 2. It deletes revision 1. The script copies the content of the task definition so it's identical. [1]
  3. I execute Terraform apply. It attempts to set the task definition revision back to 1
  4. Terraform bails out with an error

Now, I could use a data block to pull in the latest version of my task definition. Then I can do what the ecs_task_definition docs say and use ${max(... to get the latest revision.

However, this is not possible when creating the environment for the first time. The configuration below does not work on the first run, but it will on subsequent runs:

resource "aws_ecs_task_definition" "mongo" {
  family = "mongodb"
  # ...
}

data "aws_ecs_task_definition" "mongo" {
  task_definition = "${aws_ecs_task_definition.mongo.family}"
}

I worked around it by using a templating library (mustache) to generate my Terraform resources; it only includes the data resource when we are not creating the environment for the first time. I've generated other parts of my configuration before to work around the lack of foreach in the language (after all, Terraform is not a scripting language), but this is the first time I've had to write scripts that pay attention to state.

It would appear that data blocks are only useful if the referenced resource was created outside the environment, manually or by another Terraform configuration. I understand the logic of throwing an error by default if the resource does not exist, but I feel we need an option to allow it to return a null resource. A warning in the output seems like an acceptable middle ground.

I'd submit a PR for that behaviour myself but I'd like to first see if there's a difference between how I'm using the software and how the maintainers intended it to be used. Perhaps my use-case is too weird for Terraform to cover it, or perhaps I'm meant to integrate Terraform more deeply in my CI pipeline. Aside from that, it's done everything else quite marvellously so far.

[0] Amazon Elastic Container Service [1] For those unfamiliar with ECS, that's really how you have to do it. I wish it wasn't necessary to change the state of anything but there is no "gracefully reload my services please" API call.

steverukuts commented 6 years ago

I was actually able to work around this problem using Terraform, and I no longer need to use a templating library to do what I described. I just needed to add a depends_on parameter to the data block:

resource "aws_ecs_task_definition" "mongo" {
  family = "mongodb"
  # ...
}

data "aws_ecs_task_definition" "mongo" {
  task_definition = "${aws_ecs_task_definition.mongo.family}"
  depends_on = ["aws_ecs_task_definition.mongo"]
}

This has the downside that every time I run terraform plan, it lists the dependent service as a planned change as it doesn't immediately know the state of the mongo data block. However, when I run the configuration, no objects are updated that don't need to be.

I'm a bit confused about why I have to write depends_on explicitly in the data block; I had assumed that referencing the task definition with ${... would be sufficient to take the dependency. The documentation doesn't even mention that you can use depends_on with a data resource.

In light of this, perhaps no change is required.

practicalint commented 6 years ago

That is an interesting find on depends_on working in the data block. In my case, and I believe for others in this thread, the creation of the resource is not necessarily happening in the same root or module, and the data block is using more variable data for the lookup. So I think supporting this kind of behavior (allowing not-found) should still be pursued.

BlaineBradbury commented 6 years ago

I have the same issue. Even if NULL were not an option, it would be good if I could simply specify a default result set or value for the data source to return if it finds no results. I could then use this value in combination with ternary logic elsewhere to make some decision.
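
For example, reusing the AMI lookup from the original post, a sketch with a hypothetical default_id argument (it does not exist today):

data "aws_ami" "main" {
  most_recent = true
  default_id  = "ami-abc-123" # hypothetical: returned when nothing matches

  filter {
    name   = "name"
    values = ["blah-blah"]
  }
}

resource "aws_instance" "app" {
  # Ternary logic on the (possibly defaulted) result drives the decision.
  count         = "${data.aws_ami.main.id == "ami-abc-123" ? 0 : 1}"
  ami           = "${data.aws_ami.main.id}"
  instance_type = "t2.micro"
}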

trilitheus commented 6 years ago

Hitting a similar issue: when creating a new ASG to take over from a previous one, I set the desired count to the number of instances currently in the "to be deposed" ASG, pulled from the "aws_instances" data source. However, if I'm spinning up a new VPC, for example, no previous instances exist, so I'd like to set desired_count to a default value.

mightystoosh commented 6 years ago

Bump. This is a truly needed feature to deal with already created infrastructure:

data "aws_vpn_gateway" "selected" {
  filter {
    name   = "tag:Name"
    values = ["doesnotexist"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  filter {
    name   = "attachment.vpc-id"
    values = ["${var.vpc_id}"]
  }
}

resource "aws_vpn_gateway" "vpn_gw" {
  count  = "${1 - signum(length(data.aws_vpn_gateway.selected.id))}"
  vpc_id = "${var.vpc_id}"

  tags {
    Name = "makeme"
  }

  depends_on = ["data.aws_vpn_gateway.selected"]
}

This should create the resource if it cannot find the data source.

Any better way of doing this?

steverukuts commented 6 years ago

@mightystoosh - for already-created infrastructure you can import the entity into your Terraform state.

If you've only got a couple of environments to worry about, that's a perfectly fine workflow and might work for you. Given you're talking about a VPN gateway, I guess it will. Then you just treat it as any old resource.
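
For reference, importing the gateway from the snippet above would look roughly like this (the vgw-... ID is a placeholder):

terraform import aws_vpn_gateway.vpn_gw vgw-0123456789abcdef0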

This issue is about situations where a resource may or may not exist and either possibility is totally fine. However, in the situation I described, AWS quietly added a feature to force a redeployment of a service in ECS so this issue no longer affects me personally. That said, I think it would still be useful, such as in @trilitheus' situation.

mightystoosh commented 6 years ago

What is the best practice for adding infrastructure? Basically I want to set up a bunch of VPN connections based on a list and connect them to a single VGW. So a single VGW would be created, and then I would create an aws_customer_gateway, aws_vpn_connection, and aws_vpn_connection_route(s) for each customer. If I could find existing infrastructure, then multiple workspaces might work. If I could have variable names in resources, this might work.

It'd be nice if I could iterate over a list in Terraform, or even run terraform multiple times:

# VGW doesn't exist
terraform apply -var 'name=TEST' -var 'ip=8.8.8.8' -var 'region=us-east-1'
# VGW now exists
terraform apply -var 'name=TEST1' -var 'ip=8.8.8.4' -var 'region=us-east-1'
# VGW doesn't exist
terraform apply -var 'name=TEST' -var 'ip=8.8.8.8' -var 'region=us-east-2'
# VGW now exists
terraform apply -var 'name=TEST1' -var 'ip=8.8.8.4' -var 'region=us-east-2'

I'm new to Terraform and it's nice, but I don't really see a way of adding multiple endpoints easily without copying code.

practicalint commented 6 years ago

@mightystoosh your workaround example above is feasible in some cases: as I understand it, you are guaranteeing at least one item returned and then testing for more than one to know whether the desired one is returned. It does require that there is one resource you know will be there, as @steverukuts alluded to. However, several data resources only allow one item to be returned, which wouldn't fit this scenario.

I would also comment that the import facility for bringing pieces into state, as @steverukuts mentioned, is better suited to referencing non-TF-managed resources until they can be brought under TF control, as opposed to just accessing known resources the way data resources do, but it is an interesting prospect as a workaround.

And finally, @mightystoosh, if I understand the use case in your latest post, the VGW would be created first and then customers would be added over time, so you can't just iterate a list of customers to add the other components. But if you add the VGW in its own run and then supply its ID, or a unique tag assigned to it, as a var to the add-customer run, you should be able to use the data resource to bring the VGW into your run and associate the customer's components with it via data.aws_vpn_gateway references. You could certainly iterate over a list of regions and supply the VGW vars for both regions in a list to make that happen in the same run.

mikehodgk commented 6 years ago

I'm in the 'AMI lookup' camp. What I want to do is this: if the AMI does not exist, run a Packer script to create the AMI, then check again.

BlaineBradbury commented 6 years ago

I won't bother prefacing with all the usual reasons why the hack I'm about to paste is hacky and tightly coupled and will most likely need to be refactored out once Terraform has a native solution in the framework, etc. But maybe someone can use my duct tape for automation testing or non-critical items like I am doing, or it will stimulate a better creative solution.

resource "aws_sns_topic" "non_critical_alerts" {
  name         = "${var.environment}-non-critical-alerts"
  display_name = "${var.environment}-non-critical-alerts"
}

# Check if SubscriptionArn exists for xyz@domain.com... is NOT NULL/Empty... "-z" -- true/false
data "external" "sns_exists" {
  program = ["bash", "-c", "if [ ! -z \"$(aws sns list-subscriptions-by-topic --topic-arn ${aws_sns_topic.non_critical_alerts.arn} | jq -r '.Subscriptions' | jq '.[] | select((.Endpoint == \"xyz@domain.com\") and (.Protocol == \"email-json\")) | {SubscriptionArn: .SubscriptionArn}')\" ]; then echo '{\"SnsExists\": \"true\"}' | jq '.'; else echo '{\"SnsExists\": \"false\"}' | jq '.'; fi"]
}

# Get SubscriptionArn for xyz@domain.com
data "external" "sns_xyz_SubscriptionArn" {
  program = ["bash", "-c", "${data.external.sns_exists.result.SnsExists == "true" ? "aws sns list-subscriptions-by-topic --topic-arn ${aws_sns_topic.non_critical_alerts.arn} | jq -r '.Subscriptions' | jq '.[] | select((.Endpoint == \"xyz@domain.com\") and (.Protocol == \"email-json\")) | {SubscriptionArn: .SubscriptionArn}'" : "echo '{\"SubscriptionArn\": \"null\"}' | jq '.'"}"]
}

# Remove email-json Subscription for xyz@domain.com
resource "null_resource" "sns_xyz_unsubscribe" {
  triggers {
    force_this_to_run_on_each_apply = "${uuid()}"
  }
  provisioner "local-exec" {
    command = "aws sns unsubscribe --subscription-arn ${data.external.sns_xyz_SubscriptionArn.result.SubscriptionArn} | echo '{}' | jq '.'"
  }
}

# Subscribe xyz@domain.com to the SNS topic
resource "null_resource" "sns_subscribe_xyz" {
  triggers {
    force_this_to_run_on_each_apply = "${uuid()}"
  }
  provisioner "local-exec" {
    command = "aws sns subscribe --topic-arn ${aws_sns_topic.non_critical_alerts.arn} --protocol email-json --notification-endpoint xyz@domain.com"
  }
  depends_on = ["null_resource.sns_xyz_unsubscribe"]
}

This submodule verifies whether a certain SNS topic subscription is present by running a bash script inside a conditional to detect if the subscription resource exists. The script uses the AWS CLI along with jq to select specific dynamic attributes (which might be module variables or other values) and to produce JSON that Terraform can handle. The CLI call is wrapped in a simple if/then conditional to manufacture a true/false output based on the returned result.

One functional obstacle for me is that external data sources run immediately on apply, so the mixture of null_resources using local-exec, the external data sources, and their referenced interpolated outputs is used to keep everything in dependency order.

earzur commented 6 years ago

thanks @BlaineBradbury !

I was running into the same kind of issue: starting an RDS instance from a previously created final snapshot, if it exists...

So I did something like this:

Create scripts/check_rds_snapshot.sh:

#!/bin/sh

snapshot_id=$1

if [ -z "${snapshot_id}" ]; then
  echo "usage: $0 <snapshot_id>" >&2
  exit 1
fi

aws rds describe-db-snapshots --db-snapshot-identifier ${snapshot_id} > /dev/null 2> /dev/null
aws_result=$?

if [ ${aws_result} -eq 0 ]; then
  result='true'
else
  result='false'
fi

jq -n --arg exists "${result}" '{"snapshot_exists": $exists}'

Then, in my template:

data "external" "rds_final_snapshot_exists" {
  program = [
    "scripts/check_rds_snapshot.sh",
    "${local.rds_final_snapshot}",
  ]
}
...
  snapshot_identifier    = "${data.external.rds_final_snapshot_exists.result.snapshot_exists ? local.rds_final_snapshot : ""}"
...

charlesarnaudo commented 6 years ago

Bump. Would be super helpful for determining whether to create an autoscaling group or not.

kanawaden commented 6 years ago

I have a similar issue, but with a security group. While creating a new environment, I get "data.aws_security_group.XXXXXX_sg: no matching SecurityGroup found".

jhuntoo commented 6 years ago

I just hit this issue too. My use case is for running a job that needs to sync a load balancer with instances that have been created independently, and that may not even exist. I know there are other ways to solve this but this would have been a nice way to do it.

data "aws_instances" "servers" {
  instance_tags {
    X-Service = "my-server"
  }
}

resource "aws_lb_target_group_attachment" "servers" {
  count = "${length(data.servers.ids)}"
  target_group_arn = "${aws_alb_target_group.www_lb.arn}"
  target_id        = "${data.aws_instances.servers.ids[count.index]}"
  port             = 80
}

evanfarrar commented 6 years ago

We just hit this issue too. We were hoping to add DNS records to zones that already exist (to give fully working DNS to Terraform-created load balancers), but to create the zone if none exists already.

ctippur commented 6 years ago

A related issue: I am trying to restrict access to kms:decrypt in an IAM policy. The resource ARN needs to contain the KMS key ARN for each specific environment (not all regions are part of every environment; for example, the integration env has regions us-west-2 and eu-west-1; test has us-west-2; production has us-west-2, eu-west-1, and us-east-1). I want to get the KMS key ARNs for each of these regions dynamically and create an array.

data "aws_caller_identity" "current" {}

data "aws_region" "current" {}

data "aws_kms_key" "kms-us-west-2" { key_id = "arn:aws:kms:us-west-2:${data.aws_caller_identity.current.account_id}:alias/cfgkey", }

data "aws_kms_key" "kms-eu-west-1" { key_id = "arn:aws:kms:eu-west-1:${data.aws_caller_identity.current.account_id}:alias/cfgkey", }

...

resource "aws_iam_policy" "lambda_execute_policy" { ..... "Resource": [ "${data.aws_kms_key.kms-us-west-2.arn}", "${data.aws_kms_key.kms-eu-west-1.arn}", "${data.aws_kms_key.kms-ap-southeast-2.arn}" ]

}

Wondering if there is a way to dynamically create this array?
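
One possibility I'm considering is building the ARNs with formatlist from an assumed per-environment region list; note this yields alias ARNs rather than the resolved key ARNs that data.aws_kms_key returns, which may or may not satisfy the policy:

variable "kms_regions" {
  type    = "list"
  default = ["us-west-2", "eu-west-1"] # assumed per-environment region list
}

data "aws_caller_identity" "current" {}

locals {
  # formatlist repeats the scalar account ID for each region in the list.
  kms_alias_arns = "${formatlist("arn:aws:kms:%s:%s:alias/cfgkey", var.kms_regions, data.aws_caller_identity.current.account_id)}"
}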

lblackstone commented 6 years ago

I found a hack that worked for me:

data "openstack_networking_network_v2" "vlan_network" {
  count = "${var.vlan_network_enabled != "0" ? 1 : 0}"
  name  = "foo"
}

resource "openstack_networking_network_v2" "vxlan_network" {
  count          = "${var.vlan_network_enabled == "0" ? 1 : 0}"
  name           = "foo"
  admin_state_up = "true"
}

locals {
  # Hack to work around https://github.com/hashicorp/terraform/issues/15605 and https://github.com/hashicorp/terraform/issues/16380
  network_id = "${var.vlan_network_enabled == "0" ?
  element(concat(openstack_networking_network_v2.vxlan_network.*.id, list("")), 0) :
  element(concat(data.openstack_networking_network_v2.vlan_network.*.id, list("")), 0)}"
}

Explanation: I needed to conditionally select between a network created outside of Terraform and one that Terraform provisions. The element(concat(..., list("")), 0) expression works around #15605, and using the * operator works around #16380.

4dz commented 6 years ago

My use case is AWS RDS can create a final snapshot when you destroy a database or cluster.

I want to be able to restore from that final snapshot if it exists, else create a blank database.

But if you set up a data source to look for the final snapshot, it fails because the snapshot doesn't exist.

Being able to specify a default set of values or even a Boolean (to use in count metadata) would solve this!

Otherwise you have to pass in vars (as per @lblackstone's solution above), but that means you have to know in advance whether the backup exists.

earzur commented 6 years ago

@4dz see my comment above; I have a solution to this. The only issue we face with that solution is that when the final snapshot exists at destroy time, the destroy will fail, but we're not doing that on a regular basis.

Just like lifecycle { prevent_destroy = true } not expanding variables, this kind of "comment stuff out in code before performing tasks" or "manually remove existing resources before proceeding" makes Terraform a bit of a pain to use...

4dz commented 6 years ago

Thanks @earzur. I really want to avoid the external script, although I agree it will work for many use cases!

The reasons being:

  a) You have to have the AWS CLI installed (e.g. on CI, locally, etc.)
  b) Depending on how your profile/environment is set up, the script could use the wrong AWS account.
  c) It's probably not ideal for a shareable/reusable module.

Only issue we face with that solution is that when the final snapshot exists at destroy time, the destroy will fail

I'm working on a helper module to address exactly that problem! It will work with RDS instances but not clusters, because there is no aws_db_cluster_snapshot data source - but I just added a pull request for that. Basically: use a counter for the final snapshot identifier, which becomes part of the snapshot identifier, e.g. mydatabase-00001.

4dz commented 6 years ago

I guess the problem is that it may require an update to every data source to support allow_empty_result, e.g. because an empty result, a query which returns multiple results, and genuine errors all result in the same error type being returned in the Go source code.

If that assumption is correct, would an alternative solution be to add a new keyword to Terraform core, e.g. check (or maybe try)?

check "aws_ami" "main" {
  most_recent = true

  filter {
    name   = "name"
    values = ["blah-blah"]
  }
}

check/try would take exactly the same inputs as the equivalent data source and call the same Go function as a data source would.

It would have 3 outputs.

An additional idea for check/try: you could have an additional input to the construct, e.g. allow_errors (or catch), containing a list of regular expressions which, if matched against the error message, allow the build to continue, so that unmatched/unexpected errors still cause the build to fail.

This is because not all errors are acceptable!

check "aws_db_snapshot" "snapshot" {
        most_recent = "true"
        db_instance_identifier = "${var.instance_id}"

        allow_errors = ["Your query returned no results.*"]
}

ColinHebert commented 6 years ago

@apparentlymart, what are your thoughts on this problem? In the past there was a related issue (#11782) which focused more on subfields being potentially unavailable; that was solved with a lookup. But here we're talking about cases where the data resource returns nothing at all. What would be a good approach, in your opinion?

BernhardBln commented 6 years ago

I would love this feature, but I would also expect to have to enable it manually, with the current behaviour remaining the default.

practicalint commented 6 years ago

In regards to the last few responses: I was envisioning that an argument to the data resource would be a boolean like allow_not_found, and that the result attributes would contain a way to test/reference the count returned, at least similar to how multiple-response data resources act today. The default for the arg would be false, handling the concern about manually enabling it.

Not being a code contributor, I can't say much as to the structure, but this would avoid having to code a whole new set of "check" resources, which seems like a lot more code than modifying every existing data source. And being a developer, my refactoring red flags go off when I hear that a change would require so much repetitive code; maybe the change should include putting some of the handling in reusable function(s) so that future requests are not as invasive.
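
Roughly, the shape I have in mind (allow_not_found and result_count are hypothetical names, not existing arguments):

data "aws_security_group" "shared" {
  allow_not_found = true # hypothetical argument, defaulting to false

  filter {
    name   = "group-name"
    values = ["shared-services"] # hypothetical group name
  }
}

resource "aws_security_group" "shared" {
  # Create the group only when the lookup found nothing.
  count = "${data.aws_security_group.shared.result_count == 0 ? 1 : 0}"
  name  = "shared-services"
}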

ajbrown commented 6 years ago

Here's my usecase. I have a module which establishes a peering connection with a VPC which provides some shared services to our organization. I need to check for route tables on a VPC that would have been created outside of the module, and then add some routes to the ones that are found.

Because data aws_route_table requires that exactly one route table match, I have to do it in two data blocks:

  1. Grab the main route table using filters.
  2. Look for additional route tables by subnet association, one lookup per subnet ID (which I load using data aws_subnet_ids).

The problem is I don't know if those route tables even exist in the second step. In fact, they most likely don't. In that case, the second step will error since no route table will match.

# Load Subnet IDs
data aws_subnet_ids vpc_subnet_ids {
  vpc_id = "${var.vpc_id}"
}

# Load Subnet data
data aws_subnet vpc_subnets {
  count = "${length(data.aws_subnet_ids.vpc_subnet_ids.ids)}"
  id = "${data.aws_subnet_ids.vpc_subnet_ids.ids[count.index]}"
}

# Load Main Route Table
data aws_route_table main {
  vpc_id = "${data.aws_vpc.target.id}"

  filter {
    name = "association.main"
    values = ["true"]
  }
}

# Load Route Table associated with each subnet (main route won't match if not explicitly associated)
data aws_route_table subnet_routes {
  count = "${length(data.aws_subnet_ids.vpc_subnet_ids.ids)}"
  vpc_id = "${data.aws_vpc.target.id}"

  filter {
    name = "association.subnet-id"
    values = ["${data.aws_subnet_ids.vpc_subnet_ids.ids[count.index]}"]
  }
}

# Get a unique set of route table ids
locals {
  affected_route_table_ids = "${distinct(concat(data.aws_route_table.subnet_routes.*.route_table_id, list(data.aws_route_table.main.route_table_id)))}"
}

Spechal commented 6 years ago

Use case: Get the latest revision of an ECS task so an apply doesn't overwrite (roll back) a task.

data "aws_ecs_task_definition" "this" {
  task_definition = "${aws_ecs_task_definition.this.family}"
}

data.aws_ecs_task_definition.this: Failed getting task definition ClientException: Unable to describe task definition.
    status code: 400

If it returned anything but an error, I could use logic to deduce that the result is 0, and therefore that the task definition does not exist and should be created.

ntmggr commented 6 years ago

The simplest thing we can do here is at least not error if a data filter finds nothing. Then we can select something different. Is this so hard?

Use case: check if a certificate is uploaded in ACM, or else upload and use a self-signed one via IAM. The data filter fails, though, and I hate to use local-exec just for this.

Data Source

data "aws_acm_certificate" "cert" {
  domain   = "${var.name}-${var.env}.${var.domain}"
  types = ["AMAZON_ISSUED"]
  most_recent = true
}

data "aws_iam_server_certificate" "cert" {
  name        = "${var.name}-${var.env}.${var.domain}"
  latest      = true
}

If the ACM certificate is missing, we get the following error:

* module.gateway.data.aws_acm_certificate.cert: 1 error(s) occurred:

So it would be ideal if we could continue and deal with an empty return somewhere else, i.e.:

certificate_arn   = "${data.aws_acm_certificate.cert.arn != "" ? data.aws_acm_certificate.cert.arn : data.aws_iam_server_certificate.cert.arn}"

aaratn commented 6 years ago

+1

Use case: Find all AWS instances for a particular environment and add Route 53 DNS entries.

data "aws_instances" "all" {
  filter {
    name   = "tag:Environment"
    values = ["${var.tags["Environment"]}"]
  }
}
data "aws_instance" "instance" {
  count = "${length(data.aws_instances.all.ids)}"
  filter {
    name   = "private-ip-address"
    values = ["${data.aws_instances.all.private_ips[count.index]}"]
  }
}
module "route53_record" {
  source  = "../modules/route53_record"
  zone_id = "${module.route_53_zone_private.id}"
  name    = "${data.aws_instance.instance.*.tags.Name}"
  type = "A"
  records = "${data.aws_instance.instance.*.private_ip}"
}

nunofernandes commented 6 years ago

Just asking: will it be possible to have this in the upcoming 0.12 version?

apparentlymart commented 6 years ago

This will not be in Terraform 0.12... it needs considerably more design, prototyping, and thought before we could move forward with it, since it's not clear that the sort of highly-dynamic behavior people want to achieve here will fit in well with other Terraform features. In particular, many requesters here seem to want not just the ability for a data source to return no items, but also to return possibly multiple items matching a query, which is a pretty major change to how data sources work today and would require the feature to be redesigned.

In the mean time, the current approach is to add "plural" data sources to providers where there is a clear use-case, such as aws_subnet_ids in the AWS provider. This sort of thing will become better in 0.12, since the support for complex list types in the new type system can make it possible to return lists of whole objects rather than just returning the ids as we generally do today. For now, if your use case is a variant of "I want to detect if something exists" or "I want to return all of the objects matching criteria", I'd suggest opening issues with individual providers to discuss your use-case and see if a plural data source can be added in the mean time. The plural data sources will also all serve as examples once we do investigate this problem more deeply.

0.12 will also make it easier to use dependency inversion to get similar results in a different way. We'll be writing a guide on this and some other new patterns that the 0.12 features allow as we get closer to the release.
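
To sketch the dependency inversion idea (module layout and names are invented for illustration): the module receives the object instead of querying for it, so each environment's root module states what actually exists there.

# modules/app/variables.tf: the module takes the AMI as an input.
variable "ami_id" {}

# modules/app/main.tf
resource "aws_instance" "app" {
  ami           = "${var.ami_id}"
  instance_type = "t2.micro"
}

# envs/prod/main.tf: this environment knows the AMI exists, so it looks it up.
data "aws_ami" "main" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["app-*"]
  }
}

module "app" {
  source = "../../modules/app"
  ami_id = "${data.aws_ami.main.id}"
}

# envs/dev/main.tf: this environment simply passes a known fallback AMI.
# module "app" {
#   source = "../../modules/app"
#   ami_id = "ami-abc-123"
# }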

rdkls commented 5 years ago

if AMI does not exist run packer script to create AMI and check again

Hey @ostmike, I need the same. (Actually, I would've thought anyone using e.g. ASGs with their own AMIs would.) Did you find a workaround? I'm thinking of something like:

Replacing data "aws_ami" with a custom module that queries the AWS CLI, e.g. https://github.com/matti/terraform-shell-resource

benmenlo commented 5 years ago

bump: I have a similar question as @trilitheus and @charlesarnaudo. When updating an AMI in an existing launch configuration that was created by a previous Terraform run, I want to be able to set the "desired" count to the same value as in the previous ASG, so we don't cause an outage when rolling to the new AMI.

ajbrown commented 5 years ago

I understand where @apparentlymart is coming from, but I would ask that returning empty results be considered separately from all of the other use-cases mentioned.

How about an "allow_empty" property?

ghost commented 5 years ago

I would also suggest adding the possibility to set default values for when an empty result is returned.

neechbear commented 5 years ago

+1 bump from another user with the same use case as @earzur https://github.com/hashicorp/terraform/issues/16380#issuecomment-375386696

i was running into the same kind of issue, start a RDS instance from a previously created final snapshot if it exists...

ohuez commented 5 years ago

Bump. Would be very helpful

soumitmishra commented 5 years ago

+1

mildwonkey commented 5 years ago

Hi all! Please do not post "+1" comments here, since they create noise for others watching the issue and ultimately don't influence our prioritization, because we can't actually report on them. Instead, react to the original issue comment with 👍, which we can and do report on during prioritization.

castaples commented 5 years ago

0.12 will also make it easier to use dependency inversion to get similar results in a different way. We'll be writing a guide on this and some other new patterns that the 0.12 features allow as we get closer to the release.

@apparentlymart do you have any links to examples of the above now that 0.12 has been released?

emmaLP commented 5 years ago

Another use case:

When creating an EBS volume, I want to look up whether a snapshot is available: if not, create the EBS volume without a snapshot; otherwise, create it from the snapshot.

Having the ability to create from a snapshot when available is quite critical to our infrastructure: initially we won't have a snapshot, but we have a DLM policy that snapshots daily, so if something goes wrong we would like to just run Terraform and the EBS volume will get the latest snapshot.
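
A sketch of the fallback I'm after, in 0.12 syntax, assuming the plural aws_ebs_snapshot_ids data source (which, as far as I can tell, yields an empty list rather than an error when nothing matches; the filter value and size are invented):

data "aws_ebs_snapshot_ids" "backup" {
  filter {
    name   = "tag:Name"
    values = ["my-volume-backup"] # invented tag
  }
}

resource "aws_ebs_volume" "main" {
  availability_zone = "us-west-2a"
  size              = 40

  # Restore from a matching snapshot when one exists; otherwise start blank.
  snapshot_id = length(data.aws_ebs_snapshot_ids.backup.ids) > 0 ? tolist(data.aws_ebs_snapshot_ids.backup.ids)[0] : null
}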

tdmalone commented 5 years ago

@emmaLP I just came across https://registry.terraform.io/modules/connect-group/lambda-exec in #2756 which looks like a novel, albeit slightly complicated, way to solve what you’re trying to do. One of the suggested use cases is close to what you’re after as well.

flmmartins commented 5 years ago

Hello All,

So I'm having this issue.

I ran terraform destroy.

Afterwards I ran terraform plan, hoping everything would come up again... but no. Even after everything was destroyed (and there was NOTHING about this data in the state), it still gave me the error:

data.aws_lb.load_balancer: data.aws_lb.load_balancer: Search returned 0 results, please revise so only one is returned

My code:

resource "null_resource" "lb-create" {
  triggers {
    file = "${data.template_file.lb-manifesto.rendered}"
  }

  provisioner "local-exec" {
    command = "kubectl apply -f ${local.lb_path} --context=${var.env_fqdn} && sleep 180"
  }
}

data "aws_lb" "load_balancer" {
  tags = "${local.lb_tags}"
}

output "load_balancer" {
  depends_on = ["null_resource.lb-create"]
  value = {
    "is_internal" = "${data.aws_lb.load_balancer.internal}"
    "name" = "${data.aws_lb.load_balancer.name}"
    "zone_id" = "${data.aws_lb.load_balancer.zone_id}"
    "dns_name" = "${data.aws_lb.load_balancer.dns_name}"
    "id" = "${data.aws_lb.load_balancer.id}"
    "sec-group" = "${aws_security_group.lb-sec-group.id}"
  }
}

....
# Run other Route53 resources relying on that data..

So what I had to do to fix this was add a depends_on to "aws_lb" "load_balancer" and change the syntax in the outputs so this could run with an empty result, which is a bad practice because it causes DNS records to be recreated on every Terraform run. Check https://github.com/terraform-providers/terraform-provider-aws/issues/8541

After that, the Route53 records started to give errors because they couldn't use the data source. I tried to add counts and depends_on on the records, but it was getting messy... then I had to put the Route53 stuff in another module reading from state... quite messy!

Please fix this...

teamterraform commented 5 years ago

Hi folks, Sorry for the long silence on this. Now that Terraform 0.12 is released, we are addressing older issues.

The sort of dynamic decision-making that is being requested here runs counter to Terraform's design goals, since it makes the configuration a description of what might possibly be rather than what is. We are not going to change the behavior of data sources when the response is empty or 404, and will close this issue.

We feel that there are better patterns to follow when managing the differences between environments: composing different modules in different ways, rather than trying to use the same modules in all environments but make them take different actions based on data source results.

Please take a look at the module composition documentation - specifically, the section that describes conditional creation of objects - for a full description.
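
The conditional creation pattern described there boils down to something like this 0.12-style sketch (resource and names chosen for illustration): the caller either passes an existing object in or lets the module create one.

variable "vpc_id" {
  type    = string
  default = null # null means "create a VPC for me"
}

resource "aws_vpc" "main" {
  # Created only when the caller did not supply an existing VPC.
  count      = var.vpc_id == null ? 1 : 0
  cidr_block = "10.0.0.0/16"
}

locals {
  # Use the supplied VPC when given, otherwise the one created above.
  vpc_id = var.vpc_id != null ? var.vpc_id : aws_vpc.main[0].id
}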

I realize that this will be an unsatisfying response to some. I encourage you to visit the community forum if you have any questions about restructuring your configuration to achieve the results you need without this feature.

Thank you!