OJFord closed this issue 2 years ago.
Hi @OJFord, would you mind providing a more concrete example with real resources that would help us reproduce the unexpected behaviour you described?
Thanks.
@radeksimko please check the referenced issue #6613. This is pretty important and can be hit in other places as well. From my experiments, I observed that depends_on only affects ordering; it does not trigger a change.
Hi @OJFord and @cemo,
In Terraform's design, a dependency edge (which is what depends_on creates explicitly) is used only for ordering operations. So in the very theoretical example given in the issue summary, Terraform knows that when it's doing any operation that affects both foo.bar and bar.foo, it will always do the operation to foo.bar first.
I think you are expecting an additional behavior: if there is an update to foo.bar then there will always be an automatic update to bar.foo. But that is not actually how Terraform works, by design: the dependency edges are used for ordering, but the direct attribute values are used for diffing.
So in practice this means that the bar.foo in the original example will only get an "update" diff if any of its own attributes are changed. To @radeksimko's point it's hard to give a good example without a real use-case, but the way this would be done is to interpolate some attribute of foo.bar into bar.foo such that an update diff will be created whenever that attribute changes. Note that it's always attribute-oriented... you need to interpolate the specific value that will be changing.
In practice this behavior does cause some trouble on edge cases, and those edge cases are what #4846 and #8769 are about: allowing Terraform to detect the side effects of a given update, such as the version_id on an Amazon S3 object implicitly changing each time its content is updated.
Regarding your connection to that other issue @cemo, you are right that the given issue is another one of these edge cases, though a slightly different one: taking an action (deploying) directly in response to another action (updating some other resource), rather than using attribute-based diffing... though for this API gateway case in particular, since API gateway encourages you to create a lot of resources, the specific syntax proposed there would likely be inconvenient/noisy.
Again as @radeksimko said a specific example from @OJFord might allow us to suggest a workaround for a specific case today, in spite of the core mechanisms I've described above. In several cases we have made special allowances in the design of a resource such that a use-case can be met, and we may be able to either suggest an already-existing one of these to use or design a new "allowance" if we have a specific example to work with. (@cemo's API gateway example is already noted, and there were already discussions about that which I will describe in more detail over there.)
I'm sorry that I never came back with an example; I'm afraid I can't remember exactly what I was doing - but:
I think you are expecting an additional behavior: if there is an update to foo.bar then there will always be an automatic update to bar.foo. But that is not actually how Terraform works, by design: the dependency edges are used for ordering, but the direct attribute values are used for diffing.
is exactly right; that was what I misunderstood.
Perhaps something like taint_on_dependency_change = true is possible? That is, if such a variable is true, change the semantics of "ordering" above from "do this after, if it needs to be done" to "do this after".
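As a rough illustration of what I mean (hypothetical syntax, not something Terraform supports today):

resource "bar" "foo" {
  # Hypothetical flag: whenever anything this resource depends on is
  # changed or replaced, recreate this resource too, even if none of
  # its own attributes changed.
  taint_on_dependency_change = true

  depends_on = ["foo.bar"]
}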
@OJFord the issue you don't remember might be #6613.
I second @OJFord's proposal and would welcome something simple like taint_on_dependency_change. However, I can't claim to be an expert on Terraform, and since this is my first experiment with it my opinion may not carry much weight.
This taint_on_dependency_change idea is an interesting one. I'm not sure I would actually implement it using the tainting mechanism, since that's more of a workflow management thing and indicates that the resource is "broken" in some way, but we could potentially think of it more like replace_on_dependency_change: artificially produce a "force new" diff any time a dependency changes.
I think this sort of thing would likely require some of the machinery from #6810 around detecting the presence of whole-resource diffs and correctly handling errors with them. There are some edge cases around what happens if B depends on A and A is changed but B encounters an error while replacing... since the intended change is not explicitly visible in the attributes, Terraform needs to make sure to do enough book-keeping that it knows it has more work to do when run again after the error is resolved.
It might work out conceptually simpler to generalize the triggers idea from null_resource or keepers from the random provider, so that it can be used on any resource:
resource "foo" "bar" {
foobar = "${file("foobar")}"
}
resource "bar" "foo" {
lifecycle {
replace_on_change {
foo_bar_foobar = "${foo.bar.foobar}"
}
}
}
In the above example, the lifecycle.replace_on_change attribute acts as if it were a resource attribute with "forces new resource" set on it: the arbitrary members of this map are stored in the state, and on each run Terraform will diff what's in the state with what's in the config and generate a "replace" diff if any of them have changed. This effectively gives you an extra place to represent explicit value dependencies that don't have an obvious home in the resource's own attributes.
This is conceptually simpler because it can build on existing mechanisms and UX to some extent. For example, it might look like this in a diff:
-/+ bar.foo
lifecycle.replace_on_change.foo_bar_foobar: "old_value" => "new value" (forces new resource)
In the short term we're likely to continue addressing this by adding special extra ForceNew attributes to resources where such behavior is useful, so that this technique can be used in a less-generic way where it's most valuable. This was what I'd proposed over in #6613, and has the advantage that it can be implemented entirely within a provider without requiring any core changes, and so there's much less friction to get it done. Thus having additional concrete use-cases would be helpful, either to motivate the implementation of a generic feature like above or to prompt the implementation of resource-specific solutions where appropriate.
For the moment I'm going to re-tag this one as "thinking" to indicate that it's an interesting idea but we need to gather more data (real use-cases) in order to design it well. I'd encourage other folks to share concrete use-cases they have in this area as separate issues, similar to what's seen in #6613, and mention this issue by number so that it can become a collection of links to relevant use-cases that can inform further design.
@mitchellh This issue might be considered for the 0.8 release, as you improved "depends_on" and this might be a quick win.
resource "aws_appautoscaling_target" "ecs_target" {
max_capacity = "${var.max_capacity}"
min_capacity = "${var.min_capacity}"
role_arn = "${var.global_vars["ecs_as_arn"]}"
resource_id = "service/${var.global_vars["ecs_cluster_name"]}/${var.ecs_service_name}"
scalable_dimension = "ecs:service:DesiredCount"
service_namespace = "ecs"
}
resource "aws_appautoscaling_policy" "ecs_cpu_scale_in" {
adjustment_type = "${var.adjustment_type}"
cooldown = "${var.cooldown}"
metric_aggregation_type = "${var.metric_aggregation_type}"
name = "${var.global_vars["ecs_cluster_name"]}-${var.ecs_service_name}-cpu-scale-in"
resource_id = "service/${var.global_vars["ecs_cluster_name"]}/${var.ecs_service_name}"
scalable_dimension = "ecs:service:DesiredCount"
service_namespace = "ecs"
step_adjustment {
metric_interval_upper_bound = "${var.scale_in_cpu_upper_bound}"
scaling_adjustment = "${var.scale_in_adjustment}"
}
depends_on = ["aws_appautoscaling_target.ecs_target"]
}
Hi @apparentlymart,
Here is another real user-case of my own.
The resource aws_appautoscaling_policy.ecs_cpu_scale_in (let it be the "autoscaling policy") depends on the resource aws_appautoscaling_target.ecs_target (let it be the "autoscaling target").
When I change the value of max_capacity and then run terraform plan, it shows that the autoscaling target is forced to new (it is going to be destroyed and re-added). But nothing will happen to the autoscaling policy, which is supposed to be destroyed and re-added as well.
Why is it supposed to? Because in my experience, after terraform apply completes successfully (destroying and re-adding the autoscaling target), the autoscaling policy is gone automatically (if you log in to the AWS console, you can see it's gone), so I have to run terraform apply a second time, and that run adds the autoscaling policy back.
(BTW, both resources are actually defined in a module. Maybe it matters, maybe not; I'm not sure.)
Hi @ckyoog! Thanks for sharing that.
What you described there sounds like what's captured in terraform-providers/terraform-provider-aws#240. If you think it's the same thing, it would be cool if you could post the same details in that issue since having a full reproduction case is very useful. I think in your particular case this is a bug that we ought to fix in the AWS provider, though you're right that if the feature I described in my earlier comment were implemented it could in principle be used as a workaround.
In the meantime, you might already be able to work around this by including an additional interpolation in your policy name to force it to get recreated when the target is recreated:
name = "${var.global_vars["ecs_cluster_name"]}-${var.ecs_service_name}-cpu-scale-in-${aws_appautoscaling_target.ecs_target.id}"
Since the name attribute forces new resource, this should cause the policy to get recreated each time the target is recreated.
Thank you @apparentlymart for the workaround. Sure, I will post my case to issue terraform-providers/terraform-provider-aws#240.
Hey, I just got an idea of how this might be solved. The approach is inspired by Google Cloud and I don't know if it will apply to all use cases. Basically, in Google Cloud you have the notion of "used by" and "uses" on resources. For example, consider the link between a boot_disk and an instance: the boot_disk can exist alone as a simple disk, but the instance cannot exist without a boot disk. Therefore, in the data model, you can have a generic system that states used_by.
Example:
resource "google_compute_disk" "bastion_boot" = {
image = "centos-7"
size = "10"
used_by = ["${google_compute_instance.bastion.name}"]
}
resource "google_compute_instance" "bastion" = {
boot_disk = {
source = "${google_compute_disk.bastion_boot.name}"
}
uses = ["${google_compute_disk.bastion_boot.name}"]
}
The uses and used_by attributes could be implicitly set in well-known cases but explicitly set in unusual or corner cases. It would become the provider's responsibility to know about the implicit uses, and as a workaround it would be possible to use the explicit form. It would work much like the implicit and explicit depends_on, except in reverse.
Now I understand that there are some subtle differences in the problems that have been mentioned, for example "I don't want to destroy, I want to update a resource". I don't know how my case would fit into this.
Also, I think it would be best to stick with the cloud provider's semantics, and in my case this really reflects what I'm doing and how everything works. This system would be a reverse depends_on, creating a possible destruction cycle that would be triggered before the create cycle. That would be fine in most cases, and if you cannot tolerate a destruction you usually apply a blue-green model anyway, which avoids most of the pain. But in my case, during my maintenance windows, I can be destructive on most of my resources.
Just some related issues:
I have run into the need for this issue myself.
The use case is the following:
I have a resource for a database instance (In this case an AWS RDS instance) which performs a snapshot of its disk upon destruction. If I destroy this resource and recreate it and destroy it again, AWS returns an error because it will attempt to create a snapshot with the same identifier as before.
This can be mitigated by using something like the "random_id" resource as a suffix/prefix to that identifier. The issue is that if I taint the database resource, I need to manually remember to taint the "random_id" resource as well, otherwise the new instance will have the same "random_id" as before.
Attempting to use a "keepers" pointing to the database resource id does not work because it causes a cyclic dependency.
Any ideas on how one handles that?
I've run into this same issue with trying to get an EMR cluster to rebuild when the contents of a bootstrap operation change. See https://stackoverflow.com/questions/53887061/in-terraform-how-to-recreate-emr-resource-when-dependency-changes for details.
My concrete case is similar to the ones already discussed. I want to destroy and recreate a disk resource when a VM startup-script changes. I really like the described lifecycle.replace_on_change solution, but I wonder if it would work for me. My VM already has a reference to the disk, and the disk would grow a replace-on-change reference to the VM's startup-script. Would that cycle be a problem? I can represent the startup script as a separate template-file resource pretty easily, but cycles caused by replace-on-change should either work well or be an error.
The other solution that I thought of looks like so:
resource my-vm { startup-script }
resource my-disk {}

resource null-resource _ {
  triggers {
    startup-script = my-vm.startup-script
  }

  provisioner "taint" {
    resources = ["my-disk"]
  }
}
But I don't know how this would look in tf-plan. You're right that the lifecycle.replace-on-change design leverages some existing patterns nicely.
I frequently run into this problem if I'm using kubernetes provider resources that depend on a module that creates the gke or eks cluster. If a configuration change is made that causes the k8s cluster to be destroyed/recreated, obviously all kubernetes resources are lost.
I am running into this problem as well. I have two resources, one of which depends on the other. If I delete the dependent resource outside of Terraform, I need BOTH resources to be recreated, but Terraform does not know that; it only offers to create the resource that I manually deleted.
It's a chicken-and-egg issue when an outside force modifies the infrastructure.
Another example: rotating an EC2 keypair that is configured on an Elastic Beanstalk environment should trigger a rebuild of the environment.
resource "aws_elastic_beanstalk_environment" "test"
...
setting {
namespace = "aws:autoscaling:launchconfiguration"
name = "EC2KeyName"
value = "${aws_key_pair.test.key_name}"
}
}
Here's another example with Amazon Lightsail: if you recreate an aws_lightsail_instance, you will need to recreate the aws_lightsail_static_ip_attachment between the aws_lightsail_instance and the aws_lightsail_static_ip:
resource "aws_lightsail_instance" "instance_1" {
name = "Instance 1"
# ...
}
# yes, the below is really all that is needed for the aws_lightsail_static_ip resource
resource "aws_lightsail_static_ip" "instance_1_static_ip" {
name = "Instance 1 Static IP"
}
resource "aws_lightsail_static_ip_attachment" "instance_1_static_ip_attachment" {
static_ip_name = "${aws_lightsail_static_ip.instance_1_static_ip.name}"
instance_name = "${aws_lightsail_instance.instance_1.name}"
}
In this example, if you run terraform taint aws_lightsail_instance.instance_1, then running terraform apply will recreate the aws_lightsail_instance resource, but the aws_lightsail_static_ip_attachment will be automatically detached. You'll have to run terraform apply again for Terraform to realize it has changed.
Adding another use case related to this request.
I have a custom Provider which defines a "workflow_execution" resource. When created, it triggers an application deployment. I would like to have the "workflow_execution" created:
For the first point to be achieved, the "workflow_execution" resource creation has to be dependent on a change in another resource, which is currently not supported by Terraform.
Adding another use case related to this request.
I use AVI LB and create GSLBs for all the services that we use. Right now the connection between the AVI GSLB and the web apps is done through a UUID only. When any attribute on a web app changes, the UUID gets regenerated, which results in it being out of sync with the AVI GSLB.
We need a solution to recreate the GSLB every time a change is made to the web app.
If one needs to recreate an aws_lb_target_group that is currently the target of an aws_lb_listener_rule, the aws_lb_listener_rule needs to first be destroyed before the aws_lb_target_group can be recreated.
Piling on, this would be extremely useful for redeploying APIs via the AWS provider.
e.g., the aws_api_gateway_deployment resource handles the deployment of an AWS API Gateway instance. However it must be manually redeployed if any API methods, resources, or integrations change.
A workaround might be setting the stage name of the deployment to the hash of the directory containing the volatile configurations, but the end result would be many stages.
edit - Naturally, it looks like there's already been a few issues created regarding this.
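(For what it's worth, newer versions of the AWS provider added a triggers argument on aws_api_gateway_deployment that implements exactly this attribute-based pattern; a sketch, with placeholder resource names:)

resource "aws_api_gateway_deployment" "example" {
  rest_api_id = aws_api_gateway_rest_api.example.id

  # Redeploy whenever the hashed API definition changes.
  triggers = {
    redeployment = sha1(jsonencode([
      aws_api_gateway_resource.example.id,
      aws_api_gateway_method.example.id,
      aws_api_gateway_integration.example.id,
    ]))
  }

  lifecycle {
    create_before_destroy = true
  }
}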
I was running into the same issue with kubernetes as you @jhoblitt . I managed to find a workaround in the fact that (it seems) all kubernetes resources require that the name doesn't change. If you change the name, the resource will be recreated.
So I created a random id that is based on the cluster endpoint and I append that to the name of all my kubernetes resources.
// Generate a random id we can use to recreate k8s resources
resource "random_id" "cluster" {
keepers = {
// Normally a new cluster will generate a new endpoint
endpoint = google_container_cluster.cluster.endpoint
}
byte_length = 4
}
resource "kubernetes_deployment" "tool" {
metadata {
name = "tool-deployment-${random_id.cluster.hex}"
labels = {
App = "tool"
}
}
spec {
...
}
}
It's not ideal (especially for naming services) but it works for me. The only issue I still have is with helm which I use to install traefik. If I add the id to those names creation works fine but on update of the id I get a cyclic dependency problem. Also the change in the name of the service account roles makes helm / tiller not work properly anymore, so I'll probably completely forgo helm and configure traefik manually.
@radeksimko: "would you mind providing a more concrete example with real resources that would help us reproduce the unexpected behaviour you described?"
resource "kubernetes_config_map" "config" {
data = {
FOO = "bar"
}
metadata {
name = "config"
}
}
resource "kubernetes_deployment" "deployment" {
depends_on = [ kubernetes_config_map.config ]
metadata {
name = "deployment"
}
spec {
env_from {
config_map_ref {
name = kubernetes_config_map.config.metadata[0].name
}
}
}
}
I want my k8s deployment to get patched every time I terraform apply a config change, for example changing the env var FOO to baz. That's my use case.
Another example: replacing an aws_key_pair does not update the related aws_instance(s).
If one needs to recreate an aws_lb_target_group that is currently the target of an aws_lb_listener_rule, the aws_lb_listener_rule needs to first be destroyed before the aws_lb_target_group can be recreated.
That's similar to what I'm bumping into and trying to work around right now ... trying to evaluate a solution and a "force_recreate/taint" in lifecycle, or similar, would be incredibly useful right now ...
In my case I have a target group that needs to be recreated, but the listener (no rule involved here) is only getting a "update in place" change ... but then the target group cannot be destroyed because the listener isn't being destroyed ...
For reference for others searching: the issue for this in the AWS provider is being tracked in terraform-providers/terraform-provider-aws#10233.
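One workaround that can help in this particular case (a sketch with placeholder arguments): give the target group a name_prefix and set create_before_destroy, so the replacement group can be created and the listener or rule repointed before the old group is destroyed:

resource "aws_lb_target_group" "example" {
  name_prefix = "app-"
  port        = 80
  protocol    = "HTTP"
  vpc_id      = var.vpc_id

  lifecycle {
    # Create the replacement target group first, so the listener/rule can
    # be updated to point at it before the old group is destroyed.
    create_before_destroy = true
  }
}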
I was running into the same issue with the Google Provider and the resource google_compute_resource_policy & google_compute_disk_resource_policy_attachment.
When you create a policy for scheduling the snapshots of a GCE disk, you must attach the policy to the disk. That policy isn't editable, so if you make any changes Terraform has to recreate the resource, but it doesn't recreate the attachment resource, even if it's "linked" with Terraform's depends_on directive.
resource "google_compute_resource_policy" "snapshot_schedule_wds" {
name = "snapshot-weekly-schedule-wds"
region = var.subnetwork_region
project = google_project.mm-sap-prod.name
snapshot_schedule_policy {
schedule {
weekly_schedule {
day_of_weeks {
day = "SATURDAY"
start_time = "20:00"
}
}
}
retention_policy {
max_retention_days = 366
on_source_disk_delete = "KEEP_AUTO_SNAPSHOTS"
}
snapshot_properties {
labels = {
app = "xxx"
}
storage_locations = ["europe-west6"]
guest_flush = false
}
}
}
resource "google_compute_disk_resource_policy_attachment" "gcp_wds_snap_schedule_pd_boot" {
name = google_compute_resource_policy.snapshot_schedule_wds.name
disk = google_compute_disk.web-dispatch-boot.name
zone = var.zone
project = google_project.mm-sap-prod.name
depends_on = ["google_compute_resource_policy.snapshot_schedule_wds"]
}
Terraform v0.12.13
+ provider.external v1.2.0
+ provider.google v2.20.0
+ provider.google-beta v2.20.0
Any solution for this use case?
@psanzm in this very specific use case, using the google_compute_resource_policy's id field, instead of name, in the google_compute_disk_resource_policy_attachment's name field allows it to work:
resource "google_compute_disk_resource_policy_attachment" "gcp_wds_snap_schedule_pd_boot" {
name = google_compute_resource_policy.snapshot_schedule_wds.id
...
Note: it works because the actual values of name and id are the same, but the id is unknown upon recreation.
To add another example: here's a use case I recently ran into with Azure PostgreSQL. I wanted to upgrade the version of the PostgreSQL engine on the server, which requires replacement. The dependent resources such as firewall rules and Postgres configurations were not re-created, so I had to run through two applies. This is a common occurrence in Azure, where most IDs are based on the name of the resource: if it is re-created the ID stays the same and dependent resources don't register the change.
resource "azurerm_postgresql_server" "pgsql_server" {
name = "examplepgsql"
resource_group_name = "my-rg"
location = "eastus"
sku {
name = "GP_Gen5_2"
capacity = "2"
tier = "GeneralPurpose"
family = "Gen5"
}
storage_profile {
storage_mb = "51200"
backup_retention_days = 35
geo_redundant_backup = "Enabled"
}
administrator_login = var.admin_username
administrator_login_password = var.admin_password
version = "11"
ssl_enforcement = "Enabled"
}
resource "azurerm_postgresql_firewall_rule" "azure_services_firewall_rule" {
name = "AzureServices"
resource_group_name = azurerm_postgresql_server.pgsql_server.resource_group_name
server_name = azurerm_postgresql_server.pgsql_server.name
start_ip_address = "0.0.0.0"
end_ip_address = "0.0.0.0"
}
resource "azurerm_postgresql_configuration" "log_checkpoints_pgsql_config" {
name = "log_checkpoints"
resource_group_name = azurerm_postgresql_server.pgsql_server.resource_group_name
server_name = azurerm_postgresql_server.pgsql_server.name
value = "on"
}
Another use case:
I wanted to update an SSM parameter with the value of a AMI data block, but only when it changes.
This is for use with an Automation workflow like the example posted in the AWS docs.
My thought was: put in a null_resource that triggers when the AMI ID changes, and make the SSM parameter depend on this, but all a null_resource emits is an ID.
Aha, I thought, I'll do this :
data "aws_ami" "windows" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["Windows_Server-2012-R2_RTM-English-64Bit-Base-*"]
}
}
resource "null_resource" "new_windows_ami" {
triggers = {
base_ami_date = data.aws_ami.windows.creation_date
force_update = 1
}
}
resource "aws_ssm_parameter" "current_windows_ami" {
name = "/ami/windows/2k12/current"
value = data.aws_ami.windows.image_id
type = "String"
tags = {
BaseAmiTriggerId = null_resource.new_windows_ami.id
}
depends_on = [
null_resource.new_windows_ami,
]
# We only want the initial value from the data, we're going to replace this
# parameter with the current "patched" release until there's a new base AMI
overwrite = true
lifecycle {
ignore_changes = [
value,
]
}
}
... sadly ignore_changes also blocks those changes entirely. What I was hoping was that the change to the tag would be enough to trigger an update of the whole resource, but ignore_changes means that changes to the inputs of the attributes are ignored for all purposes, not just for whether they trigger a lifecycle update.
This seems a shame because otherwise you could implement quite sophisticated lifecycle management with the null resource, concocting triggers with interpolations and such and only triggering an update to a dependent resource when the ID changed as a result.
I came to this thread from https://github.com/terraform-providers/terraform-provider-azurerm/issues/763. I do not know how this is connected, but that issue was closed for the sake of https://github.com/terraform-providers/terraform-provider-azurerm/issues/326, which in turn was closed for the sake of this one.
So, if you guys understand how the connection was made, then here is another scenario, and a very real one. We modify the probing path on an Azure Traffic Manager and boom, its endpoints are gone. This is very frustrating. Is there an ETA on the fix for this issue?
@MarkKharitonov This issue is essentially a feature request, what you're describing with Azure sounds like a bug though (but I haven't used Azure or read through those issues) - so perhaps the link is 'sorry, nothing we can do without [this issue resolved], closing'.
I phrased it as a bug in the OP (and I should perhaps edit that) out of misunderstanding, but it's really a request for a form of dependency control that isn't possible (solely) with terraform today.
I do not understand. I have a traffic manager resource. The change does not recreate the resource - it is reported as an in-place replacement. Yet it blows away the endpoints. How come it is a feature request?
@MarkKharitonov As I said, "what you're describing with Azure sounds like a bug", but this issue is a feature request, for something that does not exist in terraform core today.
Possibly the Azure resolution was 'nothing we can do without a way of doing [what is described here]' - I have no idea - but this issue itself isn't a bug, and is labelled 'thinking'. There's no guarantee there'll ever be a way of doing this, nevermind an ETA.
(I don't work for Hashicorp, I just opened this issue, there could be firmer internal plans for all I know, just trying to help.)
I do not know what to do. There are real issues in the provider that are being closed with the claim that they are because of this one. But this one is something apparently huge in scope. So I do not understand what I am supposed to do. Should I open yet another issue in terraform-providers, making reference to the already closed ones and to this one? How do we attract attention to the real bug without it being closed for nothing, which has already happened twice?
@MarkKharitonov I'm not an expert on Terraform or Terraform provider development, so someone else please correct me if I'm wrong, but I don't think there's anything that can be done in the provider. The issues in the Azure provider are caused by a limitation of Terraform, not a bug in the AzureRM provider that can be fixed. Based on the comments in this issue, there is a fundamental challenge with how the Azure API works and how Terraform handles dependencies. Azure's resource IDs are derived from resource names, so they don't change when a resource is re-created. If you have a child resource that references a parent resource by ID, even when that parent resource is re-created the ID doesn't change. From Terraform's perspective, that means no attribute was changed on the child resource, since the ID it references is the same, even though in actuality the child resource was also destroyed together with the parent resource. The feature request here, as I understand it, is to add additional intelligence to Terraform dependencies: use them not just for ordering resource creation, but also to detect that a dependency (e.g. a parent resource) was destroyed and re-created and trigger a destroy/re-create on the dependent resource (e.g. the child resource), irrespective of whether any attributes on the child resource have changed.
This issue appears really critical and not a feature request at all. The fundamental core of Terraform is to make sure to apply any missing changes if required. In this case, Terraform not creating a dependency on a parent resource's recreation is fundamentally an issue.
Could someone clarify whether the problem of authorization rules not being re-created when their associated event hub is re-created has been around for a long time? Is there any previous version of AzureRM or Terraform that would mitigate the issue until this gets resolved?
Because the only approach that I can see to work around this issue is to invoke the Terraform deployment twice, which to me is nonsense.
Hey!
I have another example of this behaviour. Changes to modules that force recreation of resources inside the module, which are used by a dashboard, won't update the dashboard, and it ends up referencing the configuration from before the apply. Another apply will actually pick up those changes and alter the dashboard_json template. The weird thing is that changes to aws_instance.cron are picked up at the time of the first apply, but changes to the module are not.
data "template_file" "dashboard_json" {
template = file("${path.module}/templates/cloudwatch_dashboard/dashboard.tpl")
vars = {
rds_instance_id = module.database.rds_instance_id
region = var.aws_region
asg_normal_name = module.autoscaling_group.aws_autoscaling_group_name-normal
cron_instance_id = aws_instance.cron.id
lb_arn_suffix = module.load_balancer.aws_lb_arn_suffix
lb_target_group_arn_suffix = module.load_balancer.aws_lb_target_group_target_group_arn_suffix
lb_blackhole_target_group_arn_suffix = module.load_balancer.aws_lb_target_group_target_group_blackhole_arn_suffix
lb_redash_target_group_arn_suffix = aws_lb_target_group.redash.arn_suffix
procstats_cpu = (length(var.cron_procstats[local.environment]) > 0) ? data.template_file.dashboard_procstats_cpu.rendered : ""
procstats_mem = (length(var.cron_procstats[local.environment]) > 0) ? data.template_file.dashboard_procstats_mem.rendered : ""
# force recreation of the dashboard due to weird behaviour when changes to modules above
# are not picked up by terraform and dashboard is not being updated
force_recreation = var.force_dashboard_recreation[local.environment] ? "${timestamp()}" : ""
}
}
resource "aws_cloudwatch_dashboard" "main" {
dashboard_name = "${var.project_name}-${local.environment}-dashboard"
dashboard_body = data.template_file.dashboard_json.rendered
}
I tried using depends_on (maybe the ordering would help with it), but it didn't help, so I ended up using timestamp() to force recreation.
We have the exact same problem on GCP which is described in details in this issue https://github.com/terraform-providers/terraform-provider-google/issues/6376.
Here is part of the relevant config:
resource "google_compute_region_backend_service" "s1" {
name = "s1"
dynamic "backend" {
for_each = google_compute_instance_group.s1
content {
group = backend.value.self_link
}
}
health_checks = [
google_compute_health_check.default.self_link,
]
}
resource "google_compute_health_check" "default" {
name = "s1"
tcp_health_check {
port = "80"
}
}
resource "google_compute_instance_group" "s1" {
count = local.s1_count
name = format("s1-%02d", count.index + 1)
zone = element(local.zones, count.index)
network = data.google_compute_network.network.self_link
}
I'm not sure is this a general TF problem or a Google provider problem, but here it goes.
Currently it's not possible to lower the number of google_compute_instance_group resources that are used in a google_compute_region_backend_service. In the code above, if we lower the number of google_compute_instance_group resources and try to apply the configuration, TF will first try to delete the no-longer-needed instance groups and then update the backend configuration, but that order doesn't work because you cannot delete an instance group that is used by the backend service; the order should be the other way around.
So to sum it up, when I lower the number of the instance group resources TF does this:
1. delete google_compute_instance_group -> this fails
2. update google_compute_region_backend_service
It should do this the other way around:
1. update google_compute_region_backend_service
2. delete google_compute_instance_group
What I don't understand is why TF doesn't know that it should do the update first, then remove the instance groups. When I run destroy, TF does it correctly: it first destroys the backend service, then the instance groups.
Also this is very hard to fix, because you need to make a temp config change, apply, then set the final config you want and again apply.
@kustodian Can you use create_before_destroy in google_compute_instance_group?
resource "google_compute_instance_group" "s1" {
count = local.s1_count
name = format("s1-%02d", count.index + 1)
zone = element(local.zones, count.index)
network = data.google_compute_network.network.self_link
lifecycle {
create_before_destroy = true
}
}
@lorengordon I can, but it doesn't help. TF works exactly the same in my example with or without create_before_destroy = true.
To be honest I'm not entirely sure that my issue is the same thing as what the issue reporter is describing.
@apparentlymart May I suggest locking this issue? I suspect you and the team probably have enough examples and use cases to consider this feature now?
I could 'unsubscribe' of course, it's just that I would like to be notified if/when there's a decision, some progress, or something to help test. Cheers. :slightly_smiling_face:
Edit: It turns out this is really a function of kubernetes, and not really a terraform concern.
Just adding my 0.02. This is also an issue with the kubernetes provider and secrets/config maps. A service using an updated config map or secret doesn't detect the change because the underlying pods of the service need to be restarted or recreated to detect the changes.
resource "kubernetes_secret" "value" {
metadata {
name = "k8s-secret-value"
namespace = "private"
}
data = {
secret = var.secret_value
}
}
resource "kubernetes_deployment" "service" {
metadata {
name = "internal-service"
namespace = "private"
}
spec {
template {
spec {
container {
env {
name = "SECRET_VALUE"
value_from {
secret_key_ref {
name = kubernetes_secret.value.metadata.0.name
key = "secret"
}
}
}
}
}
}
}
}
If the value for the secret key is updated, nothing seems to happen with the deployment.
I'm going to lock this issue for the time being, because the remaining discussion seems largely to be people supporting each other in workarounds.
I’m happy to see people are helping each other work around this, and I've created a thread for this on the community forum so that people can continue these discussions without creating excess noise for people who just want succinct updates in GitHub.
bar.foo is not modified if the file 'foobar' changed without otherwise changing the resource that includes it.