The ticket INTMDB-1216 was created for internal tracking.
Hi @yulifelimited, thanks for reporting this to us. Is this happening only when you try to change the analytics node? Could you provide more exhaustive information on which changes within `replication_specs` trigger this behaviour?
Also, any debug output would be appreciated in this case.
We want to understand if the issue is within the Terraform script or if it's actually expected behaviour, since replication specs might not be edited but only recreated depending on which parameters you specify.
Any changes in the `replication_specs` show as a re-creation. We've recently proceeded to update 10+ clusters to the new `replication_specs` attributes (instead of using deprecated attributes such as `replication_factor`).
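For reference, that migration looks roughly like this (a minimal sketch; the project reference, cluster name, and instance size are illustrative, not our actual config, and only one of the two blocks would exist at a time):

```hcl
# Old style, using the deprecated top-level attribute:
resource "mongodbatlas_cluster" "this" {
  project_id                  = var.project_id # illustrative
  name                        = "example"
  provider_name               = "AWS"
  provider_instance_size_name = "M10"

  replication_factor = 3 # deprecated
}

# New style, expressing the same topology via replication_specs:
resource "mongodbatlas_cluster" "this" {
  project_id                  = var.project_id
  name                        = "example"
  provider_name               = "AWS"
  provider_instance_size_name = "M10"

  replication_specs {
    num_shards = 1
    zone_name  = "ZoneName managed by Terraform"

    regions_config {
      region_name     = "EU_WEST_2"
      electable_nodes = 3
      priority        = 7
      read_only_nodes = 0
      analytics_nodes = 1
    }
  }
}
```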
Change to `zone_name`:
```
# mongodbatlas_cluster.this will be updated in-place
~ resource "mongodbatlas_cluster" "this" {
# (33 unchanged attributes hidden)
- replication_specs {
- id = "5d14ad26553855c8b92557ad" -> null
- num_shards = 1 -> null
- zone_name = "ZoneName managed by Terraform" -> null
- regions_config {
- analytics_nodes = 1 -> null
- electable_nodes = 3 -> null
- priority = 7 -> null
- read_only_nodes = 0 -> null
- region_name = "EU_WEST_2" -> null
}
}
+ replication_specs {
+ id = (known after apply)
+ num_shards = 1
+ zone_name = "test"
+ regions_config {
+ analytics_nodes = 1
+ electable_nodes = 3
+ priority = 7
+ read_only_nodes = 0
+ region_name = "EU_WEST_2"
}
}
# (2 unchanged blocks hidden)
}
```
Change to `electable_nodes`:
```
# mongodbatlas_cluster.this will be updated in-place
~ resource "mongodbatlas_cluster" "this" {
# (33 unchanged attributes hidden)
- replication_specs {
- id = "5d14ad26553855c8b92557ad" -> null
- num_shards = 1 -> null
- zone_name = "ZoneName managed by Terraform" -> null
- regions_config {
- analytics_nodes = 1 -> null
- electable_nodes = 3 -> null
- priority = 7 -> null
- read_only_nodes = 0 -> null
- region_name = "EU_WEST_2" -> null
}
}
+ replication_specs {
+ id = (known after apply)
+ num_shards = 1
+ zone_name = "ZoneName managed by Terraform"
+ regions_config {
+ analytics_nodes = 1
+ electable_nodes = 5
+ priority = 7
+ read_only_nodes = 0
+ region_name = "EU_WEST_2"
}
}
# (2 unchanged blocks hidden)
}
```
There is an `id` marked as `(known after apply)`, but it doesn't actually change once you apply. This is the diff of the state before and after (IAM identity details, such as the role `arn` and `user_id`, are redacted).
```
❯ terragrunt state pull > after.json
❯ diff before.json after.json
6c6
< "serial": 12,
---
> "serial": 13,
326c326
< "zone_name": "ZoneName managed by Terraform"
---
> "zone_name": "test"
334c334
< "next_snapshot": "2023-10-19T11:04:19Z",
---
> "next_snapshot": "2023-10-20T23:04:51Z",
```
There is a Gist with a bare terraform module to deploy an M10 cluster to get debug logs: https://gist.github.com/yulifelimited/4c86f3a8b588feebc468cca750457a33#file-debug-log
I have only tried changing the `replication_specs` `zone_name`.
Thanks
After some investigation @yulifelimited, I think this is related to the fact that `replication_specs` is defined as `schema.TypeSet`. You can see further context at https://github.com/hashicorp/terraform/issues/21901 and https://github.com/hashicorp/terraform/issues/10520
I am going to look at our https://github.com/mongodb/terraform-provider-mongodbatlas/blob/master/CHANGELOG.md because I have spotted some previous cases.
My answer right now is that I don't think we plan to change it. The only way we could change it is by moving to a `TypeList`, but I don't think we want that. This `TypeSet` behaviour is something controlled by Terraform itself.
Question: what does this practically imply? I reproduced the scenario (by the way, thanks for the detailed description) and in the end the update works as expected.
@marcosuma thanks for looking into it in detail. The problem is the noise this generates when reviewing plans, which makes it easy for mistakes to be missed. To give a different perspective, imagine the same changes on a multi-regional cluster deployed across 4 AWS regions (which was our use case when opening this ticket). We end up with this kind of diff in the plan:
Scenario: adding the `AF_SOUTH_1` region with a read-only node.
```
# mongodbatlas_cluster.this will be updated in-place
~ resource "mongodbatlas_cluster" "this" {
# (32 unchanged attributes hidden)
- replication_specs {
- id = "62fcaab0e1540a4e43fdadf9" -> null
- num_shards = 1 -> null
- zone_name = "ZoneName managed by Terraform" -> null
- regions_config {
- analytics_nodes = 0 -> null
- electable_nodes = 1 -> null
- priority = 5 -> null
- read_only_nodes = 0 -> null
- region_name = "US_EAST_1" -> null
}
- regions_config {
- analytics_nodes = 0 -> null
- electable_nodes = 1 -> null
- priority = 6 -> null
- read_only_nodes = 0 -> null
- region_name = "AP_NORTHEAST_1" -> null
}
- regions_config {
- analytics_nodes = 1 -> null
- electable_nodes = 1 -> null
- priority = 7 -> null
- read_only_nodes = 0 -> null
- region_name = "EU_WEST_2" -> null
}
}
+ replication_specs {
+ id = (known after apply)
+ num_shards = 1
+ zone_name = "ZoneName managed by Terraform"
+ regions_config {
+ analytics_nodes = 0
+ electable_nodes = 1
+ priority = 5
+ read_only_nodes = 0
+ region_name = "US_EAST_1"
}
+ regions_config {
+ analytics_nodes = 0
+ electable_nodes = 1
+ priority = 6
+ read_only_nodes = 0
+ region_name = "AP_NORTHEAST_1"
}
+ regions_config {
+ analytics_nodes = 0
+ electable_nodes = 1
+ priority = 7
+ read_only_nodes = 0
+ region_name = "EU_WEST_2"
}
+ regions_config {
+ analytics_nodes = 0
+ electable_nodes = (known after apply)
+ priority = 4
+ read_only_nodes = 1
+ region_name = "AF_SOUTH_1"
}
}
# (2 unchanged blocks hidden)
}

Plan: 0 to add, 1 to change, 0 to destroy.
```
What you may not see above is that there was an `analytics_node` in `EU_WEST_2` that was going to be removed, because the Data engineering team had added it on the console without telling us (this happened to us!). That is easy to miss in the plan because of the way it is displayed. With a nested diff, this problem would have been spotted in seconds.
More regions mean more noise, to the point the plan may not fit on a single monitor depending on how many nodes are in your cluster.
This is a serious problem when you consider the risks associated with unexpected changes to a MongoDB cluster: it can end up in service-affecting changes that may take a long time to recover from, since we would first have to wait for mongodb.com to finish applying the first set of changes before being able to reconfigure the cluster again.
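For contrast, here is a purely hypothetical sketch (not actual provider output) of what an attribute-level diff for the `EU_WEST_2` analytics node removal could look like if `replication_specs` were updated in place:

```
~ resource "mongodbatlas_cluster" "this" {
    ~ replication_specs {
        ~ regions_config {
            ~ analytics_nodes = 1 -> 0
              region_name     = "EU_WEST_2"
              # (3 unchanged attributes hidden)
          }
          # (2 unchanged regions_config blocks hidden)
      }
  }
```

A reviewer scanning this would spot the analytics node change immediately, regardless of how many regions the cluster spans.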
@yulifelimited let me discuss with the team whether we have other options, but as I mentioned it doesn't really depend on us.
Would it be worth it for you to raise an issue with Terraform? Maybe they can further address this situation.
As I mentioned, there is not really much we can do here. If we were to switch (back) to `schema.TypeList`, we would end up in a worse situation, because order matters in that case: you would find yourself with undesired changes just because the Atlas Admin API returned the objects in a different order.
@yulifelimited I created an internal issue to track this work: https://jira.mongodb.org/browse/INTMDB-1234
We decided we should migrate from `TypeSet` to `TypeList`, similarly to what we did for `mongodbatlas_advanced_cluster`. The consequence of this is that:
Closing this issue for now, as this is not a bug but an FR, and we'll track it internally.
Lastly, I want to thank you for reporting this to us and helping us improve our code base. Let us know if you have other questions.
Thanks Marco! It's appreciated that you're able to sort this out.
If `mongodbatlas_advanced_cluster` is already using this type internally, should I be able to migrate to that resource and benefit from the detailed diff?
@yulifelimited yes, you can indeed migrate to the `_advanced_` one - it is in fact the best solution for customers. It's just that we haven't started the deprecation process yet, hence we don't explicitly ask customers to do so. But if you can migrate, please do.
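For anyone else considering this, a sketch of what an equivalent `mongodbatlas_advanced_cluster` configuration might look like - the attribute names are from memory and should be verified against the provider documentation before migrating:

```hcl
resource "mongodbatlas_advanced_cluster" "this" {
  project_id   = var.project_id # illustrative
  name         = "example"
  cluster_type = "REPLICASET"

  replication_specs {
    region_configs {
      provider_name = "AWS"
      region_name   = "EU_WEST_2"
      priority      = 7

      # Node counts are split into per-role spec blocks
      electable_specs {
        instance_size = "M10"
        node_count    = 3
      }

      analytics_specs {
        instance_size = "M10"
        node_count    = 1
      }
    }
  }
}
```

The existing cluster would then need to be brought under the new resource address with `terraform import`; the import ID format is not shown here, so check the resource's import documentation.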
Hello!

Changes to `replication_specs` in `mongodbatlas_cluster` always show the entire `replication_specs` as having to be re-created, instead of a diff on the individual properties that have changed.

Terraform CLI and Terraform MongoDB Atlas Provider Version

We use terragrunt, but that won't change the output.

Terraform Configuration File

These are excerpts from our Dedicated terraform modules. The exact config doesn't matter so much; you can replicate by making any changes to `replication_specs` with any cluster configuration.

main.tf

variables.tf

Steps to Reproduce

Terraform a MongoDB cluster using `replication_specs`. Make a change to any of the `replication_specs` nested values, and the entire `replication_specs` will show as being re-created in the terraform plan. This is further emphasised the more complex a configuration you use in the `replication_specs` (such as a multi-regional cluster with RO and/or Analytics nodes).

The changes below remove an `analytics_node` from the cluster:

variables

Expected Behavior

A deep diff highlights the exact changes to the `analytics_nodes`. Quick and easy review (`region_name` should always be displayed to identify the relevant region).

plan

Actual Behavior

The entire `replication_specs` shows as being re-created.

plan