mongodb / terraform-provider-mongodbatlas

Terraform MongoDB Atlas Provider: Deploy, update, and manage MongoDB Atlas infrastructure as code through HashiCorp Terraform
https://registry.terraform.io/providers/mongodb/mongodbatlas
Mozilla Public License 2.0

"CLUSTER_DISK_IOPS_INVALID" related error/unexpected update-in-place #439

Closed bnelson729 closed 3 years ago

bnelson729 commented 3 years ago

Terraform CLI and Terraform MongoDB Atlas Provider Version

Terraform version: 0.13.5
Provider version: v0.8.2

Terraform Configuration File

resource "mongodbatlas_cluster" "cluster" {
  project_id = mongodbatlas_project.project.id
  name       = var.project_name

  cluster_type           = "REPLICASET"
  provider_name          = "AWS"
  mongo_db_major_version = var.major_version

  replication_specs {
    num_shards = 1
    dynamic "regions_config" {
      for_each = var.regions_configs

      content {
        region_name     = regions_config.value.region_name
        electable_nodes = regions_config.value.electable_nodes
        priority        = regions_config.value.priority
        read_only_nodes = regions_config.value.read_only_nodes
      }
    }
  }

  provider_backup_enabled = true
  pit_enabled             = var.continuous_backups_enabled

  provider_instance_size_name                     = var.instance_size
  auto_scaling_compute_enabled                    = true
  auto_scaling_compute_scale_down_enabled         = true
  provider_auto_scaling_compute_min_instance_size = var.instance_size_min
  provider_auto_scaling_compute_max_instance_size = var.instance_size_max

  auto_scaling_disk_gb_enabled = true
  disk_size_gb                 = var.disk_size_gb
  provider_volume_type         = "STANDARD"
  provider_encrypt_ebs_volume  = true

  lifecycle {
    ignore_changes = [
      mongo_db_major_version,
      provider_instance_size_name,
      disk_size_gb,
    ]
  }
}

Steps to Reproduce

On 4/21/2021 at 12:00pm EDT, I ran terraform apply without any issue. On 4/21/2021 at 3:05pm EDT, I ran terraform apply without any changes.

Expected Behavior

Re-running terraform apply without any changes shouldn't cause terraform to apply anything.

Actual Behavior

When terraform refreshed the state from Atlas by calling the API, the API must have returned that provider_volume_type was null. It then created a terraform plan and tried to apply a patch setting provider_volume_type from null to STANDARD. I believe something changed on your API that is making terraform think it needs to apply this patch.

The full plan looks like this (I replaced anything identifiable with asterisks):

Terraform will perform the following actions:
  # module.atlas_cluster.mongodbatlas_cluster.cluster will be updated in-place
  ~ resource "mongodbatlas_cluster" "cluster" {
        auto_scaling_compute_enabled                    = true
        auto_scaling_compute_scale_down_enabled         = true
        auto_scaling_disk_gb_enabled                    = true
        backup_enabled                                  = false
        bi_connector                                    = {
            "enabled"         = "false"
            "read_preference" = "secondary"
        }
        cluster_id                                      = "60415f23afe5fe656d3190d6"
        cluster_type                                    = "REPLICASET"
        connection_strings                              = [
            {
                aws_private_link     = {}
                aws_private_link_srv = {}
                private              = ""
                private_endpoint     = []
                private_srv          = ""
                standard             = "**********"
                standard_srv         = "**********"
            },
        ]
        container_id                                    = "60415f23afe5fe656d3190d2"
        disk_size_gb                                    = 50
        encryption_at_rest_provider                     = "NONE"
        id                                              = "Y2x1c3Rlcl9pZA==:NjA0MTVmMjNhZmU1ZmU2NTZkMzE5MGQ2-Y2x1c3Rlcl9uYW1l:b25lYXBwLW1vZS1pbnQtbmZ0-cHJvamVjdF9pZA==:NjA0MTUxOWVmNDI0NmIyM2E4NTgyZmQw-cHJvdmlkZXJfbmFtZQ==:QVdT"
        mongo_db_major_version                          = "4.4"
        mongo_db_version                                = "4.4.4"
        mongo_uri                                       = "**********"
        mongo_uri_updated                               = "2021-03-04T22:38:48Z"
        mongo_uri_with_options                          = "**********"
        name                                            = "**********"
        num_shards                                      = 1
        paused                                          = false
        pit_enabled                                     = false
        project_id                                      = "6041519ef4246b23a8582fd0"
        provider_auto_scaling_compute_max_instance_size = "M30"
        provider_auto_scaling_compute_min_instance_size = "M10"
        provider_backup_enabled                         = true
        provider_disk_iops                              = 150
        provider_encrypt_ebs_volume                     = true
        provider_instance_size_name                     = "M10"
        provider_name                                   = "AWS"
      + provider_volume_type                            = "STANDARD"
        replication_factor                              = 0
        snapshot_backup_policy                          = [
            {
                cluster_id               = "**********"
                cluster_name             = "**********"
                next_snapshot            = "2021-04-21T22:38:49Z"
                policies                 = [
                    {
                        id          = "604161782481b8783f0ec98e"
                        policy_item = [
                            {
                                frequency_interval = 6
                                frequency_type     = "hourly"
                                id                 = "604161782481b8783f0ec98f"
                                retention_unit     = "days"
                                retention_value    = 2
                            },
                            {
                                frequency_interval = 1
                                frequency_type     = "daily"
                                id                 = "604161782481b8783f0ec990"
                                retention_unit     = "days"
                                retention_value    = 7
                            },
                            {
                                frequency_interval = 6
                                frequency_type     = "weekly"
                                id                 = "604161782481b8783f0ec991"
                                retention_unit     = "weeks"
                                retention_value    = 4
                            },
                            {
                                frequency_interval = 40
                                frequency_type     = "monthly"
                                id                 = "604161782481b8783f0ec992"
                                retention_unit     = "months"
                                retention_value    = 12
                            },
                        ]
                    },
                ]
                reference_hour_of_day    = 22
                reference_minute_of_hour = 38
                restore_window_days      = 7
                update_snapshots         = false
            },
        ]
        srv_address                                     = "**********"
        state_name                                      = "IDLE"
        advanced_configuration {
            fail_index_key_too_long              = false
            javascript_enabled                   = true
            minimum_enabled_tls_protocol         = "TLS1_2"
            no_table_scan                        = false
            oplog_size_mb                        = 0
            sample_refresh_interval_bi_connector = 0
            sample_size_bi_connector             = 0
        }
        replication_specs {
            id         = "60415f23afe5fe656d3190cd"
            num_shards = 1
            zone_name  = "ZoneName managed by Terraform"
            regions_config {
                analytics_nodes = 0
                electable_nodes = 1
                priority        = 6
                read_only_nodes = 0
                region_name     = "US_WEST_2"
            }
            regions_config {
                analytics_nodes = 0
                electable_nodes = 2
                priority        = 7
                read_only_nodes = 0
                region_name     = "US_EAST_2"
            }
        }
    }

Debug Output

N/A

Crash Output

module.atlas_cluster.mongodbatlas_cluster.cluster: Modifying... [id=Y2x1c3Rlcl9pZA==:NjA0MTVmMjNhZmU1ZmU2NTZkMzE5MGQ2-Y2x1c3Rlcl9uYW1l:b25lYXBwLW1vZS1pbnQtbmZ0-cHJvamVjdF9pZA==:NjA0MTUxOWVmNDI0NmIyM2E4NTgyZmQw-cHJvdmlkZXJfbmFtZQ==:QVdT]

Error: error updating MongoDB Cluster (*******************): PATCH https://cloud.mongodb.com/api/atlas/v1.0/groups/6041519ef4246b23a8582fd0/clusters/*******************: 400 (request "CLUSTER_DISK_IOPS_INVALID") The cluster's disk IOPS of 150 is invalid. For a disk of size 50 on instance size M10 with a volume type of STANDARD, the IOPS must be 3000.0.
  on .terraform/modules/atlas_cluster/cluster.tf line 1, in resource "mongodbatlas_cluster" "cluster":
   1: resource "mongodbatlas_cluster" "cluster" {

Additional Context

No

References

N/A

robbiet480 commented 3 years ago

Also happening to me as of today. Wasn't happening last week.

themantissa commented 3 years ago

@bnelson729 thank you for the issue. The good news is that the new default IOPS is now 3000, and the error is expressing exactly that. You did not set provider_disk_iops, but you did set provider_volume_type = "STANDARD", which is only needed when changing IOPS. If you remove it, I believe the new IOPS will be updated in your state without error, since you do not have it defined in the configuration. If you still get an update in place, please post the updated plan here.

If anyone else has the same issue and is setting provider_disk_iops, either change it to provider_disk_iops = 3000, or remove it along with provider_volume_type = "STANDARD".
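
The two options above could look roughly like the following. This is a minimal sketch, not a tested configuration: the resource label, cluster name, region, and the project_id variable are placeholders, and the instance size and disk size are borrowed from the configuration posted later in this thread.

```hcl
# Hypothetical input; substitute your own Atlas project ID.
variable "project_id" {
  type = string
}

resource "mongodbatlas_cluster" "example" {
  project_id = var.project_id
  name       = "example-cluster" # placeholder name

  provider_name               = "AWS"
  provider_region_name        = "US_EAST_1" # placeholder region
  provider_instance_size_name = "M20"
  disk_size_gb                = 110
  provider_encrypt_ebs_volume = true

  # Option A: keep both attributes and pin IOPS to the new default.
  provider_volume_type = "STANDARD"
  provider_disk_iops   = 3000

  # Option B: delete the two lines above entirely, so that Atlas
  # picks the volume type and IOPS defaults itself.
}
```

Option B leaves fewer values in the configuration that can drift when Atlas changes a default again.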

I'll update the documentation to reflect this new IOPS default.

Also, @robbiet480 this just launched today so that's why you are seeing this error. Is your configuration specific on iops?

robbiet480 commented 3 years ago

Also, @robbiet480 this just launched today so that's why you are seeing this error. Is your configuration specific on iops?

disk_size_gb                = 110
provider_disk_iops          = 330
provider_volume_type        = "STANDARD"
provider_encrypt_ebs_volume = true
provider_instance_size_name = "M20"

Will bump provider_disk_iops to 3000 and report back. Thanks!

robbiet480 commented 3 years ago

It took 4 minutes, 51 seconds to "modify" but did eventually complete. Still seeing 330 IOPS in the Atlas Console though.

robbiet480 commented 3 years ago

Oopsies, never mind, I was looking at the wrong cluster! Seeing 3000 IOPS now. Just to confirm, there's no pricing change, right?

nikhil-mongo commented 3 years ago

@robbiet480 There is no change in price with this change in default IOPS.

nikhil-mongo commented 3 years ago

Also, @robbiet480 are we good to close this?

robbiet480 commented 3 years ago

For me yes, but I didn't open the issue so you may want to wait to see what @bnelson729 says?

pitthecat commented 3 years ago

Same happening for us today. Changing the IOPS to 3000 "fixes" the problem and changes the MongoDB cluster's IOPS from 100 to the new minimum 3000.

Did Atlas change from gp2 to gp3 for AWS?

themantissa commented 3 years ago

@nikhil-mongo let's keep this open till we ensure we have got the word out and ensure the original issue is corrected.

pitthecat commented 3 years ago

I changed our M10 and M20 clusters from 100 to 3000 without a problem, but it's not working for our M30 cluster. When I change it from 150 IOPS to 3000, Terraform seems to do it and finishes the apply, but the GUI still shows 150, and if I run Terraform again it wants to do 150 -> 3000 again.

module.mongodb.mongodbatlas_cluster.cluster: Still modifying... [id=2x1xMjg4....., 10s elapsed]
module.mongodb.mongodbatlas_cluster.cluster: Still modifying... [id=2x1xMjg4....., 20s elapsed]
module.mongodb.mongodbatlas_cluster.cluster: Still modifying... [id=Y2x1xMjg4....., 30s elapsed]
module.mongodb.mongodbatlas_cluster.cluster: Still modifying... [id=Y2x1xMjg4....., 40s elapsed]
module.mongodb.mongodbatlas_cluster.cluster: Still modifying... [id=Y2x1xMjg4....., 50s elapsed]
module.mongodb.mongodbatlas_cluster.cluster: Still modifying... [id=Y2x1xMjg4....., 1m0s elapsed]
module.mongodb.mongodbatlas_cluster.cluster: Still modifying... [id=Y2x1xMjg4....., 1m10s elapsed]
module.mongodb.mongodbatlas_cluster.cluster: Still modifying... [id=Y2x1xMjg4....., 1m20s elapsed]
module.mongodb.mongodbatlas_cluster.cluster: Still modifying... [id=Y2x1xMjg4....., 1m30s elapsed]
module.mongodb.mongodbatlas_cluster.cluster: Modifications complete after 1m33s [id=Y2x1xMjg4.....]

Apply complete! Resources: 0 added, 2 changed, 0 destroyed.

On the next apply, Terraform tries to do it again with the same result: still 150.

      ~ provider_disk_iops                      = 150 -> 3000
        provider_encrypt_ebs_volume             = true
        provider_instance_size_name             = "M30"
        provider_name                           = "AWS"
        provider_region_name                    = "EU_CENTRAL_1"
        provider_volume_type                    = "STANDARD"

themantissa commented 3 years ago

Thanks to everyone for the input and alerting.

As you noticed, the default IOPS changed for new clusters (and some existing clusters). Atlas released this yesterday, and it clearly affected Terraform. Overall it is a good thing: the default IOPS is now higher with no increase in cost. The hard part was the impact on IaC use cases, and for that we apologize. The team has changed the API so that any requested IOPS value that is less than or equal to the default, or that is not possible for that tier as STANDARD, is ignored. This should prevent the errors you encountered.

I have updated the documentation to remove IOPS from the examples as well since it's only needed when one desires (and is able by cluster tier) to use PROVISIONED storage with a higher IOPS than the default.
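
For the case where IOPS does need to be set, a configuration might look like the sketch below. It assumes a tier that supports PROVISIONED storage; the resource label, cluster name, region, and both variables are hypothetical, and the IOPS value is deliberately left as an input because the valid range depends on the cluster tier and disk size.

```hcl
# Hypothetical inputs, not taken from this thread.
variable "project_id" {
  type = string
}

variable "provisioned_iops" {
  type        = number
  description = "IOPS above the Atlas default; valid range depends on tier and disk size."
}

resource "mongodbatlas_cluster" "provisioned_example" {
  project_id = var.project_id
  name       = "provisioned-example" # placeholder name

  provider_name               = "AWS"
  provider_region_name        = "EU_CENTRAL_1" # placeholder region
  provider_instance_size_name = "M30"
  disk_size_gb                = 50

  # Only set these two together, and only when you want more than the default.
  provider_volume_type = "PROVISIONED"
  provider_disk_iops   = var.provisioned_iops
}
```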

Furthermore, the new default is rolling out to existing clusters over time. @pitthecat, I believe this is what is happening in your situation: until your existing clusters are upgraded, they will not have the higher IOPS. To prevent the aforementioned errors, the API now accepts the change even when it cannot be applied yet.
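
Until a cluster has been upgraded, a repeating 150 -> 3000 diff could be silenced with Terraform's lifecycle ignore_changes meta-argument. This is a suggested stopgap rather than something confirmed in this thread, and the resource label, cluster name, and project_id variable below are placeholders.

```hcl
# Hypothetical input; substitute your own Atlas project ID.
variable "project_id" {
  type = string
}

resource "mongodbatlas_cluster" "rollout_example" {
  project_id = var.project_id
  name       = "rollout-example" # placeholder name

  provider_name               = "AWS"
  provider_region_name        = "EU_CENTRAL_1"
  provider_instance_size_name = "M30"

  provider_volume_type = "STANDARD"
  provider_disk_iops   = 3000

  lifecycle {
    # Stop Terraform from planning an update while Atlas still reports
    # the old IOPS value for this cluster. Remove this entry once the
    # cluster has been upgraded and the state shows 3000.
    ignore_changes = [
      provider_disk_iops,
    ]
  }
}
```

Removing provider_disk_iops and provider_volume_type from the configuration, as suggested earlier in the thread, avoids the need for this workaround altogether.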

Thanks!

fyi @shum @nikhil-mongo

themantissa commented 3 years ago

I believe we have corrected all the issues and new docs are out. Thank you all for the help!!