mongodb / terraform-provider-mongodbatlas

Terraform MongoDB Atlas Provider: Deploy, update, and manage MongoDB Atlas infrastructure as code through HashiCorp Terraform
https://registry.terraform.io/providers/mongodb/mongodbatlas
Mozilla Public License 2.0

Unable to add read-only node to existing cluster #326

Closed: in4mer closed this issue 2 years ago

in4mer commented 4 years ago

Terraform CLI and Terraform MongoDB Atlas Provider Version

% tf version
Terraform v0.12.29
+ provider.aws v3.8.0
+ provider.mongodbatlas v0.6.5
+ provider.random v2.3.0                                                                             

Terraform Configuration File

resource "mongodbatlas_cluster" "testcluster" {
  count         = local.enable_testing ? 1 : 0
  project_id    = mongodbatlas_project.peer_project.id
  name          = "${local.env}-test-rs${count.index}"
  num_shards    = 1

  cluster_type  = "REPLICASET"

  provider_backup_enabled       = true
  auto_scaling_disk_gb_enabled  = true
  mongo_db_major_version        = "3.6"

  provider_name                 = "AWS"
  disk_size_gb                  = 380
  provider_disk_iops            = 3000
  provider_encrypt_ebs_volume   = true
  provider_instance_size_name   = "M40_NVME"
  provider_region_name          = local.atlas_primary_region

  advanced_configuration {
    oplog_size_mb                 = local.is_prod ? 163840 : null
  }

  replication_specs {
    num_shards  = 1
    zone_name   = "downtown clown brown frown gown hound"
    regions_config {
      region_name     = local.atlas_primary_region
      electable_nodes = 3
      read_only_nodes = 1
      priority        = local.primary_priority
    }
  }
}

Steps to Reproduce

  1. Set read_only_nodes to 0, and create the cluster
  2. Set read_only_nodes to 1, and update the cluster
  3. Observe the error
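Concretely, the only edit between steps 1 and 2 is the read_only_nodes count in the regions_config block of the configuration above:

```terraform
regions_config {
  region_name     = local.atlas_primary_region
  electable_nodes = 3
  read_only_nodes = 1   # 0 at create time; changing it to 1 triggers the error
  priority        = local.primary_priority
}
```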

Expected Behavior

A fourth node, the read-only member, should be added to the cluster

Actual Behavior

mongodbatlas_cluster.testcluster[0]: Modifying... [id=Y2x1c3Rlcl9pZA==:NWY3NjI0NDc2ZDlhZmE3NmExZGE4Yzhl-Y2x1c3Rlcl9uYW1l:c3RhZ2UtdGVzdC1yczA=-cHJvamVjdF9pZA==:NWYyMjQyMWFmNzk2Yjc1YjhiNzViMzdm-cHJvdmlkZXJfbmFtZQ==:QVdT]

Error: error updating MongoDB Cluster (stage-test-rs0): PATCH https://cloud.mongodb.com/api/atlas/v1.0/groups/5f22421af796b75b8b75b37f/clusters/stage-test-rs0: 400 (request "Bad Request") Cloud backups must be enabled for deployments with NVMe storage.

  on testcluster.tf line 1, in resource "mongodbatlas_cluster" "testcluster":                        
   1: resource "mongodbatlas_cluster" "testcluster" {                                    

Additional Context

I've set up this test cluster in order to verify a possibly much more serious bug in the underlying atlas machinery, but this bug is blocking.

References

I didn't see any.

themantissa commented 4 years ago

@in4mer confirmed the same result. However, as far as I can tell this isn't Terraform per se; the API is returning that error. I'll put in an internal ticket, but it may help to submit a support ticket as well. I'll leave my sample cluster up so they can look at it if needed.

themantissa commented 4 years ago

Internal ticket: HELP-18987

in4mer commented 4 years ago

Indeed. Thank you for that; I'll also file.

Here's the support link for anyone interested: https://support.mongodb.com/case/00693712

nikhil-mongo commented 4 years ago

@in4mer This is not a provider issue, but the Terraform configuration has to be written differently for NVMe clusters.

As per the MongoDB Atlas Create Cluster API, diskIOPS "requires providerSettings.instanceSizeName to be M30 or greater and cannot be used with clusters with local NVMe SSDs."

Therefore, when you define provider_disk_iops = 3000, that is not the value actually stored in the tfstate file; the stored value is provider_disk_iops = 135125. When terraform apply then modifies the cluster, it tries to change the IOPS (~ provider_disk_iops = 135125 -> 3000) and the API rejects the request.

Use the configuration below (with provider_disk_iops removed) instead.

resource "mongodbatlas_cluster" "testcluster" {
  count         = local.enable_testing ? 1 : 0
  project_id    = mongodbatlas_project.peer_project.id
  name          = "${local.env}-test-rs${count.index}"
  num_shards    = 1

  cluster_type  = "REPLICASET"

  provider_backup_enabled       = true
  auto_scaling_disk_gb_enabled  = true
  mongo_db_major_version        = "3.6"

  provider_name                 = "AWS"
  disk_size_gb                  = 380
  provider_encrypt_ebs_volume   = true
  provider_instance_size_name   = "M40_NVME"
  provider_region_name          = local.atlas_primary_region

  advanced_configuration {
    oplog_size_mb                 = local.is_prod ? 163840 : null
  }

  replication_specs {
    num_shards  = 1
    zone_name   = "downtown clown brown frown gown hound"
    regions_config {
      region_name     = local.atlas_primary_region
      electable_nodes = 3
      read_only_nodes = 1
      priority        = local.primary_priority
    }
  }
}

However, the error should be clearer so that the proper changes can be made. I will check internally whether we can display a more appropriate error, and update the Terraform documentation with an example/details on how to deploy an NVMe cluster with Terraform.

in4mer commented 4 years ago

@nikhil-mongo OK, thank you for that.

Here's another case that I've run into that's related. Namely, upgrading from M40_NVME to M50_NVME:

mongodbatlas_cluster.testcluster[0]: Modifying... [id=Y2x1c3Rlcl9pZA==:NWY3Y2YzYTcyY2RkYTMyOGMwMGViMjFj-Y2x1c3Rlcl9uYW1l:c3RhZ2UtdGVzdC1yczA=-cHJvamVjdF9pZA==:NWYyMjQyMWFmNzk2Yjc1YjhiNzViMzdm-cHJvdmlkZXJfbmFtZQ==:QVdT]

Error: error updating MongoDB Cluster (stage-test-rs0): PATCH https://cloud.mongodb.com/api/atlas/v1.0/groups/5f22421af796b75b8b75b37f/clusters/stage-test-rs0: 400 (request "Bad Request") The cluster's disk IOPS of 135125 is invalid. For a disk of size 760GB on instance size M50_NVME with a volume type of PROVISIONED, the IOPS must be equal to 6000 for an NVMe cluster of this instance size.

  on testcluster.tf line 1, in resource "mongodbatlas_cluster" "testcluster":
   1: resource "mongodbatlas_cluster" "testcluster" {

That's after creating the cluster as M40_NVME with provider_disk_iops commented out.

Thoughts?

This is related to the earlier issue https://github.com/mongodb/terraform-provider-mongodbatlas/issues/283

Please advise how to proceed.

nikhil-mongo commented 4 years ago

@in4mer This will require us to ignore provider_disk_iops and disk_size_gb from the tfstate file; comparing those values during terraform apply and passing them to the Atlas API is what causes this. They are not required for an NVMe cluster, as they are fixed values and can be ignored as per the API.
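As a possible user-side stopgap (a hedged sketch, not a fix confirmed in this thread), Terraform's lifecycle ignore_changes meta-argument can suppress the spurious diff on these two attributes:

```terraform
resource "mongodbatlas_cluster" "testcluster" {
  # ... arguments as in the configuration above ...

  lifecycle {
    # Ignore the backend-assigned values so plans don't try to rewrite them.
    ignore_changes = [provider_disk_iops, disk_size_gb]
  }
}
```

Note this also means any deliberate change to those arguments would be ignored until the block is removed.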

@leofigy @PacoDw Need your inputs here, or let me know if someone else from our team can help.

leofigy commented 4 years ago

Hi @nikhil-mongo, adding @coderGo93 for inputs.

From the provider perspective, the issue is the inconsistent values for provider_disk_iops: we end up with a "system value" like 135125 in state, while the actual value sent by the user is 3000/6000.

One solution is to add an extra attribute to the schema to store the computed value from the system, keeping the user-supplied value as an argument. But we need to check with Melissa about this approach, or wait for the backend to fix the inconsistency.

themantissa commented 3 years ago

@leofigy since we have now removed provider_disk_iops from the examples and explained that one does not need to include it to get the standard IOPS, we should see this less. For when someone does submit it, we may still want the two values you note above so the user doesn't get an error. Does that make sense to you? If so I'll get a ticket created.

leofigy commented 3 years ago

Hi @themantissa, agreed, we still might need this to avoid the error.

themantissa commented 2 years ago

Error has not come up again. Closing as fixed.