mongodb / terraform-provider-mongodbatlas

Terraform MongoDB Atlas Provider: Deploy, update, and manage MongoDB Atlas infrastructure as code through HashiCorp Terraform
https://registry.terraform.io/providers/mongodb/mongodbatlas
Mozilla Public License 2.0

encryption_at_rest failing with UNEXPECTED ERROR (and discussion of Cloud Provider Access possible improvement) #409

Closed JohnPolansky closed 3 years ago

JohnPolansky commented 3 years ago


Terraform CLI and Terraform MongoDB Atlas Provider Version

Terraform v0.14.6
+ provider registry.terraform.io/hashicorp/aws v3.27.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.0.2
+ provider registry.terraform.io/hashicorp/null v3.0.0
+ provider registry.terraform.io/hashicorp/time v0.6.0
+ provider registry.terraform.io/mongodb/mongodbatlas v0.8.2

Terraform Configuration File

resource "mongodbatlas_encryption_at_rest" "abc-cloud" {
  project_id = mongodbatlas_project.abc-cloud.id

  aws_kms = {
    enabled                = true
    customer_master_key_id = aws_kms_key.mongodb-key.key_id
    region                 = var.mongodb.config.region
    role_id                = mongodbatlas_cloud_provider_access.abc_cloud.role_id
  }
  # depends_on = [aws_kms_key.mongodb-key, aws_iam_role.atlas_kms_access, aws_iam_role_policy_attachment.this, aws_iam_policy.policy ]
  depends_on = [ time_sleep.wait_180_seconds ]
}

resource "mongodbatlas_cloud_provider_access" "abc_cloud" {
  project_id           = mongodbatlas_project.abc-cloud.id
  provider_name        = "AWS"
  iam_assumed_role_arn = "arn:aws:iam::${var.aws.config.account_id}:role/${var.mongodb.config.project}-atlas_kms_access"
}

resource "time_sleep" "wait_180_seconds" {
  depends_on = [
    aws_iam_policy.policy,
    aws_iam_role_policy_attachment.this,
    aws_iam_role.atlas_kms_access,
    mongodbatlas_cloud_provider_access.abc_cloud,
  ]
  create_duration = "180s"
}

Steps to Reproduce

Please list the full steps required to reproduce the issue, for example:

  1. terraform init
  2. terraform apply

Expected Behavior

Mongo Atlas cluster created using encryption_at_rest via an AWS KMS key

Actual Behavior

Error: error creating Encryption At Rest: PATCH https://cloud.mongodb.com/api/atlas/v1.0/groups/6023e525c6e3450d810ac64e/encryptionAtRest: 500 (request "UNEXPECTED_ERROR") Unexpected error.

on mongo-atlas/atlas-cluster.tf line 62, in resource "mongodbatlas_encryption_at_rest" "abc-cloud":
62: resource "mongodbatlas_encryption_at_rest" "abc-cloud" {

Debug Output

module.mongodb.mongodbatlas_team.abc-cloud: Creating...
module.mongodb.mongodbatlas_team.abc-cloud: Creation complete after 0s [id=aWQ=:NjAyM2U1MjU3ZTVlNmQyMGE4NjFhNGRk-b3JnX2lk:NWZmMzM0M2Y4NmU0YWI2NTJhZWNlYmZi]
module.mongodb.mongodbatlas_project.abc-cloud: Creating...
module.mongodb.aws_kms_key.mongodb-key: Creating...
module.mongodb.aws_iam_policy.policy: Creating...
module.mongodb.aws_iam_policy.policy: Creation complete after 0s [id=arn:aws:iam::124976144193:policy/abc-cloud-jp-atlas-kms-access]
module.mongodb.aws_kms_key.mongodb-key: Creation complete after 1s [id=1f83ed89-f002-467e-b846-cab9331bef5c]
module.mongodb.aws_kms_alias.mongodb-key: Creating...
module.mongodb.aws_kms_alias.mongodb-key: Creation complete after 0s [id=alias/abc-cloud-jp-atlas-enc_at_rest]
module.mongodb.mongodbatlas_project.abc-cloud: Creation complete after 3s [id=6023e525c6e3450d810ac64e]
module.mongodb.mongodbatlas_maintenance_window.abc-cloud: Creating...
module.mongodb.mongodbatlas_cloud_provider_access.abc_cloud: Creating...
module.mongodb.mongodbatlas_project_ip_access_list.test: Creating...
module.mongodb.mongodbatlas_network_container.abc-cloud: Creating...
module.mongodb.mongodbatlas_database_user.atlasAdmin: Creating...
module.mongodb.mongodbatlas_database_user.abc-cloud["sample"]: Creating...
module.mongodb.mongodbatlas_cloud_provider_access.abc_cloud: Creation complete after 0s [id=aWQ=:NjAyM2U1Mjg2MDYzNWQ2NzUzZTEyZjc4-cHJvamVjdF9pZA==:NjAyM2U1MjVjNmUzNDUwZDgxMGFjNjRl-cHJvdmlkZXJfbmFtZQ==:QVdT]
module.mongodb.aws_iam_role.atlas_kms_access: Creating...
module.mongodb.mongodbatlas_network_container.abc-cloud: Creation complete after 0s [id=Y29udGFpbmVyX2lk:NjAyM2U1MjgyYjMwYWQ0MmQ0N2M2MDM5-cHJvamVjdF9pZA==:NjAyM2U1MjVjNmUzNDUwZDgxMGFjNjRl]
module.mongodb.mongodbatlas_network_peering.abc-cloud: Creating...
module.mongodb.mongodbatlas_database_user.abc-cloud["sample"]: Creation complete after 0s [id=YXV0aF9kYXRhYmFzZV9uYW1l:YWRtaW4=-cHJvamVjdF9pZA==:NjAyM2U1MjVjNmUzNDUwZDgxMGFjNjRl-dXNlcm5hbWU=:c2FtcGxlX3VzZXI=]
module.mongodb.mongodbatlas_database_user.atlasAdmin: Creation complete after 0s [id=YXV0aF9kYXRhYmFzZV9uYW1l:YWRtaW4=-cHJvamVjdF9pZA==:NjAyM2U1MjVjNmUzNDUwZDgxMGFjNjRl-dXNlcm5hbWU=:YXRsYXNBZG1pbg==]
module.mongodb.mongodbatlas_maintenance_window.abc-cloud: Creation complete after 0s [id=6023e525c6e3450d810ac64e]
module.mongodb.aws_iam_role.atlas_kms_access: Creation complete after 1s [id=abc-cloud-jp-atlas_kms_access]
module.mongodb.aws_iam_role_policy_attachment.this: Creating...
module.mongodb.aws_iam_role_policy_attachment.this: Creation complete after 0s [id=abc-cloud-jp-atlas_kms_access-20210210135241071100000001]
module.mongodb.time_sleep.wait_30_seconds: Creating...
module.mongodb.mongodbatlas_project_ip_access_list.test: Creation complete after 4s [id=ZW50cnk=:MTAuMC4wLjAvMTY=-cHJvamVjdF9pZA==:NjAyM2U1MjVjNmUzNDUwZDgxMGFjNjRl]
module.mongodb.mongodbatlas_network_peering.abc-cloud: Still creating... [10s elapsed]
module.mongodb.time_sleep.wait_30_seconds: Still creating... [10s elapsed]
module.mongodb.mongodbatlas_network_peering.abc-cloud: Still creating... [20s elapsed]
module.mongodb.time_sleep.wait_30_seconds: Still creating... [20s elapsed]
module.mongodb.mongodbatlas_network_peering.abc-cloud: Still creating... [30s elapsed]
module.mongodb.mongodbatlas_network_peering.abc-cloud: Creation complete after 31s [id=cGVlcl9pZA==:NjAyM2U1MjhjYTAwMmIwZGZkMjYxNDMw-cHJvamVjdF9pZA==:NjAyM2U1MjVjNmUzNDUwZDgxMGFjNjRl-cHJvdmlkZXJfbmFtZQ==:QVdT]
module.mongodb.aws_vpc_peering_connection_accepter.abc-cloud-accept: Creating...
module.mongodb.time_sleep.wait_30_seconds: Still creating... [30s elapsed]
module.mongodb.aws_vpc_peering_connection_accepter.abc-cloud-accept: Creation complete after 1s [id=pcx-080bafb2a703a2e26]
module.mongodb.time_sleep.wait_30_seconds: Still creating... [40s elapsed]
module.mongodb.time_sleep.wait_30_seconds: Still creating... [50s elapsed]
module.mongodb.time_sleep.wait_30_seconds: Still creating... [1m0s elapsed]
module.mongodb.time_sleep.wait_30_seconds: Still creating... [1m10s elapsed]
module.mongodb.time_sleep.wait_30_seconds: Still creating... [1m20s elapsed]
module.mongodb.time_sleep.wait_30_seconds: Creation complete after 1m30s [id=2021-02-10T13:54:11Z]
module.mongodb.mongodbatlas_encryption_at_rest.abc-cloud: Creating...

Error: error creating Encryption At Rest: PATCH https://cloud.mongodb.com/api/atlas/v1.0/groups/6023e525c6e3450d810ac64e/encryptionAtRest: 500 (request "UNEXPECTED_ERROR") Unexpected error.

I was previously using access_key_id / secret_access_key for encryption_at_rest. However, it appears the Mongo API recently deprecated this method and now rejects it; the new method appears to use role_id. I'm honestly not sure I'm setting this up right, but it fails at the same place every time. What is weird is that if I immediately run a second terraform apply, it works perfectly every time, which made me think it was some sort of race condition. I added lots of depends_on without luck, and finally, as you can see above, I added a sleep before it sets up the encryption, but it still fails.

One thing of note is when it does fail you can see the following oddity in the Mongo Atlas interface:

image

But if I run the apply again, this time you can see the same entry is set up correctly:

image

As you can see in the logs, I'm creating the IAM role as part of the Mongo Atlas setup, but it clearly finished more than 90 seconds before we try to set up the encryption. I wonder if there is still a timing issue, perhaps inside the encryption_at_rest piece, since it works perfectly the second time.

themantissa commented 3 years ago

@JohnPolansky first of all, let me apologize that you were not notified in the customer communication about this change. We looked, and under our notification criteria you were not included in the email messages we sent to impacted users over the last six months. However, as you note, we changed from IAM credentials to the more secure IAM roles. As part of this, one must use the new provider 0.8.2 and the cloud provider access resources, as explained here: https://registry.terraform.io/providers/mongodb/mongodbatlas/latest/docs/guides/0.8.0-upgrade-guide

The 2nd part of the cloud provider access setup currently cannot be done in the same apply as the first, so that's likely the issue here. The two steps can be automated with a flow between two applies, but we know it's not easy. This choice was the best when we examined the options and pros/cons against our timeline, but we are reviewing it to see whether we can rework it into a single apply without major limitations on longer-term support/use, or whether those limitations are worth being able to do it in a single apply.

JohnPolansky commented 3 years ago

@themantissa First off, thanks for the response and the detailed explanation on the notice. I always appreciate people being open about things, so thanks for that. Regarding the Upgrade Guide, I can now see on review that, like you said, it does require 2 applies to function, which is unfortunate. I'm really hoping you guys will be able to enhance this in the future, but at least I know I'm not crazy. I did have two asks/suggestions:

  1. Is there a Mongo ticket or something I can subscribe to, so I'm notified when/if the two-apply limitation is resolved? I'd like to track it so I can circle back to it later.
  2. I totally missed the "Guides" section that you shared with me above. I was poring over the individual pages such as https://registry.terraform.io/providers/mongodb/mongodbatlas/latest/docs/data-sources/cloud_provider_access and missed it, which is my fault for not checking the Guides. Could I suggest that someone put a link on the mongodbatlas_cloud_provider_access page pointing to the guide, so people clearly know this is a two-step process? I spent a fair bit of time struggling with this before emailing here, and I think a NOTE: box or something similar there might help others find it more easily.

Again thank you for the details on this

themantissa commented 3 years ago

@JohnPolansky thank you for the kind words and suggestions. In response: 1) let's keep this issue open and if we find a better way we'll link the PR here. That will allow you and anyone driving by to see the work clearly. 2) Really solid point, thank you! We can update the docs in releases so I'll include this kind of call to guides in the future versions. Very much appreciate knowing your experience so we can improve and others won't have to go through the same.
Thanks again!

cristianburca commented 3 years ago

Hi @JohnPolansky,

Did you manage to find a workaround?

thanks

JohnPolansky commented 3 years ago

Well, the workaround for now is that you must either split your Mongo Atlas Terraform into two separate files and run them one after another, or simply re-run terraform apply when it fails on the UNEXPECTED_ERROR. For now I'm just re-running it. I'm hoping a solution will be provided in the future, as the workaround is rather troublesome.

John

Nica-Alex commented 3 years ago

This seems to work

themantissa commented 3 years ago

@Nica-Alex correct, we added a workaround example. We are also working on a 2nd set of resources that will allow for a similar experience in a future version.

themantissa commented 3 years ago

https://github.com/mongodb/terraform-provider-mongodbatlas/pull/420 will provide a 2nd option, with cloud provider access split into two resources to better support all possible use cases. It will go into our next version, ETA April/early May, subject to the usual warnings in case additional testing surfaces issues.

themantissa commented 3 years ago

Version 0.9.0 with a single apply method for Cloud Provider Access has been released. Thank you to all who provided feedback.

JohnPolansky commented 3 years ago

So for those who may find this post: I did get the new solution working after some rough starts. The documentation page does have most of the info, but in my case I kept getting the dreaded error:

│ Error: error creating Encryption At Rest: PATCH https://cloud.mongodb.com/api/atlas/v1.0/groups/608d6c909a42b21b26969a97/encryptionAtRest: 400 (request "CLOUD_PROVIDER_ACCESS_ROLE_NOT_AUTHORIZED") The specified Cloud Provider Access role (608d6c931d08727006e2c618) has not been authorized.

In the end I found that the problem seems to be dependency resolution: I had to add a depends_on to mongodbatlas_encryption_at_rest before the process worked with a single apply and no errors. Here is a snippet of my setup; I hope it helps.

resource "mongodbatlas_encryption_at_rest" "default" {
  project_id = mongodbatlas_project.default.id

  aws_kms = {
    enabled                = true
    customer_master_key_id = aws_kms_key.mongodb-key.key_id
    region                 = var.mongodb.config.region
    role_id                = mongodbatlas_cloud_provider_access_setup.setup_only.role_id
  }

  depends_on = [
    mongodbatlas_cloud_provider_access_setup.setup_only,
    mongodbatlas_cloud_provider_access_authorization.auth_role,
  ]
}

resource "mongodbatlas_cloud_provider_access_setup" "setup_only" {
  project_id    = mongodbatlas_project.default.id
  provider_name = "AWS"
}

resource "mongodbatlas_cloud_provider_access_authorization" "auth_role" {
  project_id = mongodbatlas_cloud_provider_access_setup.setup_only.project_id
  role_id    = mongodbatlas_cloud_provider_access_setup.setup_only.role_id

  aws = {
    iam_assumed_role_arn = "arn:aws:iam::${var.aws.config.account_id}:role/${var.mongodb_project_name}-atlas_kms_access"
  }
}

Good luck

@themantissa maybe we can add the depends_on to the documentation page if you guys agree it's required? Thank you for the help and resolution.

themantissa commented 3 years ago

@JohnPolansky thank you for the suggestion! @leofigy thoughts on this? I don't remember needing it in testing but seems a very logical add to avoid the aforementioned issue.

leofigy commented 3 years ago

Hi @themantissa, @JohnPolansky, it makes sense to me to add the depends_on; sometimes Terraform is not able to infer the whole dependency tree.

I think that without the depends_on, Terraform only sees the relationship between mongodbatlas_encryption_at_rest and mongodbatlas_cloud_provider_access_setup, and does not wait for the authorization that mongodbatlas_encryption_at_rest needs.
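
One way to sketch what this means in configuration: Terraform orders resources by the references between them, so if role_id is read from the authorization resource instead of the setup resource, the ordering should follow without an explicit depends_on. The block below is a minimal sketch, not a tested configuration. It reuses the resource names from the snippet earlier in this thread, and it assumes that role_id, which is set as an argument on mongodbatlas_cloud_provider_access_authorization, can be referenced from other resources the way Terraform arguments normally can.

```hcl
# Sketch only: resource names are reused from the snippet earlier in this thread.
resource "mongodbatlas_encryption_at_rest" "default" {
  project_id = mongodbatlas_project.default.id

  aws_kms = {
    enabled                = true
    customer_master_key_id = aws_kms_key.mongodb-key.key_id
    region                 = var.mongodb.config.region

    # Assumption: reading role_id from the authorization resource (rather than
    # from the setup resource) gives Terraform an implicit dependency on the
    # authorization, so it is created before encryption at rest is configured.
    role_id = mongodbatlas_cloud_provider_access_authorization.auth_role.role_id
  }
}
```

If that reference does not behave as expected in the provider version in use, the explicit depends_on from the earlier snippet expresses the same ordering.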