hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0
9.87k stars 9.21k forks source link

[Bug]: Race condition in aws_acm_certificate with aws_acmpca_certificate_authority #31132

Open brsolomon-deloitte opened 1 year ago

brsolomon-deloitte commented 1 year ago

Terraform Core Version

1.4.6.

AWS Provider Version

4.65.0

Affected Resource(s)

Expected Behavior

aws_acm_certificate should wait until the Private Certificate Authority (PCA) is in ACTIVE state before requesting certificate issuance.

Actual Behavior

aws_acm_certificate appears to request a certificate when the CA is in PENDING_CERTIFICATE state. This will cause the certificate to have Status=Failed with reason

"The current state of the private CA is not valid or does not permit the requested operation."

Relevant Error/Panic Output Snippet

=================================
>>CreateCertificateAuthority API[1] - The CA “arn:aws:acm-pca:us-east-2:REDACTED:certificate-authority/REDACTED” was created at 2023-05-02T14:15:51. 
Event ID - 3d6e69e3-6e2f-498a-a490-6da66494f795
=================================

>>ImportCertificateAuthorityCertificate API[2] - The CA was installed at 2023-05-02T14:15:56. The CA was in PENDING_CERTIFICATE state till this time. 
Event ID- df045a04-4c2f-4fe0-837d-3807b0330ddc
=================================

>>RequestCertificate API[3] - The ACM certificate arn:aws:acm:us-east-2:REDACTED:certificate/94779886-109d-43d6-898b-7d66ebd16ee4 was requested at 2023-05-02T14:15:53. 
Event ID - 691895f7-34b4-4681-9964-654b61e958f3
=================================

Consequently, by looking at the event timings of the API calls involved, we are able to observe that the ACM certificate was requested at a timeframe before Private CA was installed and accordingly the Private CA was still in PENDING_CERTIFICATE when the "RequestCertificate" call was made. Hence the failure code “InvalidStateException” was generated with reason “PCA_INVALID_STATE”. Please note that when the CA is in PENDING_CERTIFICATE, it cannot be used to Issue certificates.

Terraform Configuration Files

$ cat acm-pca.tf acm-certificate.tf
# Creates an AWS Certificate Manager Private root Certificate Authority (ACM PCA),
# plus a root certificate and fully signs the CA CSR

# Useful links on the topic of ACM PCA
#
# - https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/acmpca_certificate_authority
# - https://github.com/cert-manager/aws-privateca-issuer
# - https://cert-manager.io/docs/usage/ingress/#supported-annotations
# - https://stackoverflow.com/questions/72258984/how-can-i-get-cert-manager-to-use-aws-acm-pca-to-provision-certificates-for-http/72638105#72638105

resource "aws_acmpca_certificate_authority" "REDACTED_ca" {
  # Create root CA
  type = "ROOT"

  certificate_authority_configuration {
    key_algorithm     = "RSA_4096"
    signing_algorithm = "SHA512WITHRSA"

    subject {
      common_name  = var.private_ca_subject_common_name
      organization = var.private_ca_subject_organization
      country      = var.private_ca_subject_country
      locality     = var.private_ca_subject_locality
      state        = var.private_ca_subject_state
    }
  }

  permanent_deletion_time_in_days = 7
}

resource "aws_acmpca_certificate" "REDACTED_root_ca" {
  # Issue self-signed root CA certificate
  certificate_authority_arn   = aws_acmpca_certificate_authority.REDACTED_ca.arn
  certificate_signing_request = aws_acmpca_certificate_authority.REDACTED_ca.certificate_signing_request
  signing_algorithm           = "SHA512WITHRSA"

  # https://docs.aws.amazon.com/privateca/latest/userguide/UsingTemplates.html
  template_arn = "arn:${data.aws_partition.current.partition}:acm-pca:::template/RootCACertificate/V1"

  validity {
    type  = "YEARS"
    value = 3
  }
}

resource "aws_acmpca_certificate_authority_certificate" "REDACTED_ca_cert" {
  # Associates the certificate with an AWS Certificate Manager Private Certificate Authority
  # An ACM PCA Certificate Authority is unable to issue certificates until it has a certificate associated with it.
  # A root level ACM PCA Certificate Authority is able to self-sign its own root certificate.
  certificate_authority_arn = aws_acmpca_certificate_authority.REDACTED_ca.arn

  certificate       = aws_acmpca_certificate.REDACTED_root_ca.certificate
  certificate_chain = aws_acmpca_certificate.REDACTED_root_ca.certificate_chain
}
# Have the AWS Private Certificate Authority (PCA) issue a private certificate in Amazon Certificate Manager (ACM)
# https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/acm_certificate

resource "aws_acm_certificate" "REDACTED_elb_cert" {
  # Upon creation, the certificate will show Status=Issued, Status=Success for each domain, but In Use=No
  certificate_authority_arn = aws_acmpca_certificate_authority.REDACTED_ca.arn
  domain_name               = var.domain_name

  subject_alternative_names = [
    "*.${var.domain_name}"
  ]
}

# Allow the ACM service to automatically renew certificates issued by a PCA
# https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/acmpca_permission

resource "aws_acmpca_permission" "REDACTED_ca_cert_permission" {
  certificate_authority_arn = aws_acmpca_certificate_authority.REDACTED_ca.arn
  actions                   = ["IssueCertificate", "GetCertificate", "ListPermissions"]
  principal                 = "acm.amazonaws.com"
}

Steps to Reproduce

Apply the Terraform modules above (acm-pca.tf and acm-certificate.tf).

Terraform will run to completion "successfully" despite the certificate showing the following:

Screenshot 2023-05-03 at 9 34 57 AM

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

No

github-actions[bot] commented 1 year ago

Community Note

Voting for Prioritization

Volunteering to Work on This Issue

brsolomon-deloitte commented 1 year ago

I have verified definitively that this is the issue.

If I delete and re-create the cert, now that the PCA has been in a ready state for some time, the cert is issued with no problem.

terraform apply -destroy -auto-approve -target='aws_acm_certificate.REDACTED_elb_cert'
terraform apply -auto-approve -target='aws_acm_certificate.REDACTED_elb_cert'
brsolomon-deloitte commented 1 year ago

It does not seem like acm_certificate_validation will resolve this since it is a private certificate where the standard dns/http validation does not apply.

justinretzolk commented 1 year ago

Hey @brsolomon-deloitte 👋 Thank you for taking the time to raise this! If you add a dependency between aws_acm_certificate.REDACTED_elb_cert and aws_acmpca_certificate_authority_certificate.REDACTED_ca_cert (either implicitly or explicitly via depends_on), does that improve the behavior for you?

Right now, the aws_acm_certificate is only dependent on the aws_acmpca_certificate_authority itself, so it's likely getting created before the other resources are.

gowthamakanthan commented 11 months ago

@justinretzolk I've tried the with depends_on as suggested. However the ACM request certificate for the private certificate is still failing with following error.

resource "aws_acmpca_certificate_authority" "private_ca_authority" {
  permanent_deletion_time_in_days = 7
  type                            = "ROOT"
  certificate_authority_configuration {
    key_algorithm     = local.key_algorithm
    signing_algorithm = local.signing_algorithm
    subject {
      common_name  = local.common_name
      organization = local.org
    }
  }
  tags = local.tags
}

resource "aws_acmpca_permission" "private_ca_permission" {
  certificate_authority_arn = aws_acmpca_certificate_authority.private_ca_authority.arn
  actions                   = ["IssueCertificate", "GetCertificate", "ListPermissions"]
  principal                 = "acm.amazonaws.com"
}

data "aws_partition" "current" {}

resource "aws_acmpca_certificate" "private_ca_cert" {
  certificate_authority_arn   = aws_acmpca_certificate_authority.private_ca_authority.arn
  certificate_signing_request = aws_acmpca_certificate_authority.private_ca_authority.certificate_signing_request
  signing_algorithm           = local.signing_algorithm

  template_arn = "arn:${data.aws_partition.current.partition}:acm-pca:::template/RootCACertificate/V1"

  validity {
    type  = "YEARS"
    value = local.private_cert_validity
  }
}

resource "aws_acmpca_certificate_authority_certificate" "pca_authority_cert" {
  certificate_authority_arn = aws_acmpca_certificate_authority.private_ca_authority.arn
  certificate               = aws_acmpca_certificate.private_ca_cert.certificate
  certificate_chain         = aws_acmpca_certificate.private_ca_cert.certificate_chain
}

resource "aws_acm_certificate" "request_cert" {
  domain_name               = local.common_name
  certificate_authority_arn = aws_acmpca_certificate_authority.private_ca_authority.arn
  key_algorithm             = local.key_algorithm

  tags = local.tags

  lifecycle {
    create_before_destroy = true
  }

  depends_on = [aws_acmpca_certificate_authority.private_ca_authority]
}

Error:

The current state of the private CA is not valid or does not permit the requested operation.

resource "aws_acm_certificate" "request_cert" {
    arn                       = "arn:aws:acm:us-east-1:<>:certificate/ac0c10e9-a84d-4172-b0f9-cf165402cd1e"
    certificate_authority_arn = "arn:aws:acm-pca:us-east-1:<>:certificate-authority/9b42320f-1fb8-45be-98cc-f4d784b95108"
    domain_name               = "domain"
    domain_validation_options = []
    id                        = "arn:aws:acm:us-east-1:<>:certificate/ac0c10e9-a84d-4172-b0f9-cf165402cd1e"
    key_algorithm             = "RSA_2048"
    pending_renewal           = false
    renewal_eligibility       = "INELIGIBLE"
    renewal_summary           = []
    status                    = "FAILED"
    subject_alternative_names = [
        "domain",
    ]

When i manually Install the CA certificate for the PCA, ACM is working as expected. Any suggestions?

gowthamakanthan commented 10 months ago

This is working fine when adding the time wait for the request_cert.

resource "aws_acmpca_certificate_authority_certificate" "pca_authority_cert" {
  certificate_authority_arn = aws_acmpca_certificate_authority.private_ca_authority.arn
  certificate               = aws_acmpca_certificate.private_ca_cert.certificate
  certificate_chain         = aws_acmpca_certificate.private_ca_cert.certificate_chain
}

resource "time_sleep" "wait_30_seconds" {
  create_duration = "30s"
  depends_on      = [aws_acmpca_certificate_authority_certificate.pca_authority_cert]
}

resource "aws_acm_certificate" "request_cert" {
  domain_name               = local.common_name
  certificate_authority_arn = aws_acmpca_certificate_authority.private_ca_authority.arn
  key_algorithm             = local.key_algorithm

  tags = local.tags

  lifecycle {
    create_before_destroy = true
  }

  depends_on = [time_sleep.wait_30_seconds]
}