hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: Race condition in aws_acm_certificate with aws_acmpca_certificate_authority #31132

Open brsolomon-deloitte opened 1 year ago

brsolomon-deloitte commented 1 year ago

Terraform Core Version

1.4.6

AWS Provider Version

4.65.0

Affected Resource(s)

aws_acm_certificate
aws_acmpca_certificate_authority

Expected Behavior

aws_acm_certificate should wait until the Private Certificate Authority (PCA) is in ACTIVE state before requesting certificate issuance.

Actual Behavior

aws_acm_certificate appears to request a certificate while the CA is still in PENDING_CERTIFICATE state. The certificate then ends up with Status=Failed and the reason:

"The current state of the private CA is not valid or does not permit the requested operation."

Relevant Error/Panic Output Snippet

=================================
>>CreateCertificateAuthority API[1] - The CA “arn:aws:acm-pca:us-east-2:REDACTED:certificate-authority/REDACTED” was created at 2023-05-02T14:15:51. 
Event ID - 3d6e69e3-6e2f-498a-a490-6da66494f795
=================================

>>ImportCertificateAuthorityCertificate API[2] - The CA was installed at 2023-05-02T14:15:56. The CA was in PENDING_CERTIFICATE state till this time. 
Event ID - df045a04-4c2f-4fe0-837d-3807b0330ddc
=================================

>>RequestCertificate API[3] - The ACM certificate arn:aws:acm:us-east-2:REDACTED:certificate/94779886-109d-43d6-898b-7d66ebd16ee4 was requested at 2023-05-02T14:15:53. 
Event ID - 691895f7-34b4-4681-9964-654b61e958f3
=================================

The event timings show that the ACM certificate was requested (14:15:53) before the Private CA certificate was installed (14:15:56), so the Private CA was still in PENDING_CERTIFICATE when the "RequestCertificate" call was made. Hence the failure code “InvalidStateException” was generated with reason “PCA_INVALID_STATE”. Note that a CA in PENDING_CERTIFICATE cannot be used to issue certificates.

Terraform Configuration Files

$ cat acm-pca.tf acm-certificate.tf
# Creates an AWS Certificate Manager Private root Certificate Authority (ACM PCA),
# plus a root certificate and fully signs the CA CSR

# Useful links on the topic of ACM PCA
#
# - https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/acmpca_certificate_authority
# - https://github.com/cert-manager/aws-privateca-issuer
# - https://cert-manager.io/docs/usage/ingress/#supported-annotations
# - https://stackoverflow.com/questions/72258984/how-can-i-get-cert-manager-to-use-aws-acm-pca-to-provision-certificates-for-http/72638105#72638105

resource "aws_acmpca_certificate_authority" "REDACTED_ca" {
  # Create root CA
  type = "ROOT"

  certificate_authority_configuration {
    key_algorithm     = "RSA_4096"
    signing_algorithm = "SHA512WITHRSA"

    subject {
      common_name  = var.private_ca_subject_common_name
      organization = var.private_ca_subject_organization
      country      = var.private_ca_subject_country
      locality     = var.private_ca_subject_locality
      state        = var.private_ca_subject_state
    }
  }

  permanent_deletion_time_in_days = 7
}

resource "aws_acmpca_certificate" "REDACTED_root_ca" {
  # Issue self-signed root CA certificate
  certificate_authority_arn   = aws_acmpca_certificate_authority.REDACTED_ca.arn
  certificate_signing_request = aws_acmpca_certificate_authority.REDACTED_ca.certificate_signing_request
  signing_algorithm           = "SHA512WITHRSA"

  # https://docs.aws.amazon.com/privateca/latest/userguide/UsingTemplates.html
  template_arn = "arn:${data.aws_partition.current.partition}:acm-pca:::template/RootCACertificate/V1"

  validity {
    type  = "YEARS"
    value = 3
  }
}

resource "aws_acmpca_certificate_authority_certificate" "REDACTED_ca_cert" {
  # Associates the certificate with an AWS Certificate Manager Private Certificate Authority
  # An ACM PCA Certificate Authority is unable to issue certificates until it has a certificate associated with it.
  # A root level ACM PCA Certificate Authority is able to self-sign its own root certificate.
  certificate_authority_arn = aws_acmpca_certificate_authority.REDACTED_ca.arn

  certificate       = aws_acmpca_certificate.REDACTED_root_ca.certificate
  certificate_chain = aws_acmpca_certificate.REDACTED_root_ca.certificate_chain
}
# Have the AWS Private Certificate Authority (PCA) issue a private certificate in Amazon Certificate Manager (ACM)
# https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/acm_certificate

resource "aws_acm_certificate" "REDACTED_elb_cert" {
  # Upon creation, the certificate will show Status=Issued, Status=Success for each domain, but In Use=No
  certificate_authority_arn = aws_acmpca_certificate_authority.REDACTED_ca.arn
  domain_name               = var.domain_name

  subject_alternative_names = [
    "*.${var.domain_name}"
  ]
}

# Allow the ACM service to automatically renew certificates issued by a PCA
# https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/acmpca_permission

resource "aws_acmpca_permission" "REDACTED_ca_cert_permission" {
  certificate_authority_arn = aws_acmpca_certificate_authority.REDACTED_ca.arn
  actions                   = ["IssueCertificate", "GetCertificate", "ListPermissions"]
  principal                 = "acm.amazonaws.com"
}
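
The configuration above references data.aws_partition.current (in template_arn) without declaring it in the two files shown; it is presumably declared elsewhere in the module. A minimal sketch of the missing declaration, assuming nothing else in the module already defines it:

```hcl
# Assumption: not part of the two files shown above.
# Looks up the partition of the current credentials (for example "aws" or "aws-us-gov")
# so that the template ARN is built for the right partition.
data "aws_partition" "current" {}
```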

Steps to Reproduce

Apply the Terraform configuration above (acm-pca.tf and acm-certificate.tf).

Terraform will run to completion "successfully" despite the certificate showing the following:

[Screenshot, 2023-05-03 9:34:57 AM; image not available]

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

No

brsolomon-deloitte commented 1 year ago

I have verified definitively that this is the issue.

If I delete and re-create the cert, now that the PCA has been in a ready state for some time, the cert is issued with no problem.

terraform apply -destroy -auto-approve -target='aws_acm_certificate.REDACTED_elb_cert'
terraform apply -auto-approve -target='aws_acm_certificate.REDACTED_elb_cert'

brsolomon-deloitte commented 1 year ago

It does not seem that aws_acm_certificate_validation will resolve this, since this is a private certificate to which the standard DNS/HTTP validation does not apply.

justinretzolk commented 1 year ago

Hey @brsolomon-deloitte 👋 Thank you for taking the time to raise this! If you add a dependency between aws_acm_certificate.REDACTED_elb_cert and aws_acmpca_certificate_authority_certificate.REDACTED_ca_cert (either implicitly or explicitly via depends_on), does that improve the behavior for you?

Right now, the aws_acm_certificate is only dependent on the aws_acmpca_certificate_authority itself, so it's likely getting created before the other resources are.
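
For illustration, a minimal sketch of the implicit form, reusing the resource names from the configuration in the report. It assumes that reading the CA ARN from the aws_acmpca_certificate_authority_certificate resource (rather than from the CA itself) is enough to make Terraform create the ACM certificate only after the CA certificate has been installed. It only orders the resources; it does not by itself guarantee that the CA has reached ACTIVE by then.

```hcl
resource "aws_acm_certificate" "REDACTED_elb_cert" {
  # Implicit dependency: the ARN value is the same, but it is read from the
  # resource that installs the CA certificate, so that resource is created first.
  certificate_authority_arn = aws_acmpca_certificate_authority_certificate.REDACTED_ca_cert.certificate_authority_arn
  domain_name               = var.domain_name

  subject_alternative_names = [
    "*.${var.domain_name}"
  ]

  # Explicit alternative: keep the original certificate_authority_arn reference and add
  # depends_on = [aws_acmpca_certificate_authority_certificate.REDACTED_ca_cert]
}
```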

gowthamakanthan commented 10 months ago

@justinretzolk I've tried depends_on as suggested. However, the ACM certificate request for the private certificate still fails with the following error.

resource "aws_acmpca_certificate_authority" "private_ca_authority" {
  permanent_deletion_time_in_days = 7
  type                            = "ROOT"
  certificate_authority_configuration {
    key_algorithm     = local.key_algorithm
    signing_algorithm = local.signing_algorithm
    subject {
      common_name  = local.common_name
      organization = local.org
    }
  }
  tags = local.tags
}

resource "aws_acmpca_permission" "private_ca_permission" {
  certificate_authority_arn = aws_acmpca_certificate_authority.private_ca_authority.arn
  actions                   = ["IssueCertificate", "GetCertificate", "ListPermissions"]
  principal                 = "acm.amazonaws.com"
}

data "aws_partition" "current" {}

resource "aws_acmpca_certificate" "private_ca_cert" {
  certificate_authority_arn   = aws_acmpca_certificate_authority.private_ca_authority.arn
  certificate_signing_request = aws_acmpca_certificate_authority.private_ca_authority.certificate_signing_request
  signing_algorithm           = local.signing_algorithm

  template_arn = "arn:${data.aws_partition.current.partition}:acm-pca:::template/RootCACertificate/V1"

  validity {
    type  = "YEARS"
    value = local.private_cert_validity
  }
}

resource "aws_acmpca_certificate_authority_certificate" "pca_authority_cert" {
  certificate_authority_arn = aws_acmpca_certificate_authority.private_ca_authority.arn
  certificate               = aws_acmpca_certificate.private_ca_cert.certificate
  certificate_chain         = aws_acmpca_certificate.private_ca_cert.certificate_chain
}

resource "aws_acm_certificate" "request_cert" {
  domain_name               = local.common_name
  certificate_authority_arn = aws_acmpca_certificate_authority.private_ca_authority.arn
  key_algorithm             = local.key_algorithm

  tags = local.tags

  lifecycle {
    create_before_destroy = true
  }

  depends_on = [aws_acmpca_certificate_authority.private_ca_authority]
}

Error:

The current state of the private CA is not valid or does not permit the requested operation.

resource "aws_acm_certificate" "request_cert" {
    arn                       = "arn:aws:acm:us-east-1:<>:certificate/ac0c10e9-a84d-4172-b0f9-cf165402cd1e"
    certificate_authority_arn = "arn:aws:acm-pca:us-east-1:<>:certificate-authority/9b42320f-1fb8-45be-98cc-f4d784b95108"
    domain_name               = "domain"
    domain_validation_options = []
    id                        = "arn:aws:acm:us-east-1:<>:certificate/ac0c10e9-a84d-4172-b0f9-cf165402cd1e"
    key_algorithm             = "RSA_2048"
    pending_renewal           = false
    renewal_eligibility       = "INELIGIBLE"
    renewal_summary           = []
    status                    = "FAILED"
    subject_alternative_names = [
        "domain",
    ]

When I manually install the CA certificate for the PCA, ACM works as expected. Any suggestions?

gowthamakanthan commented 9 months ago

This works when I add a time_sleep wait before request_cert, which now depends on the wait instead of on the CA.

resource "aws_acmpca_certificate_authority_certificate" "pca_authority_cert" {
  certificate_authority_arn = aws_acmpca_certificate_authority.private_ca_authority.arn
  certificate               = aws_acmpca_certificate.private_ca_cert.certificate
  certificate_chain         = aws_acmpca_certificate.private_ca_cert.certificate_chain
}

resource "time_sleep" "wait_30_seconds" {
  create_duration = "30s"
  depends_on      = [aws_acmpca_certificate_authority_certificate.pca_authority_cert]
}

resource "aws_acm_certificate" "request_cert" {
  domain_name               = local.common_name
  certificate_authority_arn = aws_acmpca_certificate_authority.private_ca_authority.arn
  key_algorithm             = local.key_algorithm

  tags = local.tags

  lifecycle {
    create_before_destroy = true
  }

  depends_on = [time_sleep.wait_30_seconds]
}
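
The time_sleep resource comes from the hashicorp/time provider, not from the AWS provider. Terraform can usually infer providers in the hashicorp namespace, but declaring it makes the extra dependency visible. A minimal sketch, assuming the module has no other required_providers block to merge this into, and leaving the version constraint out on purpose:

```hcl
terraform {
  required_providers {
    # Needed for the time_sleep resource used in the workaround.
    time = {
      source = "hashicorp/time"
    }
  }
}
```

The 30-second delay is an empirical value from this thread, not a documented guarantee that the CA has become ACTIVE.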