hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

aws_synthetics_canary fails with `The role defined for the function cannot be assumed by Lambda` #21394

Closed: pbzdyl closed this issue 2 years ago

pbzdyl commented 3 years ago

Terraform CLI and Terraform AWS Provider Version

Terraform v0.14.7
Terraform AWS Provider 3.63.0

Affected Resource(s)

aws_synthetics_canary

Terraform Configuration Files

data "aws_caller_identity" "current" {}
data "aws_region" "current" {}

resource "aws_s3_bucket" "canary-runs-artifacts" {
  bucket = "taxamo-${var.environment-name}-${data.aws_region.current.name}-canary-artifacts"
  tags   = var.tags

  lifecycle_rule {
    enabled = true
    id      = "expireOldCanaryArtifacts"
    expiration {
      days = 60
    }
  }
}

data "aws_iam_policy_document" "canary" {
  statement {
    actions   = ["s3:GetBucketLocation"]
    resources = [aws_s3_bucket.canary-runs-artifacts.arn]
  }
  statement {
    actions   = ["s3:PutObject"]
    resources = ["${aws_s3_bucket.canary-runs-artifacts.arn}/canary/${data.aws_region.current.name}/*"]
  }
  statement {
    actions = [
      "logs:CreateLogStream",
      "logs:PutLogEvents",
      "logs:CreateLogGroup",
    ]
    resources = [
      join(":", [
        "arn:aws:logs",
        data.aws_region.current.name,
        data.aws_caller_identity.current.account_id,
        "log-group:/aws/lambda/cwsyn-*",
      ])
    ]
  }
  statement {
    actions   = ["s3:ListAllMyBuckets"]
    resources = ["*"]
  }
  statement {
    actions   = ["cloudwatch:PutMetricData"]
    resources = ["*"]
    condition {
      test     = "StringEquals"
      values   = ["CloudWatchSynthetics"]
      variable = "cloudwatch:namespace"
    }
  }
}

resource "aws_iam_policy" "canary" {
  name   = "${var.environment-name}-${data.aws_region.current.name}-canary"
  policy = data.aws_iam_policy_document.canary.json
}

data "aws_iam_policy_document" "canary-task-assume-role" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "canary" {
  name               = "${var.environment-name}-${data.aws_region.current.name}-canary"
  assume_role_policy = data.aws_iam_policy_document.canary-task-assume-role.json
}

resource "aws_iam_role_policy_attachment" "canary" {
  policy_arn = aws_iam_policy.canary.arn
  role       = aws_iam_role.canary.name
}

resource "aws_synthetics_canary" "canary" {
  name                     = "mycanary"
  artifact_s3_location     = "s3://${aws_s3_bucket.canary-runs-artifacts.bucket}"
  execution_role_arn       = aws_iam_role.canary.arn
  handler                  = "myCanary.handler"
  zip_file                 = "..."
  runtime_version          = "syn-nodejs-puppeteer-3.2"
  success_retention_period = 31
  failure_retention_period = 31
  start_canary             = true
  schedule {
    expression = "rate(1 minute)"
  }
}

Expected Behavior

aws_synthetics_canary created successfully.

Actual Behavior

terraform apply fails with:

Error: error waiting for Synthetics Canary (mycanary) create: unexpected state 'ERROR', wanted target 'READY'. last error: CREATE_FAILED: The role defined for the function cannot be assumed by Lambda. (Service: AWSLambda; Status Code: 400; Error Code: InvalidParameterValueException; Request ID: 608c687b-6322-407e-97c3-4763456dfb82; Proxy: null)

I then have to manually delete the Synthetics canary in AWS (it is left in the ERROR state) and run terraform plan && terraform apply again to get the canary created.

References

I found https://github.com/hashicorp/terraform-provider-aws/issues/18101, but it does not seem to help with my issue.

ewbankkit commented 3 years ago

@pbzdyl Thanks for raising this issue. How long is Terraform waiting for during resource creation before you get this error?

pbzdyl commented 3 years ago

Hi @ewbankkit!

aws_synthetics_canary.mycanary: Still creating... [2m0s elapsed]

Error: error waiting for Synthetics Canary (mycanary) create: unexpected state 'ERROR', wanted target 'READY'. last error: CREATE_FAILED: The role defined for the function cannot be assumed by Lambda. (Service: AWSLambda; Status Code: 400; Error Code: InvalidParameterValueException; Request ID: 608c687b-6322-407e-97c3-4763456dfb82; Proxy: null)

I noticed that AWS creates the canary but it goes into the ERROR state. Shouldn't Terraform in this case delete the faulty canary and try to create it again?

ewbankkit commented 2 years ago

We see similar errors in CI:

=== RUN   TestAccSyntheticsCanary_basic
=== PAUSE TestAccSyntheticsCanary_basic
=== CONT  TestAccSyntheticsCanary_basic
canary_test.go:27: Step 1/3 error: Error running apply: exit status 1
Error: error waiting for Synthetics Canary (tf-acc-test-2abpyxia) create: unexpected state 'ERROR', wanted target 'READY'. last error: CREATE_FAILED: The role defined for the function cannot be assumed by Lambda. (Service: AWSLambda; Status Code: 400; Error Code: InvalidParameterValueException; Request ID: baeae60d-b055-41da-bcee-0b78a8dc51af; Proxy: null)
on terraform_plugin_test.tf line 92, in resource "aws_synthetics_canary" "test":
92: resource "aws_synthetics_canary" "test" {
--- FAIL: TestAccSyntheticsCanary_basic (140.12s)

We need to investigate upping the 2m wait time for IAM role propagation.

github-actions[bot] commented 2 years ago

This functionality has been released in v3.65.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!
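As a rough illustration of the upgrade the release note refers to, a version constraint like the one below would make Terraform select a provider release that contains the change. This is only a sketch: the `>= 3.65.0` bound is taken from the release note above, and the surrounding `terraform` block is assumed, not copied from the reporter's configuration.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Assumed lower bound: v3.65.0 is the release named in the comment above.
      version = ">= 3.65.0"
    }
  }
}
```

After changing the constraint, running `terraform init -upgrade` is the usual way to fetch the newer provider version.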

rholland-i360 commented 2 years ago

I noticed it waits for 4 minutes while creating the canary, but I still got this bug. My workaround is to make it wait for 5 minutes:

resource "time_sleep" "wait_5_minutes" {
  # Requires the hashicorp/time provider.
  depends_on      = [aws_iam_role.main]
  create_duration = "5m"
}

and in the canary: depends_on = [time_sleep.wait_5_minutes]
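Putting the two pieces of this workaround together might look roughly like the sketch below. It reuses the resource names from the original report (`aws_iam_role.canary`, `aws_iam_role_policy_attachment.canary`, `aws_s3_bucket.canary-runs-artifacts`) in place of `aws_iam_role.main`, which is an assumption about how the two configurations map onto each other. Waiting on the policy attachment as well as the role is also an addition that is not in the comment above. The 5 minute delay is the commenter's value, not a documented bound on IAM propagation.

```hcl
terraform {
  required_providers {
    # time_sleep is provided by hashicorp/time, not by the AWS provider.
    time = {
      source = "hashicorp/time"
    }
  }
}

# Pause after the role and its policy attachment exist, to give IAM time to propagate.
resource "time_sleep" "wait_5_minutes" {
  depends_on = [
    aws_iam_role.canary,
    aws_iam_role_policy_attachment.canary, # assumption: also wait for the attachment
  ]
  create_duration = "5m"
}

resource "aws_synthetics_canary" "canary" {
  name                 = "mycanary"
  artifact_s3_location = "s3://${aws_s3_bucket.canary-runs-artifacts.bucket}"
  execution_role_arn   = aws_iam_role.canary.arn
  handler              = "myCanary.handler"
  zip_file             = "..." # elided in the original report
  runtime_version      = "syn-nodejs-puppeteer-3.2"

  schedule {
    expression = "rate(1 minute)"
  }

  # Create the canary only after the sleep has finished.
  depends_on = [time_sleep.wait_5_minutes]
}
```

The sleep only runs when `time_sleep` is first created, so it delays the initial apply and does nothing on later runs. It is a fixed delay, not a retry, so it can still fail if propagation takes longer.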

rajaie-sg commented 2 years ago

Still running into this on the latest AWS provider hashicorp/aws v3.71.0

github-actions[bot] commented 2 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.