hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: reading AWS CloudTrail Trail (XXXXXX-cloudtrail): not found after creation #33176

Open killmepete opened 1 year ago

killmepete commented 1 year ago

Terraform Core Version

1.5.6

AWS Provider Version

aws v5.0.1

Affected Resource(s)

Cloudtrail

Expected Behavior

The CloudTrail module I have created (based on the example) should be able to deploy without issue.

Actual Behavior

I have a simple module that deploys CloudTrail (using the example provided). I'm running into an intermittent error: after running a few deploys (of resources that have nothing to do with CloudTrail), I encounter the error message:

"reading AWS CloudTrail Trail (dev-test-cloudtrail): not found after creation"

I can confirm, via both the console and the CLI, that the trail exists and is logging as expected. I haven't been able to find any documentation about this error message, and the only way I've managed to get around it is by intercepting my build, running terraform state rm on the CloudTrail resource, and redeploying.

I'm inclined to believe this is a bug in the provider: I've written and deployed similar CloudTrail modules before and never encountered this problem. If I'm mistaken and it's an easy fix, that would make me happy!

Relevant Error/Panic Output Snippet

Error: reading AWS CloudTrail Trail (dev-test-cloudtrail): not found after creation
│ 
│   with module.cloudtrail.aws_cloudtrail.cloudtrail,
│   on ../modules/cloudtrail/main.tf line 1, in resource "aws_cloudtrail" "cloudtrail":
│    1: resource "aws_cloudtrail" "cloudtrail" {

Terraform Configuration Files

resource "aws_cloudtrail" "example" {
  name                          = "example"
  s3_bucket_name                = aws_s3_bucket.example.id
  s3_key_prefix                 = "prefix"
  include_global_service_events = false
}

resource "aws_s3_bucket" "example" {
  bucket        = "tf-test-trail"
  force_destroy = true
}

data "aws_iam_policy_document" "example" {
  statement {
    sid    = "AWSCloudTrailAclCheck"
    effect = "Allow"

    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }

    actions   = ["s3:GetBucketAcl"]
    resources = [aws_s3_bucket.example.arn]
    condition {
      test     = "StringEquals"
      variable = "aws:SourceArn"
      values   = ["arn:${data.aws_partition.current.partition}:cloudtrail:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:trail/example"]
    }
  }

  statement {
    sid    = "AWSCloudTrailWrite"
    effect = "Allow"

    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }

    actions   = ["s3:PutObject"]
    resources = ["${aws_s3_bucket.example.arn}/prefix/AWSLogs/${data.aws_caller_identity.current.account_id}/*"]

    condition {
      test     = "StringEquals"
      variable = "s3:x-amz-acl"
      values   = ["bucket-owner-full-control"]
    }
    condition {
      test     = "StringEquals"
      variable = "aws:SourceArn"
      values   = ["arn:${data.aws_partition.current.partition}:cloudtrail:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:trail/example"]
    }
  }
}
resource "aws_s3_bucket_policy" "example" {
  bucket = aws_s3_bucket.example.id
  policy = data.aws_iam_policy_document.example.json
}

data "aws_caller_identity" "current" {}

data "aws_partition" "current" {}

data "aws_region" "current" {}

Steps to Reproduce

Unsure. The problem happens intermittently and can occur long after the trail has been deployed.

Debug Output

ERROR

│ Error: reading AWS CloudTrail Trail (test-dev-cloudtrail): not found after creation
│ 
│   with module.cloudtrail.aws_cloudtrail.cloudtrail,
│   on ../modules/cloudtrail/main.tf line 1, in resource "aws_cloudtrail" "cloudtrail":
│    1: resource "aws_cloudtrail" "cloudtrail" {
│ 
╵
ERRO[0020] Terraform invocation failed in /root/app/production 
ERRO[0020] 1 error occurred:
        * [/root/app/production] exit status 1

Exited with code exit status 1

PLANNED CHANGES

  # module.guardduty.aws_cloudwatch_event_target.guardduty_event_target will be created
  + resource "aws_cloudwatch_event_target" "guardduty_event_target" {
      + arn            = "arn:aws:lambda:eu-west-2:xxxxxxxxxxxxxxx:function:datadog"
      + event_bus_name = "default"
      + id             = (known after apply)
      + rule           = "guardduty_event_rule"
      + target_id      = "guardduty_event_rule_target"
    }

  # module.guardduty.aws_guardduty_publishing_destination.gd_publishing_destination will be created
  + resource "aws_guardduty_publishing_destination" "gd_publishing_destination" {
      + destination_arn  = "arn:aws:s3:::xxxxxxxxxxxx-guardduty-findings"
      + destination_type = "S3"
      + detector_id      = "xxxxxxxxx"
      + id               = (known after apply)
      + kms_key_arn      = "arn:aws:kms:eu-west-2:xxxxxxxxxx:key/xxxxxxxxxxxxxxxxxxx"
    }

  # module.guardduty.aws_kms_key.gd_encryption_key will be updated in-place
  ~ resource "aws_kms_key" "gd_encryption_key" {
        id                                 = "xxxxxxxxxx"
      ~ policy                             = jsonencode(
          ~ {
              - Id        = "key-default-1"
              ~ Statement = [
                  ~ {
                      ~ Action    = "kms:*" -> "kms:GenerateDataKey"
                      ~ Principal = {
                          - AWS     = "arn:aws:iam::xxxxxxxxx:root"
                          + Service = "guardduty.amazonaws.com"
                        }
                      ~ Resource  = "*" -> "arn:aws:kms:eu-west-2:xxxxxxxxxxxx:key/*"
                      ~ Sid       = "Enable IAM User Permissions" -> "Allow GuardDuty to encrypt findings"
                        # (1 unchanged attribute hidden)
                    },
                ]
                # (1 unchanged attribute hidden)
            }
        )
        tags                               = {}
        # (11 unchanged attributes hidden)
    }

Plan: 2 to add, 1 to change, 0 to destroy.

Panic Output

No response

Important Factoids

Resource names, IDs and so on have been changed for the purposes of the bug report.

References

The lines responsible for the logging message...

https://github.com/hashicorp/terraform-provider-aws/blob/8aebcb61aa37f33f47606097cce23e5aa1783121/internal/service/cloudtrail/cloudtrail.go#L366-L385

Would you like to implement a fix?

Happy to, if it's a confirmed bug and I can get some pointers!

killmepete commented 1 year ago

Upgraded to the latest version of the provider (5.14.0) and still seeing the same problem.

killmepete commented 1 year ago

Did further testing on this, I have the same issue if I'm importing a trail that I've manually provisioned.

johnjelinek commented 1 year ago

I'm seeing this as well when trying to import an existing trail:

$ terraform import module.security_monitoring.aws_cloudtrail.security_monitoring_cloudtrail 'arn:aws:cloudtrail:us-west-2:<redacted>:trail/security_monitoring_cloudtrail'

module.security_monitoring.aws_cloudtrail.security_monitoring_cloudtrail: Import prepared!
  Prepared aws_cloudtrail for import

Error: Cannot import non-existent remote object

$ terraform version
Terraform v0.12.31
+ provider.archive v2.4.0
+ provider.aws v4.67.0

$ aws cloudtrail list-trails | jq '.Trails[] | select(.Name == "security_monitoring_cloudtrail")'
{
  "TrailARN": "arn:aws:cloudtrail:us-west-2:<redacted>:trail/security_monitoring_cloudtrail",
  "Name": "security_monitoring_cloudtrail",
  "HomeRegion": "us-west-2"
}

EDIT: I think the problem here is that my provider is set to region eu-central-1. Ya, that seemed to be it. All seems fixed now.
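
The region mismatch described above can be avoided by pointing the provider that manages the trail at the trail's home region. A minimal sketch, assuming the trail lives in us-west-2 as in the list-trails output; the alias name `trail_home` and the bucket name are hypothetical placeholders, not the commenter's actual configuration:

```hcl
# Default provider stays in the region used by the rest of the workspace.
provider "aws" {
  region = "eu-central-1"
}

# Aliased provider pinned to the trail's HomeRegion
# (as reported by `aws cloudtrail list-trails`).
provider "aws" {
  alias  = "trail_home" # hypothetical alias name
  region = "us-west-2"
}

resource "aws_cloudtrail" "security_monitoring_cloudtrail" {
  # Read and manage the trail through the provider in its home region.
  provider = aws.trail_home

  name           = "security_monitoring_cloudtrail"
  s3_bucket_name = "example-trail-bucket" # placeholder; use the trail's real bucket
}
```

If the resource lives inside a module, as it does here, the aliased provider would presumably be handed to the module through its `providers` argument rather than set on the resource directly.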

DominicBortmes commented 1 year ago

Did further testing on this, I have the same issue if I'm importing a trail that I've manually provisioned.

I also faced the error message "reading AWS CloudTrail Trail (XXXXXX-cloudtrail): not found after creation" when trying to import an existing trail (Terraform version 1.5.5 and AWS provider 5.10).

After some fiddling, I figured out that the role used by the Terraform provider didn't have any permissions to perform CRUD operations on CloudTrail resources (such as the aws_cloudtrail in question). After adding adequate permissions to the role, the error seems to be gone (I haven't actually executed the import yet, since we use a two-step approach with the tfmigrate CLI, but the error disappeared).

Maybe the bug is that the missing permissions aren't detected and/or the error message is misleading? I tried to find the corresponding CloudTrail event for the Terraform-initiated API call but couldn't find it.

@killmepete: maybe you have the same permission issue? This wouldn't change the fact that the CLI error message seems misleading or buggy, but it could unblock your development.

killmepete commented 1 year ago

@DominicBortmes Good suggestion! Will give that a go ASAP and give feedback.

killmepete commented 1 year ago

@DominicBortmes Giving the IAM principal 'cloudtrail:*' seemed to do the trick, so it must be a permissions issue. Unfortunately, I couldn't see in CloudTrail which API call was responsible.

borys86 commented 12 months ago

I can confirm the same issue: the CloudTrail trail is created, but Terraform errors that the resource was not found. Terraform v1.5.7, AWS provider 5.16.1.

The user in question has the administrator role.

borys86 commented 12 months ago

Our setup is an organization with multiple accounts. We want to create the CloudTrail trail in a separate account, not the main/root account. I suspect that if the aws_cloudtrail resource's is_organization_trail property is set to true, then running the code against the dedicated CloudTrail account actually creates the trail in the root/main account. If is_organization_trail is set to false, the trail is created in the local account against which Terraform is run.

How to verify: if you have a multi-account setup, run the same code but change only is_organization_trail and compare the results. Go to the UI and check which ARN is shown; for me, the account ID differs depending on this flag's value. I suspect this causes the issue, because depending on the flag the GET call should use a different account ID.

borys86 commented 12 months ago

[image] from https://docs.aws.amazon.com/awscloudtrail/latest/userguide/creating-trail-organization.html
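
One way to run the comparison suggested above is to toggle the flag through a variable and print the account ID embedded in the trail ARN next to the account Terraform is authenticated against. This is only a sketch: the variable name, the resource label, the trail name, and the bucket name are hypothetical, the bucket and its policy are assumed to exist already, and the outputs appear only if the apply succeeds (otherwise, compare against the ARN shown in the console):

```hcl
# Flip this between runs; everything else stays the same.
variable "is_organization_trail" {
  type    = bool
  default = false
}

data "aws_caller_identity" "current" {}

resource "aws_cloudtrail" "org_test" {
  name                  = "org-test-trail"           # hypothetical name
  s3_bucket_name        = "example-org-trail-bucket" # hypothetical, pre-existing bucket
  is_organization_trail = var.is_organization_trail
}

# Account that Terraform's credentials resolve to.
output "caller_account_id" {
  value = data.aws_caller_identity.current.account_id
}

# Account ID embedded in the trail ARN.
# ARN layout: arn:partition:service:region:account-id:resource
output "trail_arn_account_id" {
  value = split(":", aws_cloudtrail.org_test.arn)[4]
}
```

If the two outputs differ when the flag is true and match when it is false, that would be consistent with the suspicion that the trail is owned by a different account than the one Terraform reads from.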

luismy commented 11 months ago

I don't know if this helps, @killmepete, but I ran into this issue and we ended up using the following permissions in the IAM role used by Terraform to be able to provision and decommission AWS CloudTrail trails:

    "cloudtrail:CreateTrail",
    "cloudtrail:DeleteTrail",
    "cloudtrail:UpdateTrail",
    "cloudtrail:StartLogging",
    "cloudtrail:StopLogging",
    "cloudtrail:DescribeTrails",
    "cloudtrail:GetTrailStatus",
    "cloudtrail:GetEventSelectors",
    "cloudtrail:LookupEvents",
    "cloudtrail:AddTags",
    "cloudtrail:ListTags",
    "cloudtrail:RemoveTags",

For us, the culprit behind "AWS CloudTrail Trail (XXXXXX-cloudtrail): not found after creation" was the missing cloudtrail:DescribeTrails permission. Once we added that one, we started to get better error messages that told us which other ones to add (i.e. GetTrailStatus, GetEventSelectors, ...)
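
The permission list above could also be managed in Terraform itself. A sketch, assuming the provider runs as an IAM role; the role name `terraform-deployer`, the policy name, and the `"*"` resource scope are illustrative assumptions and do not come from the comment:

```hcl
data "aws_iam_policy_document" "cloudtrail_management" {
  statement {
    sid    = "ManageCloudTrail"
    effect = "Allow"

    # Actions taken from the list in the comment above.
    actions = [
      "cloudtrail:CreateTrail",
      "cloudtrail:DeleteTrail",
      "cloudtrail:UpdateTrail",
      "cloudtrail:StartLogging",
      "cloudtrail:StopLogging",
      "cloudtrail:DescribeTrails",
      "cloudtrail:GetTrailStatus",
      "cloudtrail:GetEventSelectors",
      "cloudtrail:LookupEvents",
      "cloudtrail:AddTags",
      "cloudtrail:ListTags",
      "cloudtrail:RemoveTags",
    ]

    # Assumption: not scoped to specific trails; narrow this where possible.
    resources = ["*"]
  }
}

resource "aws_iam_role_policy" "cloudtrail_management" {
  name   = "cloudtrail-management" # hypothetical policy name
  role   = "terraform-deployer"    # hypothetical name of the role Terraform assumes
  policy = data.aws_iam_policy_document.cloudtrail_management.json
}
```

Note that the role attached here must be changed by a principal that already has IAM permissions, so this would typically live in a separate bootstrap configuration rather than in the workspace that is failing.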

seanturner026 commented 7 months ago

I too have the terraform import issue. My workspace had multiple providers in different regions with assume_role {} blocks, so I couldn't use the admin role.

As such, I created a blank workspace and imported the trail there. From there, I manually copied the resource object out of that state and into my existing workspace's state.