Terraform Apply is not carrying out implicit Refresh for some cases

ashwgupt commented 1 year ago

Terraform Version

Terraform v1.3.5
on darwin_amd64

Terraform Configuration Files

terraform {
  backend "s3" {}
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "4.30.0"
    }
    archive = {
      source  = "hashicorp/archive"
      version = "2.0.0"
    }
    null = {
      source  = "hashicorp/null"
      version = "3.0.0"
    }
    template = {
      source  = "hashicorp/template"
      version = "2.2.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "3.0.1"
    }
    local = {
      source  = "hashicorp/local"
      version = "2.1.0"
    }
  }
  required_version = ">= 1.3.5"
}

resource "aws_cloudwatch_event_rule" "security_events" {
  name          = "security-events"
  # event_pattern must be a JSON string, so the object is wrapped in jsonencode()
  event_pattern = jsonencode({
    "source" : [
      "aws.iam"
    ],
    "detail-type" : [
      "AWS API Call via CloudTrail"
    ],
    "detail" : {
      "eventSource" : [
        "iam.amazonaws.com"
      ],
      "eventName" : [
        "CreateAccessKey"
      ]
    }
  })
}

resource "aws_cloudwatch_event_target" "forward_to_receiver" {
  rule     = "security-events"
  arn      = "arn:aws:events:${region}:${account_id}:event-bus/default"
  role_arn = aws_iam_role.security_event_sender_role.arn
}

resource "aws_iam_role" "security_event_sender_role" {
  name               = "security-event-sender-role"
  description        = "Allows to send cloud trail events to the account"
  assume_role_policy = data.aws_iam_policy_document.security_event_sender_assume_role_policy_document.json
}

data "aws_iam_policy_document" "security_event_sender_assume_role_policy_document" {
  statement {
    sid    = "AllowEventsToBeForwarded"
    effect = "Allow"
    actions = [
      "sts:AssumeRole"
    ]
    principals {
      identifiers = [
        "events.amazonaws.com"
      ]
      type = "Service"
    }
  }
}

resource "aws_iam_role_policy_attachment" "security_event_sender_policy_attachment" {
  policy_arn = aws_iam_policy.security_event_sender_policy.arn
  role       = aws_iam_role.security_event_sender_role.name
}

resource "aws_iam_policy" "security_event_sender_policy" {
  name   = "security-event-sender-policy"
  policy = data.aws_iam_policy_document.security_event_sender_policy_document.json
}

data "aws_iam_policy_document" "security_event_sender_policy_document" {
  statement {
    sid    = "AllowSendEventsResponder"
    effect = "Allow"
    actions = [
      "events:PutEvents"
    ]
    resources = [
      "arn:aws:events:${region}:${account_id}:event-bus/default"
    ]
  }
}

Debug Output

N/A

Expected Behavior

As part of our workflow, we regularly check whether any Terraform-managed resources have drifted because of configuration changes made outside of Terraform.

For that, we run a home-grown script that carries out the following steps:

1. Download the existing state file from the remote backend
2. Run terraform refresh
3. Download the updated state file from the remote backend
4. Compare the two files and report any differences
5. Restore the state file downloaded in step 1 to the remote backend
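
The comparison step can be sketched in Python as follows. This is a minimal illustration, not the script used in this report: it assumes the version 4 state layout (a top-level "resources" list whose entries carry "module", "mode", "type", "name" and "instances"), and the function names index_instances and diff_states are hypothetical.

```python
from typing import Any


def index_instances(state: dict) -> dict:
    """Map each resource-instance address to its recorded instance data.

    Assumes the v4 state layout; field names are taken from that format.
    """
    indexed: dict = {}
    for resource in state.get("resources", []):
        prefix = resource.get("module", "")
        mode = "data." if resource.get("mode") == "data" else ""
        base = f"{mode}{resource['type']}.{resource['name']}"
        address = f"{prefix}.{base}" if prefix else base
        for position, instance in enumerate(resource.get("instances", [])):
            # Instances created with count/for_each carry an index_key;
            # fall back to the list position otherwise.
            key: Any = instance.get("index_key", position)
            indexed[f"{address}[{key!r}]"] = instance
    return indexed


def diff_states(before: dict, after: dict) -> dict:
    """Return {address: [changed top-level instance fields]} for two snapshots."""
    old, new = index_instances(before), index_instances(after)
    report: dict = {}
    for address in sorted(old.keys() | new.keys()):
        if address not in old:
            report[address] = ["<added>"]
        elif address not in new:
            report[address] = ["<removed>"]
        else:
            changed = sorted(
                field
                for field in old[address].keys() | new[address].keys()
                if old[address].get(field) != new[address].get(field)
            )
            if changed:
                report[address] = changed
    return report
```

Both arguments would be the parsed JSON of the two downloaded state files. An empty result means no instance-level difference was found.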

Previously we used to run terraform refresh after each deployment to our resources. However, after a recent change and the deprecation notice on the command, we stopped running it after deployments. This was in line with the Terraform documentation, which states that the refresh is run implicitly during the terraform apply command itself.

On that basis, we expected no drift to be reported in cases where no change to resources or state is made outside of Terraform.

Actual Behavior

However, in a recent case where we added new resources, the Terraform state file written by the deployment lacked dependency information for some of those resources, so our drift detection reported drift because of the mismatch between the state files.

For reference, the diff between the state file after the terraform apply command and the one after the terraform apply -refresh-only command is as below:

{'identified_drift_0': {'module': 'module.security_events', 'type': 'aws_cloudwatch_event_target', 'name': 'us_east_1_forward_to_receiver', 'resource_index': '96', 'difference': {'instances': {'1': {'dependencies': {'$insert': [[2, 'module.security_events.data.aws_iam_policy_document.security_event_sender_assume_role_policy_document']]}}}}}}
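
The difference reported here is confined to an instance's dependencies list. If a team decides that such entries are bookkeeping rather than drift (an assumption, not something this report establishes), the comparison could drop them before diffing. A hedged Python sketch, again assuming the version 4 state layout and using the hypothetical helper names without_dependencies and states_match_ignoring_dependencies:

```python
import copy


def without_dependencies(state: dict) -> dict:
    """Return a deep copy of a parsed state with per-instance dependencies removed."""
    cleaned = copy.deepcopy(state)
    for resource in cleaned.get("resources", []):
        for instance in resource.get("instances", []):
            instance.pop("dependencies", None)
    return cleaned


def states_match_ignoring_dependencies(before: dict, after: dict) -> bool:
    """Compare only the resources of two snapshots, ignoring dependency lists.

    Top-level fields such as the serial are skipped on purpose, since they
    change whenever the state is rewritten.
    """
    return (
        without_dependencies(before).get("resources")
        == without_dependencies(after).get("resources")
    )
```

The inputs are left unmodified, so the original snapshots can still be restored or compared in full afterwards.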

Steps to Reproduce

1. Run terraform init
2. Run terraform apply
3. Download the state file
4. Run terraform apply -refresh-only
5. Download the updated state file
6. Compare the two state files for differences

Additional Context

No response

References

No response

jbardin commented 1 year ago

Hi @ashwgupt,

The refresh command has been deprecated, but only because you can now accomplish the same thing using the normal plan+apply workflow. The utility of refreshing the state via -refresh-only has not changed, and may be what you need in some cases.

Technically, apply does not refresh any resources, nor did it ever. The apply command only applies what was recorded in the plan, and the creation of the plan is where existing resources are refreshed by default. Terraform does not refresh resources after running apply, because the provider should return the most up-to-date state at that moment. If the provider is not returning a consistent state for some resources, that is an issue with that specific provider.

Given your described workflow, it sounds like the provider is returning resources which may be altered slightly with the first refresh after applying. If that is the case, then using -refresh-only would be a workaround to prevent seeing any unexpected differences in a later plan. Note that running plan -refresh-only already gives you a plan with the comparison between the two states in question, so manual diff'ing of the state may not be necessary.
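
As an illustration of the plan -refresh-only suggestion, a drift check could be scripted roughly like this in Python. It is a sketch under assumptions: the helper names are hypothetical, and although -detailed-exitcode is documented for terraform plan (0 for no changes, 1 for an error, 2 for changes present), how it behaves for refresh-only plans in a given Terraform version is worth verifying before relying on it.

```python
import subprocess

# Exit codes documented for `terraform plan -detailed-exitcode`.
EXIT_MEANINGS = {0: "no changes detected", 1: "error", 2: "changes detected"}


def refresh_only_plan_command(workdir: str) -> list:
    """Build the CLI invocation for a non-interactive refresh-only plan."""
    return [
        "terraform",
        f"-chdir={workdir}",  # global option, must precede the subcommand
        "plan",
        "-refresh-only",
        "-detailed-exitcode",
        "-input=false",
        "-no-color",
    ]


def interpret_exit_code(code: int) -> str:
    """Translate the plan exit code into a short status string."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def check_drift(workdir: str) -> str:
    """Run the refresh-only plan in workdir and report its status.

    Requires an initialised working directory and the terraform binary on PATH.
    """
    result = subprocess.run(
        refresh_only_plan_command(workdir), capture_output=True, text=True
    )
    return interpret_exit_code(result.returncode)
```

Because a plan does not write to the state, this approach avoids the download, compare and restore cycle around the remote backend.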

We use GitHub issues for tracking bugs and enhancements, rather than for questions. While we can sometimes help with certain simple problems here, it's better to use the community forum where there are more people ready to help.

Thanks!

hashicorp / terraform