newrelic / terraform-provider-newrelic

Terraform provider for New Relic
https://registry.terraform.io/providers/newrelic/newrelic/latest/docs
Mozilla Public License 2.0

expected entity tag to have been created but was not found #2633

Closed maathor closed 5 months ago

maathor commented 5 months ago

Hi there,

Thank you for opening an issue. In order to better assist you with your issue, we kindly ask to follow the template format and instructions. Please note that we try to keep the Terraform issue tracker reserved for bug reports and feature requests only. General usage questions submitted as issues will be closed and redirected to New Relic's Explorers Hub https://discuss.newrelic.com/c/build-on-new-relic/developer-toolkit.

Please include the following with your bug report

:warning: Important: Failure to include the following, such as omitting the Terraform configuration in question, may delay resolving the issue.

Terraform Version

Run terraform -v to show the version. If you are not running the latest version of Terraform, please upgrade because your issue may have already been fixed.

terraform {
  required_version = ">= 1.5.0"
  required_providers {
    newrelic = {
      source  = "newrelic/newrelic"
      version = "~> 3.34"
    }
  }
}

Affected Resource(s)

Please list the resources as a list, for example:

If this issue appears to affect multiple resources, it may be an issue with Terraform's core, so please mention this.

Terraform Configuration

Please include your provider configuration (sensitive details redacted) as well as the configuration of the resources and/or data sources related to the bug report.

data "newrelic_entity" "apps" {
  for_each = local.app_config

  name   = each.value.app_name
  domain = "APM"
  type   = "APPLICATION"
}
locals {
  app_config = {
    ads_api = {
      app_name          = "analytics-ads-api"
      latency_threshold = "2.0"
    }
  }
  tags = {
    "SubDomain" = title(local.subDomain)
    "Tribe"     = title(local.tribe)
  }
  tribe     = "foo"
  subDomain = "bar"
}

resource "newrelic_service_level" "latency" {
  for_each = local.app_config

  guid = data.newrelic_entity.apps[each.key].guid

  name        = format("%s - %s - Latency", local.tribe, each.value.app_name)
  description = "Proportion of requests that are served faster than a threshold."

  events {
    account_id = var.new_relic_account_id
    valid_events {
      from  = "Transaction"
      where = format("entityGuid = '%s' AND (transactionType = 'Web')", data.newrelic_entity.apps[each.key].guid)
    }
    bad_events {
      from  = "Transaction"
      where = format("entityGuid = '%s' AND (transactionType = 'Web') AND duration < %s", data.newrelic_entity.apps[each.key].guid, try(each.value.latency_threshold, "2.0"))
    }
  }
  objective {
    target = 95.00
    time_window {
      rolling {
        count = 7
        unit  = "DAY"
      }
    }
  }
}

resource "newrelic_entity_tags" "latency_tags" {
  for_each = local.app_config
  guid     = newrelic_service_level.latency[each.key].guid

  dynamic "tag" {
    for_each = local.tags
    content {
      key    = tag.key
      values = [tag.value]
    }
  }
}
resource "newrelic_nrql_alert_condition" "latencies_errors_fastburn_rate" {
  for_each = var.app_config

  account_id = var.new_relic_account_id
  policy_id  = var.new_relic_alert_policy_id
  type       = "static"
  name       = format("%s: Latency (Fast-burn rate)", each.value.app_name)

  description = <<-EOT
  Alerts you when 2% of your SLO error budget based on latencies is spent in 1 hour.
  EOT

  enabled                      = true
  violation_time_limit_seconds = 3600

  nrql {
    query = "FROM Metric SELECT 100 - clamp_max((sum(newrelic.sli.valid) - sum(newrelic.sli.bad)) / sum(newrelic.sli.valid) * 100, 100) AS 'Error rate' WHERE entity.guid = '${newrelic_service_level.latency[each.key].guid}'"
  }

  critical {
    operator              = "above_or_equals"
    threshold             = 3.3600000000000003
    threshold_duration    = 60
    threshold_occurrences = "at_least_once"
  }
  fill_option        = "none"
  aggregation_window = 3600
  aggregation_method = "event_flow"
  aggregation_delay  = 120
  slide_by           = 60
}

Actual Behavior

What actually happened?

Unable to create tags on dedicated resources

│ Error: expected entity tag SubDomain to have been created but was not found
│ 
│   with module.nr_service_level_apm.newrelic_entity_tags.latency_tags["ads_api"],
│   on .terraform/modules/nr_service_level_apm/service_level_latencies.tf line 31, in resource "newrelic_entity_tags" "latency_tags":
│   31: resource "newrelic_entity_tags" "latency_tags" {
│ 
╵
╷
│ Error: expected entity tag SubDomain to have been created but was not found
│ 
│   with module.nr_service_level_apm.newrelic_entity_tags.success_tags["ads_api"],
│   on .terraform/modules/nr_service_level_apm/service_level_sucess.tf line 31, in resource "newrelic_entity_tags" "success_tags":
│   31: resource "newrelic_entity_tags" "success_tags" {

Expected Behavior

What should have happened? Tags should be created on the service levels so that workloads that search for entities by tag can retrieve them.

Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

  1. terraform apply

Debug Output

Please provide a link to a GitHub Gist containing the complete debug output: https://www.terraform.io/docs/internals/debugging.html. Please do NOT paste the debug output in the issue; just paste a link to the Gist.

Panic Output

If Terraform produced a panic, please provide a link to a GitHub Gist containing the output of the crash.log.

Important Factoids

Is there anything atypical about your accounts that we should know? For example: Running in EC2 Classic? Custom version of OpenStack? Tight ACLs?

References

Are there any other GitHub issues (open or closed) or Pull Requests that should be linked here? For example:

pranav-new-relic commented 5 months ago

Hi @maathor, thank you for reporting this. This is indeed a strange (and I believe, an intermittent/one time) occurrence - are you seeing this happening with every terraform apply, or does this happen once in a while?

(Context: in the closed issue you linked, the discussion went along these lines. We identified that this was happening because of excessive API traffic, and the suggested workaround was to reduce Terraform parallelism.)

maathor commented 5 months ago

This is indeed happening with every terraform apply. My gut feeling: newrelic_entity_tags doesn't like count or for_each. This is quite annoying because I can't order my service levels by tribe.

pranav-new-relic commented 5 months ago

Thanks for clarifying, @maathor.

This is kind of perplexing, as I've tried reproducing it several times by applying tags to multiple entities (GUIDs) using the newrelic_entity_tags resource with count and for_each. For instance, I've tried modifying your configuration slightly:

locals {
  app_config = {
    ads_api = {
      app_name          = "tf_test_072y1fp6ul"
      latency_threshold = "2.0"
    }
    key_two = {
      app_name          = "tf_test_0798lwwy8x"
      latency_threshold = "2.0"
    }
    key_three = {
      app_name          = "tf_test_08rec97sya"
      latency_threshold = "2.0"
    }
    key_four = {
      app_name          = "tf_test_0dv3ypaht7"
      latency_threshold = "2.0"
    }
    key_five = {
      app_name          = "tf_test_0meabgh3wl"
      latency_threshold = "2.0"
    }
    key_six = {
      app_name          = "tf_test_0t2y3w7ddb"
      latency_threshold = "2.0"
    }
    key_seven = {
      app_name          = "tf_test_0wkvhlmfls"
      latency_threshold = "2.0"
    }

  }
  tags = {
    "SubDomain" = title(local.subDomain)
    "Tribe"     = title(local.tribe)
  }
  tribe     = "foo"
  subDomain = "bar"
}

data "newrelic_entity" "apps" {
  for_each = local.app_config

  name   = each.value.app_name
  domain = "APM"
  type   = "APPLICATION"
}

resource "newrelic_service_level" "latency" {
  for_each = local.app_config

  guid = data.newrelic_entity.apps[each.key].guid

  name        = format("%s - %s - Latency", local.tribe, each.value.app_name)
  description = "Proportion of requests that are served faster than a threshold."

  events {
    account_id = XXXXXXX
    valid_events {
      from  = "Transaction"
      where = format("entityGuid = '%s' AND (transactionType = 'Web')", data.newrelic_entity.apps[each.key].guid)
    }
    bad_events {
      from  = "Transaction"
      where = format("entityGuid = '%s' AND (transactionType = 'Web') AND duration < %s", data.newrelic_entity.apps[each.key].guid, try(each.value.latency_threshold, "2.0"))
    }
  }
  objective {
    target = 0
    time_window {
      rolling {
        count = 7
        unit  = "DAY"
      }
    }
  }
}

resource "newrelic_entity_tags" "latency_tags" {
  for_each = local.app_config
  guid     = newrelic_service_level.latency[each.key].guid

  dynamic "tag" {
    for_each = local.tags
    content {
      key    = tag.key
      values = [tag.value]
    }
  }
}

This works perfectly fine with terraform apply (I've run it multiple times). Similarly, I've tried the following example, this time adding entity tags to dashboard entities:

locals {
  apps = [
    "tf-test-1elvw",
    "tf-test-1iiyt",
    "tf-test-3d8hh",
    "tf-test-3hwtr",
    "tf-test-4hzmx",
    "tf-test-70eb4",
    "tf-test-8jg82",
    "tf-test-8l2kr",
    "tf-test-9btif",
    "tf-test-9t4qm",
    "tf-test-akqnc",
    "tf-test-iuccq",
    "tf-test-l19e0",
    "tf-test-mahpp",
    "tf-test-nvf12",
    "tf-test-peydd",
    "tf-test-qpcha",
  ]

  custom_tags = {
    "tag-key-1"  = "tag-value-1"
    "tag-key-2"  = "tag-value-2"
    "tag-key-3"  = "tag-value-3"
    "tag-key-4"  = "tag-value-4"
    "tag-key-5"  = "tag-value-5"
    "tag-key-6"  = "tag-value-6"
    "tag-key-7"  = "tag-value-7"
    "tag-key-8"  = "tag-value-8"
    "tag-key-9"  = "tag-value-9"
    "tag-key-10" = "tag-value-10"
    "SubDomain"  = "SubDomain"
  }
}

data "newrelic_entity" "foo" {
  count  = length(local.apps)
  name   = local.apps[count.index]
  type   = "DASHBOARD"
  domain = "VIZ"
  tag {
    key   = "isDashboardPage"
    value = "false"
  }
}

resource "newrelic_entity_tags" "foo" {
  count = length(local.apps)
  guid  = data.newrelic_entity.foo[count.index].guid

  dynamic "tag" {
    for_each = local.custom_tags
    content {
      key    = tag.key
      values = [tag.value]
    }
  }
}

and this works perfectly fine with terraform apply as well.

Apologies for the long comment, but could you please try running terraform apply with reduced parallelism (as described in this comment)? We're not sure of the path forward, as this doesn't appear to be a bug: the newrelic_entity_tags resource does seem to work well with count and for_each. The cause could instead be API overhead or a network issue, so let us know whether reducing parallelism with terraform apply helps.

pranav-new-relic commented 5 months ago

@maathor also would it be possible for you to share debug logs (by running terraform apply with TF_LOG=DEBUG or TF_LOG=TRACE) so we can see what's going on on the API front when a terraform apply is performed? Thanks.

18jwong commented 5 months ago

Hi! I think I was having the same issue.

Turns out newrelic_service_level has a specific output sli_guid, which is what we actually want for this use case.

I assume that because guid is the input for the entity we want to monitor, guid also ends up outputting the GUID of the entity being monitored. In my case (and I assume OP's case) this was a single entity with multiple newrelic_service_levels, so adding multiple sets of tags was continuously overwriting the previous set of tags on this single entity.
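
As an illustration of the change described above, here is a minimal sketch of the tags resource pointed at sli_guid instead of guid. It reuses the resource and local names from the configuration posted earlier in this thread (newrelic_service_level.latency, local.app_config, local.tags); it has not been run against a real account, so treat it as a starting point rather than a verified fix.

```hcl
# Sketch only: resource and local names are taken from the configuration
# posted earlier in this thread.
resource "newrelic_entity_tags" "latency_tags" {
  for_each = local.app_config

  # sli_guid identifies the service level itself. Using guid here would
  # target the monitored APM application instead.
  guid = newrelic_service_level.latency[each.key].sli_guid

  dynamic "tag" {
    for_each = local.tags
    content {
      key    = tag.key
      values = [tag.value]
    }
  }
}
```

With this change, each service level carries its own set of tags, so several service levels that monitor the same application should no longer compete for the tags on that one application entity.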

maathor commented 5 months ago

This seems to fix my issue, indeed. Thank you so much.

pranav-new-relic commented 5 months ago

Thank you for the assistance with this, @18jwong - it's always good to see all of us within the community helping out each other :)

Good catch with the guid vs sli_guid of the service level. This is probably why a lot of folks experience this issue only with service levels (and rarely with other kinds of entities) :) If this helped fix @maathor's issue, the scenario you've described may well be what he was seeing. However, I don't see how multiple service levels monitoring the same entity could have been created in the example he posted above: there's only one app in local.app_config, so only one service level should have been created, and the entity tags would then have been linked only to the entity being monitored (and not to the service level, since guid was used and not sli_guid), which still shouldn't have caused an error. I'm assuming only an excerpt of the configuration was attached; the overwriting you describe shouldn't be possible unless one app is being used to create multiple service levels. So it's good to see the issue fixed, though not everything is entirely clear :)

Anyway, thank you, again, for sharing your inputs here, @18jwong :)
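
One way to check the guid vs sli_guid distinction discussed in this thread is to output both attributes side by side and compare them. The following is a minimal sketch, assuming the newrelic_service_level.latency resource from the configuration posted earlier; the output name and the map keys are hypothetical. The error output earlier in the thread names both latency_tags and success_tags, which suggests (but does not confirm) that a second service level was attached to the same application.

```hcl
# Hypothetical diagnostic output: the output name and map keys are
# illustrative and not part of the original configuration.
output "latency_service_level_guids" {
  value = {
    for key, sl in newrelic_service_level.latency : key => {
      monitored_entity_guid = sl.guid     # the APM application passed in as input
      service_level_guid    = sl.sli_guid # the service level indicator itself
    }
  }
}
```

If the two values differ for a given key, tags applied through guid land on the application rather than on the service level.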