[Bug]: panic: runtime error: invalid memory address or nil pointer dereference

j0rzsh commented 3 months ago

Is there an existing issue for this?

[X] I have searched the existing issues

Provider Version

v1.17.4

Terraform Version

v1.9.2

Terraform Edition

Terraform Open Source (OSS)

Current Behavior

Stack trace from the terraform-provider-mongodbatlas_v1.17.4 plugin:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xd15c152]

goroutine 67 [running]:
github.com/mongodb/terraform-provider-mongodbatlas/internal/service/alertconfiguration.NewTFThresholdConfigModel(0xc000b96ba0, {0x0?, 0x0?, 0xd3696f6?})
        github.com/mongodb/terraform-provider-mongodbatlas/internal/service/alertconfiguration/model_alert_configuration.go:239 +0x92
github.com/mongodb/terraform-provider-mongodbatlas/internal/service/alertconfiguration.NewTFAlertConfigurationModel(_, _)
        github.com/mongodb/terraform-provider-mongodbatlas/internal/service/alertconfiguration/model_alert_configuration.go:107 +0x27b
github.com/mongodb/terraform-provider-mongodbatlas/internal/service/alertconfiguration.(*alertConfigurationRS).Read(0xc000a3b3c0?, {0xdda3298, 0xc000929c50}, {{{{0xddaaae0, 0xc000b6e2d0}, {0xdb13360, 0xc000b65b00}}, {0xddae2a8, 0xc0008d00f0}}, 0xc000b6a008, ...}, ...)
        github.com/mongodb/terraform-provider-mongodbatlas/internal/service/alertconfiguration/resource_alert_configuration.go:440 +0x3c5
github.com/hashicorp/terraform-plugin-framework/internal/fwserver.(*Server).ReadResource(0xc0004641e0, {0xdda3298, 0xc000929c50}, 0xc000929cb0, 0xc000b7f620)
        github.com/hashicorp/terraform-plugin-framework@v1.10.0/internal/fwserver/server_readresource.go:117 +0x84e
github.com/hashicorp/terraform-plugin-framework/internal/proto6server.(*Server).ReadResource(0xc0004641e0, {0xdda3298?, 0xc000929b60?}, 0xc0006f4d40)
        github.com/hashicorp/terraform-plugin-framework@v1.10.0/internal/proto6server/server_readresource.go:55 +0x38e
github.com/hashicorp/terraform-plugin-mux/tf6muxserver.(*muxServer).ReadResource(0xc00070c230, {0xdda3298?, 0xc000929890?}, 0xc0006f4d40)
        github.com/hashicorp/terraform-plugin-mux@v0.16.0/tf6muxserver/mux_server_ReadResource.go:35 +0x193
github.com/hashicorp/terraform-plugin-go/tfprotov6/tf6server.(*server).ReadResource(0xc00070ab40, {0xdda3298?, 0xc000980540?}, 0xc00054e070)
        github.com/hashicorp/terraform-plugin-go@v0.23.0/tfprotov6/tf6server/server.go:784 +0x309
github.com/hashicorp/terraform-plugin-go/tfprotov6/internal/tfplugin6._Provider_ReadResource_Handler({0xdd29aa0, 0xc00070ab40}, {0xdda3298, 0xc000980540}, 0xc0008da000, 0x0)
        github.com/hashicorp/terraform-plugin-go@v0.23.0/tfprotov6/internal/tfplugin6/tfplugin6_grpc.pb.go:482 +0x1a6
google.golang.org/grpc.(*Server).processUnaryRPC(0xc0001a3400, {0xdda3298, 0xc0009804b0}, {0xddabdc0, 0xc000003080}, 0xc000990240, 0xc0005a25d0, 0xea24188, 0x0)
        google.golang.org/grpc@v1.63.2/server.go:1369 +0xdf8
google.golang.org/grpc.(*Server).handleStream(0xc0001a3400, {0xddabdc0, 0xc000003080}, 0xc000990240)
        google.golang.org/grpc@v1.63.2/server.go:1780 +0xe8b
google.golang.org/grpc.(*Server).serveStreams.func2.1()
        google.golang.org/grpc@v1.63.2/server.go:1019 +0x8b
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 25
        google.golang.org/grpc@v1.63.2/server.go:1030 +0x125

Error: The terraform-provider-mongodbatlas_v1.17.4 plugin crashed!

This is always indicative of a bug within the plugin. It would be immensely
helpful if you could report the crash with the plugin's maintainers so that it
can be fixed. The output above should help diagnose the issue.

Terraform configuration to reproduce the issue

# I have two configurations and the only difference is alarms.
# I was able to use it until now without changes in the provider version.
# I am able to correctly use the project without alarms.

# Alarms module:
## General settings
variable "project_id" {
  type = string
}

variable "enabled" {
  type    = bool
  default = true
}

variable "event_type" {
  type = string
}

## Notification settings
variable "slack_integration_id" {
  type = string
}

variable "delay_min" {
  type        = number
  default     = 0
  description = "Number of minutes to wait after an alert condition is detected before sending out the first notification."
}

variable "interval_min" {
  type        = number
  default     = 60
  description = "Number of minutes to wait between successive notifications for unacknowledged alerts that are not resolved. The minimum value is 5."
}

## Alarm settings
variable "matcher" {
  type = object({
    field_name = string
    operator   = string
    value      = string
  })
  default = null
}
variable "threshold_config" {
  type = object({
    operator  = string
    threshold = number
    units     = string
  })
  default = null
}

variable "metric_threshold_config" {
  type = object({
    metric_name = string
    mode        = string
    operator    = string
    threshold   = number
    units       = string
  })
  default = null
}

resource "mongodbatlas_alert_configuration" "this" {
  project_id = var.project_id
  event_type = var.event_type
  enabled    = var.enabled

  notification {
    type_name      = "SLACK"
    integration_id = var.slack_integration_id
    delay_min      = var.delay_min
    interval_min   = var.interval_min
  }

  dynamic "matcher" {
    for_each = var.matcher != null ? { 1 : 1 } : {}

    content {
      field_name = var.matcher.field_name
      operator   = var.matcher.operator
      value      = var.matcher.value
    }
  }

  dynamic "threshold_config" {
    for_each = var.threshold_config != null ? { 1 : 1 } : {}

    content {
      operator  = var.threshold_config.operator
      threshold = var.threshold_config.threshold
      units     = var.threshold_config.units
    }
  }

  dynamic "metric_threshold_config" {
    for_each = var.metric_threshold_config != null ? { 1 : 1 } : {}

    content {
      metric_name = var.metric_threshold_config.metric_name
      mode        = var.metric_threshold_config.mode
      operator    = var.metric_threshold_config.operator
      threshold   = var.metric_threshold_config.threshold
      units       = var.metric_threshold_config.units
    }
  }
}

terraform {
  required_providers {
    mongodbatlas = {
      source = "mongodb/mongodbatlas"
    }
  }
}

Module call:

module "alarms" {
  source   = "../../modules/mongodb_atlas_alarm"
  for_each = { for index, alarm in var.alarms : index => alarm }

  project_id           = mongodbatlas_project.main.id
  slack_integration_id = local.slack_integration_id

  event_type              = each.value.event_type
  matcher                 = each.value.matcher
  threshold_config        = each.value.threshold_config
  metric_threshold_config = each.value.metric_threshold_config
}

alarms = [
  {
    event_type = "HOST_DOWN"
  },
  {
    event_type = "HOST_HAS_INDEX_SUGGESTIONS"
  },
  {
    event_type = "NO_PRIMARY"
  },
  {
    event_type = "OUTSIDE_METRIC_THRESHOLD"
    metric_threshold_config = {
      metric_name = "SYSTEM_MEMORY_PERCENT_USED"
      mode        = "AVERAGE"
      operator    = "GREATER_THAN"
      threshold   = 85
      units       = "RAW"
    }
  },
  {
    event_type = "OUTSIDE_METRIC_THRESHOLD"
    metric_threshold_config = {
      metric_name = "NORMALIZED_SYSTEM_CPU_USER"
      mode        = "AVERAGE"
      operator    = "GREATER_THAN"
      threshold   = 85
      units       = "RAW"
    }
  }
]

Steps To Reproduce

Run terraform plan and the provider crashes. It was working previously with the same versions and the same code.

After removing the alarms from both the state and the code, terraform plan works as expected.

Logs

No response

Code of Conduct

[X] I agree to follow this project's Code of Conduct

github-actions[bot] commented 3 months ago

Thanks for opening this issue! Please make sure you've followed our guidelines when opening the issue. In short, to help us reproduce the issue we need:

Terraform configuration file used to reproduce the issue
Terraform log files from the run where the issue occurred
Terraform Atlas provider version used to reproduce the issue
Terraform version used to reproduce the issue
Confirmation if Terraform OSS, Terraform Cloud, or Terraform Enterprise deployment

The ticket CLOUDP-263634 was created for internal tracking.

oarbusi commented 3 months ago

Hi @j0rzsh, thanks for opening the issue. I am not able to reproduce it with the Terraform configuration you provided. Could you please provide the following info so that we can investigate further:

Before the issue appeared, were you using a previous version of the provider? Did this start after updating to 1.17.4 from another version, or has it all happened while using 1.17.4?
Could you provide the Terraform state?

Thanks for your collaboration

j0rzsh commented 3 months ago

Hello @oarbusi! Sorry, yesterday was a bank holiday and I wasn't able to post.

So... this stopped happening just the same way it started. I suspect some kind of change in the Atlas API caused it to return an unexpected value, and the provider failed on it.

With the same code and the same versions, it stopped working and then, after 2 or 3 hours, started working again. Should we close the issue? I think the provider should be able to handle an unexpected value from the API, but I'm not sure how it is implemented, as golang is not my best skill :)

Many thanks!

oarbusi commented 3 months ago

Thanks @j0rzsh for the update! Glad to hear it's working. Since we are unable to reproduce the issue, creating a fix is hard. I will close the issue, but I will keep an eye on this in case it happens again. Please open a new issue if you run into any other problem so that we can help.

j0rzsh commented 3 months ago

@oarbusi Just to add a little more info, because I think I discovered the reason. The problem is that I was importing manually created alerts and playing with them to see how the module worked and how the MongoDB API named alerts.

This panic is easily reproduced by creating an alert like this (with the code above):

alarms = [
  {
    event_type = "HOST_DOWN"
  }
]

Then go to the MongoDB website, manually change the alarm to, for example, "Replica Set - Number of unhealthy members is above 1", and run terraform apply again.

The state holds an alarm that has no matcher, threshold_config or metric_threshold_config, while the alarm in Atlas now has one, so the provider fails because it expects something else.

This is a bit of a corner case, but it is still not the intended behaviour (I want terraform apply to bring Atlas back in line with my code even if someone manually modified something using the website).
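
To illustrate the kind of guard that would avoid a crash in this situation, here is a minimal Go sketch. It is not the provider's actual code: the types apiThreshold and tfThreshold, the helper functions, and newTFThreshold are hypothetical names made up for this example, and the assumption is that the panic comes from reading through a nil pointer or an empty state list when the API returns a block that the state does not have.

```go
package main

import "fmt"

// apiThreshold is a hypothetical stand-in for the threshold object
// returned by the API; every field is optional, hence the pointers.
type apiThreshold struct {
	Operator  *string
	Threshold *float64
	Units     *string
}

// tfThreshold is a hypothetical stand-in for the block stored in state.
type tfThreshold struct {
	Operator  string
	Threshold float64
	Units     string
}

// strOrEmpty dereferences p, returning "" when p is nil.
func strOrEmpty(p *string) string {
	if p == nil {
		return ""
	}
	return *p
}

// floatOrZero dereferences p, returning 0 when p is nil.
func floatOrZero(p *float64) float64 {
	if p == nil {
		return 0
	}
	return *p
}

// newTFThreshold builds the state block from the API value.
// It tolerates a nil API object, nil fields, and an empty state list
// (the case where the block was added outside Terraform).
func newTFThreshold(api *apiThreshold, state []tfThreshold) []tfThreshold {
	if api == nil {
		return nil
	}
	out := tfThreshold{
		Operator:  strOrEmpty(api.Operator),
		Threshold: floatOrZero(api.Threshold),
		Units:     strOrEmpty(api.Units),
	}
	// Only consult the previous state when it actually has an entry.
	if len(state) > 0 && out.Units == "" {
		out.Units = state[0].Units
	}
	return []tfThreshold{out}
}

func main() {
	op := "GREATER_THAN"
	th := 1.0
	// The state has no threshold block, but the API now returns one.
	got := newTFThreshold(&apiThreshold{Operator: &op, Threshold: &th}, nil)
	fmt.Printf("%+v\n", got)
}
```

The point of the sketch is only that every pointer and every index into the previous state is checked before use, so a block that exists in Atlas but not in the state produces a diff instead of a panic.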

If everything is managed using terraform, everything appears to be working fine (I have created a bunch of alerts, then reordered the list and applied without any issues).

oarbusi commented 3 months ago

@j0rzsh thanks for the extra details! I have been able to reproduce your issue. I have created this PR fixing it. It should be included in the next release.

j0rzsh commented 3 months ago

Many, many thanks!

oarbusi commented 3 months ago

@j0rzsh the fix for this issue has been released in v1.17.5. Thanks again for opening the issue.
