Closed jay-kinder closed 2 months ago
This looks like a bug in the provider code; it will be forwarded to the monitoring service team to fix.
@jay-kinder, can you please provide the output of terraform state show google_monitoring_dashboard.dashboard? Thanks.
Command:
tf state show 'module.dashboards.module.dashboards.google_monitoring_dashboard.dashboard["Default SRE Dashboard"]'
Output:
# module.dashboards.module.dashboards.google_monitoring_dashboard.dashboard["Default SRE Dashboard"]:
resource "google_monitoring_dashboard" "dashboard" {
dashboard_json = jsonencode(
{
displayName = "Default SRE Dashboard"
etag = "03b9b9542fef7ea215cfea91c74df95d"
gridLayout = {
columns = "2"
widgets = [
{
scorecard = {
gaugeView = {
upperBound = 100
}
thresholds = [
{
color = "RED"
direction = "ABOVE"
value = 95
},
{
color = "YELLOW"
direction = "ABOVE"
value = 90
},
{
color = "YELLOW"
direction = "BELOW"
},
]
timeSeriesQuery = {
timeSeriesQueryLanguage = <<-EOT
{ t_0:
fetch k8s_node
| metric 'kubernetes.io/node/cpu/core_usage_time'
| ${project_id}
| ${location}
| ${cluster_name}
| align rate(1m)
| every 1m
| group_by [], [value_core_usage_time_aggregate: aggregate(value.core_usage_time)];
t_1:
fetch k8s_node
| metric 'kubernetes.io/node/cpu/total_cores'
| ${project_id}
| ${location}
| ${cluster_name}
| group_by 1m, [value_total_cores_mean: mean(value.total_cores)]
| every 1m
| group_by [], [value_total_cores_mean_aggregate: sum(value_total_cores_mean)]}
| join
| window 5m
| value
[v_0:
cast_units(
div(t_0.value_core_usage_time_aggregate,
t_1.value_total_cores_mean_aggregate) * 100,
"%")]
EOT
}
}
title = "CPU Utilisation"
},
{
scorecard = {
gaugeView = {
upperBound = 100
}
thresholds = [
{
color = "RED"
direction = "ABOVE"
value = 95
},
{
color = "YELLOW"
direction = "ABOVE"
value = 90
},
]
timeSeriesQuery = {
timeSeriesQueryLanguage = <<-EOT
{ t_0:
fetch k8s_node
| metric 'kubernetes.io/node/memory/used_bytes'
| ${project_id}
| ${location}
| ${cluster_name}
| group_by 1m, [value_used_bytes_mean: mean(value.used_bytes)]
| every 1m
| group_by [],
[value_used_bytes_mean_aggregate: aggregate(value_used_bytes_mean)]
;
t_1:
fetch k8s_node
| metric 'kubernetes.io/node/memory/total_bytes'
| ${project_id}
| ${location}
| ${cluster_name}
| group_by 1m, [value_total_bytes_mean: mean(value.total_bytes)]
| every 1m
| group_by [],
[value_total_bytes_mean_aggregate: aggregate(value_total_bytes_mean)]}
| join
| window 5m
| value
[v_0:
cast_units(
div(t_0.value_used_bytes_mean_aggregate,
t_1.value_total_bytes_mean_aggregate) * 100,
"%")]
EOT
}
}
title = "Memory Utilisation"
},
]
}
labels = {
managed_by_terraform = ""
}
name = "projects/610588521293/dashboards/58e8b192-fb09-4979-a162-4cf8ebea9e88"
}
)
id = "projects/610588521293/dashboards/58e8b192-fb09-4979-a162-4cf8ebea9e88"
project = "sre-central-monitoring"
}
Thanks, @jay-kinder. Before running the second terraform plan, had you updated the resource configuration?
I tried to reproduce the issue with the resource state you provided and didn't see the provider crash error.
resource "google_monitoring_dashboard" "dashboard" {
dashboard_json = jsonencode(
{
displayName = "Default SRE Dashboard"
gridLayout = {
columns = "2"
widgets = [
{
scorecard = {
gaugeView = {
upperBound = 100
}
thresholds = [
{
color = "RED"
direction = "ABOVE"
value = 95
},
{
color = "YELLOW"
direction = "ABOVE"
value = 90
},
{
color = "YELLOW"
direction = "BELOW"
},
]
timeSeriesQuery = {
timeSeriesQueryLanguage = <<-EOT
{ t_0:
fetch k8s_node
| metric 'kubernetes.io/node/cpu/core_usage_time'
| "terraform-dev-zhenhuali"
| align rate(1m)
| every 1m
| group_by [], [value_core_usage_time_aggregate: aggregate(value.core_usage_time)];
t_1:
fetch k8s_node
| metric 'kubernetes.io/node/cpu/total_cores'
| group_by 1m, [value_total_cores_mean: mean(value.total_cores)]
| every 1m
| group_by [], [value_total_cores_mean_aggregate: sum(value_total_cores_mean)]}
| join
| window 5m
| value
[v_0:
cast_units(
div(t_0.value_core_usage_time_aggregate,
t_1.value_total_cores_mean_aggregate) * 100,
"%")]
EOT
}
}
title = "CPU Utilisation"
},
{
scorecard = {
gaugeView = {
upperBound = 100
}
thresholds = [
{
color = "RED"
direction = "ABOVE"
value = 95
},
{
color = "YELLOW"
direction = "ABOVE"
value = 90
},
]
timeSeriesQuery = {
timeSeriesQueryLanguage = <<-EOT
{ t_0:
fetch k8s_node
| metric 'kubernetes.io/node/memory/used_bytes'
| group_by 1m, [value_used_bytes_mean: mean(value.used_bytes)]
| every 1m
| group_by [],
[value_used_bytes_mean_aggregate: aggregate(value_used_bytes_mean)]
;
t_1:
fetch k8s_node
| metric 'kubernetes.io/node/memory/total_bytes'
| group_by 1m, [value_total_bytes_mean: mean(value.total_bytes)]
| every 1m
| group_by [],
[value_total_bytes_mean_aggregate: aggregate(value_total_bytes_mean)]}
| join
| window 5m
| value
[v_0:
cast_units(
div(t_0.value_used_bytes_mean_aggregate,
t_1.value_total_bytes_mean_aggregate) * 100,
"%")]
EOT
}
}
title = "Memory Utilisation"
},
]
}
labels = {
managed_by_terraform = ""
}
}
)
}
Hello
If I run it with no changes, it works as expected.
However, if I add another widget through the enabled_widgets variable, I receive the provider crash. It only seems to happen when I make changes after the initial deployment.
Thanks
@jay-kinder, thanks for the information. I tried adding a new widget, and terraform apply succeeded, so I cannot reproduce this issue. I wonder if I missed something needed to reproduce it. I didn't use a module.
Hi
No probs, thanks for looking into it. I can give you some more info to help reproduce:
External module main.tf:
resource "google_monitoring_dashboard" "dashboard" {
for_each = { for k, v in var.dashboards : k => v
if length(var.dashboards) > 0 }
project = var.project_id
dashboard_json = jsonencode({
"displayName" : "${each.key}",
"labels" : { "managed_by_terraform" : "" },
"${each.value.layout}Layout" : {
"columns" : each.value.columns,
"widgets" : [
concat([for widget in each.value.enabled_widgets :
jsondecode(file("${path.module}/widgets/${widget}.json"))
])]
}
})
}
External module variables:
variable "project_id" { description = "The project to create resources in" }
variable "dashboards" {
description = "Map of dashboards with widgets enabled"
type = map(object({
enabled_widgets = list(string)
layout = optional(string, "grid")
columns = optional(string, "2")
}))
default = {}
}
External module example widget:
{
"title": "VM CPU Utilisation",
"xyChart": {
"chartOptions": {
"mode": "COLOR"
},
"dataSets": [
{
"minAlignmentPeriod": "60s",
"plotType": "LINE",
"targetAxis": "Y1",
"timeSeriesQuery": {
"timeSeriesFilter": {
"aggregation": {
"alignmentPeriod": "60s",
"crossSeriesReducer": "REDUCE_MEAN",
"groupByFields": [],
"perSeriesAligner": "ALIGN_MEAN"
},
"filter": "metric.type=\"compute.googleapis.com/instance/cpu/utilization\" resource.type=\"gce_instance\""
}
}
}
],
"thresholds": [],
"yAxis": {
"label": "",
"scale": "LINEAR"
}
}
}
Module call:
module "dashboards" {
# source = "../../infra-modules/dashboards" # used for local testing
source = "git::ssh://git@github.com/Cloud-Technology-Solutions/sre-central-monitoring-modules//dashboards/"
project_id = var.project
dashboards = {
"Default SRE Dashboard" = {
enabled_widgets = [
"cpu_util",
"mem_util"
]
}
}
}
Root main.tf:
module "dashboards" {
source = "./dashboards"
project = local.project
}
hope that helps!
Hello, @jay-kinder , thank you for providing the detailed configuration. I can reproduce it.
The issue is that the syntax is wrong when concatenating the widgets: concat is not needed here at all, and wrapping its result in an extra [ ... ] nests the widget objects one level too deep, producing a list of lists instead of a flat list of widget objects. It should be
"widgets" : [for widget in each.value.enabled_widgets :
jsondecode(file("${path.module}/widgets/${widget}.json"))
]
Corrected resource config:
resource "google_monitoring_dashboard" "dashboard" {
for_each = { for k, v in var.dashboards : k => v
if length(var.dashboards) > 0 }
project = var.project_id
dashboard_json = jsonencode({
"displayName" : "${each.key}",
"labels" : { "managed_by_terraform" : "" },
"${each.value.layout}Layout" : {
"columns" : each.value.columns,
"widgets" : [for widget in each.value.enabled_widgets :
jsondecode(file("${path.module}/widgets/${widget}.json"))
]
}
})
}
concat documentation: https://developer.hashicorp.com/terraform/language/functions/concat
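To make the shape of the bug concrete, here is a small sketch in Python (used only because the effect is purely about JSON structure, not about Terraform itself). The widget contents below are hypothetical placeholders for the decoded cpu_util.json and mem_util.json files. Wrapping the for-expression's result in an extra pair of brackets, as the original "widgets" : [concat([for ...])] expression did, is analogous to the first case; the corrected "widgets" : [for ...] is analogous to the second:

```python
import json

# Placeholder widget objects standing in for the decoded widget files.
widgets = [{"title": "CPU"}, {"title": "Memory"}]

# Buggy shape: concat() returns the list unchanged, and the surrounding
# brackets wrap it in a second list, nesting the widgets one level deep.
buggy = json.dumps({"widgets": [widgets]})

# Fixed shape: the widget objects sit directly in the "widgets" list.
fixed = json.dumps({"widgets": widgets})

print(buggy)  # {"widgets": [[{"title": "CPU"}, {"title": "Memory"}]]}
print(fixed)  # {"widgets": [{"title": "CPU"}, {"title": "Memory"}]}
```

The flat second shape is what the Monitoring dashboard schema expects for gridLayout.widgets; the nested first shape is plausibly what the provider received on subsequent plans.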
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
Terraform Version
1.7.2
Affected Resource(s)
google_monitoring_dashboard
Expected Behavior
Terraform reads the resource and shows no changes (or an in-place change, if required).
Actual Behavior
When first running this code, it works perfectly and adds the dashboard with the right widgets.
However, any subsequent plan (unless it triggers a recreation) causes a provider crash.
I have also tried provider versions 5.18 and 5.16 and received the same crash.
Steps to reproduce
terraform plan (after initial creation)
References
b/327438769