Open Jjungs7 opened 10 months ago
I can reproduce the problem on elasticsearch_exporter 1.7.0, with ES 8.12.0 and ES 7.17.
Setting only cluster.routing.allocation.disk.watermark.low triggers the unmarshal error on one cluster, but it's OK on another cluster, which fails with cluster.routing.allocation.disk.watermark.flood_stage instead :thinking:
It would be really helpful to have an example of the API response from Elasticsearch to use in our tests. cluster_settings_test.go has tests covering the cluster settings endpoint, including the watermark metrics. From what I understand, based on the provided error message, the problem is that a nested JSON object is now being returned for the watermark settings instead of a string.
This could be a scenario where we need to use different structs based on the Elasticsearch version, or customize the unmarshal.
Hello! Here is an extract of a simple _cluster/settings response on an ES 7.17 cluster where I have the problem:
{
  "persistent" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "disk" : {
            "watermark" : {
              "low" : "88%",
              "flood_stage" : "100%",
              "high" : "93%"
            }
          }
        }
      }
    }
  }
}
AFAIK, the format/hierarchy is the same with ES 8, and hasn't changed for ages :thinking:
Ooooh, by querying it with include_defaults=true (like the exporter), we can see the following in the defaults section:
ES 7.17:
"disk" : {
  "threshold_enabled" : "true",
  "watermark" : {
    "enable_for_single_data_node" : "false",
    "flood_stage" : {
      "frozen" : "95%",
      "frozen.max_headroom" : "20GB"
    }
  },
  "include_relocations" : "true",
  "reroute_interval" : "60s"
ES 8.12:
"disk" : {
  "threshold_enabled" : "true",
  "reroute_interval" : "60s",
  "watermark" : {
    "flood_stage" : {
      "frozen" : "95%",
      "frozen.max_headroom" : "20GB",
      "max_headroom" : "-1"
    },
    "high" : {
      "max_headroom" : "-1"
    },
    "low" : {
      "max_headroom" : "-1"
    },
    "enable_for_single_data_node" : "true"
  }
},
So we can have an entry in persistent as cluster.routing.allocation.disk.watermark.low, and in defaults as cluster.routing.allocation.disk.watermark.low.max_headroom. Could the problem come from here?
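The suspicion above can be checked in isolation with a small Go sketch. The watermarkStrings struct is a hypothetical, simplified stand-in for the exporter's type (it assumes every watermark is declared as a Go string), and the two payloads are trimmed from the persistent and defaults extracts in this thread.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// watermarkStrings is a hypothetical, simplified stand-in for the exporter's
// struct: every watermark is assumed to be a plain string.
type watermarkStrings struct {
	FloodStage string `json:"flood_stage"`
	High       string `json:"high"`
	Low        string `json:"low"`
}

// decodeWatermark decodes one "watermark" object and reports whether the
// failure, if any, is a JSON type mismatch.
func decodeWatermark(payload string) (watermarkStrings, bool, error) {
	var w watermarkStrings
	err := json.Unmarshal([]byte(payload), &w)
	var typeErr *json.UnmarshalTypeError
	return w, errors.As(err, &typeErr), err
}

func main() {
	// Persistent section: "low" is a string, so decoding succeeds.
	_, mismatch, err := decodeWatermark(`{"low": "88%", "high": "93%", "flood_stage": "100%"}`)
	fmt.Println(mismatch, err)

	// Defaults section on ES 8.12: "low" is an object holding max_headroom.
	_, mismatch, err = decodeWatermark(`{"low": {"max_headroom": "-1"}}`)
	fmt.Println(mismatch, err)
}
```

If the second call reports a type mismatch, that is consistent with the idea that the object form in defaults, rather than the string in persistent, is what breaks the collector.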
I suspect this is the same bug here, but we are seeing it with another metric in the clustersettings collector:
json: cannot unmarshal object into Go struct field clusterSettingsWatermark.defaults.cluster.routing.allocation.disk.watermark.flood_stage of type string
We are on ES 7.17.6 and the exporter version is 1.7.0. This is the JSON response to /_cluster/settings:
{
  "persistent" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "disk" : {
            "watermark" : {
              "low" : "92%",
              "flood_stage" : "97%",
              "high" : "95%"
            }
          }
        }
      },
      "max_shards_per_node" : "2000"
    },
    "xpack" : {
      "monitoring" : {
        "collection" : {
          "enabled" : "true"
        }
      }
    }
  },
  "transient" : { }
}
Related issue: #509
I was testing elasticsearch_exporter on elasticsearch-8.x and found an unmarshal error with the --collector.clustersettings flag on.
Run an Elasticsearch v8 instance:
docker run -d --name elasticsearch -p 9200:9200 -e "discovery.type=single-node" -e"xpack.security.enabled=false" elasticsearch:8.11.3
Change default settings. This removes the "cluster.routing.allocation.disk.watermark.low" field in the Defaults section
curl -XPUT -H"Content-Type: application/json" http://localhost:9200/_cluster/settings -d'{"transient":{"cluster.routing.allocation.disk.watermark.low": "90%"}}'
# HELP elasticsearch_clustersettings_allocation_watermark_low_ratio Low watermark for disk usage as a ratio.
# TYPE elasticsearch_clustersettings_allocation_watermark_low_ratio gauge
elasticsearch_clustersettings_allocation_watermark_low_ratio 0.9
level=error ts=2023-12-25T09:56:11.492789Z caller=collector.go:189 msg="collector failed" name=clustersettings duration_seconds=0.015642208 err="json: cannot unmarshal object into Go struct field clusterSettingsWatermark.defaults.cluster.routing.allocation.disk.watermark.low of type string"
// clusterSettingsResponse is a representation of a Elasticsearch Cluster Settings
type clusterSettingsResponse struct {
	Defaults   clusterSettingsSection `json:"defaults"`
	Persistent clusterSettingsSection `json:"persistent"`
	Transient  clusterSettingsSection `json:"transient"`
}

// clusterSettingsSection is a representation of a Elasticsearch Cluster Settings
type clusterSettingsSection struct {
	ClusterMaxShardsPerNode                         string `json:"cluster.max_shards_per_node"`
	ClusterRoutingAllocationBalanceDiskUsage        string `json:"cluster.routing.allocation.balance.disk_usage"`
	ClusterRoutingAllocationBalanceIndex            string `json:"cluster.routing.allocation.balance.index"`
	ClusterRoutingAllocationBalanceShard            string `json:"cluster.routing.allocation.balance.shard"`
	ClusterRoutingAllocationBalanceThreshold        string `json:"cluster.routing.allocation.balance.threshold"`
	ClusterRoutingAllocationBalanceWriteLoad        string `json:"cluster.routing.allocation.balance.write_load"`
	ClusterRoutingAllocationEnable                  string `json:"cluster.routing.allocation.enable"`
	ClusterRoutingAllocationDiskThresholdEnabled    string `json:"cluster.routing.allocation.disk.threshold_enabled"`
	ClusterRoutingAllocationDiskWatermarkFloodStage string `json:"cluster.routing.allocation.disk.watermark.flood_stage"`
	ClusterRoutingAllocationDiskWatermarkHigh       string `json:"cluster.routing.allocation.disk.watermark.high"`
	ClusterRoutingAllocationDiskWatermarkLow        string `json:"cluster.routing.allocation.disk.watermark.low"`
}

...
u := c.u.ResolveReference(&url.URL{Path: "_cluster/settings"})
q := u.Query()
q.Set("flat_settings", "true")
q.Set("include_defaults", "true")
...