rfmoz / grafana-dashboards

Grafana dashboards
Apache License 2.0

Duplicate series with two Raspberry Pis #156

Open artur-borys opened 6 months ago

artur-borys commented 6 months ago

Hi

Recently I added a new Raspberry Pi node to my hybrid k3s cluster. It now has 3 nodes: two RPis and a Hetzner node.

The Hardware misc section shows errors on the temperature graphs for the RPis:

Status: 500. Message: execution: found duplicate series for the match group {chip="soc:firmware_raspberrypi_hwmon"} on the right hand-side of the operation: [{__name__="node_hwmon_chip_names", app_kubernetes_io_component="metrics", app_kubernetes_io_instance="prometheus", app_kubernetes_io_managed_by="Helm", app_kubernetes_io_name="prometheus-node-exporter", app_kubernetes_io_part_of="prometheus-node-exporter", app_kubernetes_io_version="1.7.0", chip="soc:firmware_raspberrypi_hwmon", chip_name="rpi_volt", helm_sh_chart="prometheus-node-exporter-4.24.0", instance="10.0.0.15:9100", job="kubernetes-service-endpoints", namespace="grafana", node="rpi-wroc-1", service="prometheus-prometheus-node-exporter"}, {__name__="node_hwmon_chip_names", app_kubernetes_io_component="metrics", app_kubernetes_io_instance="prometheus", app_kubernetes_io_managed_by="Helm", app_kubernetes_io_name="prometheus-node-exporter", app_kubernetes_io_part_of="prometheus-node-exporter", app_kubernetes_io_version="1.7.0", chip="soc:firmware_raspberrypi_hwmon", chip_name="rpi_volt", helm_sh_chart="prometheus-node-exporter-4.24.0", instance="10.0.0.10:9100", job="kubernetes-service-endpoints", namespace="grafana", node="rpi-wroc-0", service="prometheus-prometheus-node-exporter"}];many-to-many matching not allowed: matching labels must be unique on one side

I believe this is an issue in node-exporter or in this dashboard rather than in my setup: I had messed up something unrelated and had to recreate the k3s cluster from scratch, and the issue still persists.

The error is displayed only for the RPi nodes (the Hetzner node doesn't even have temperature sensors, for obvious reasons).

Let me know if you need more details.

If I find more free time, I'll take a deeper look at this myself, unless someone here gets to it first.

jorti commented 6 months ago

I also see this error on my x86_64 servers.

RichardD012 commented 6 months ago

I see the same thing on my x86 servers as well.

axel-lebourhis commented 6 months ago

I'm having a similar issue with another series. I'm monitoring 2 devices, an RPi and an Odroid, and both seem to report a thermal_thermal_zone0 chip, leading to duplicate series. The error is reported by the Hardware temperature monitor panel.

found duplicate series for the match group {chip="thermal_thermal_zone0"} on the right hand-side of the operation: [{__name__="node_hwmon_chip_names", chip="thermal_thermal_zone0", chip_name="cpu_thermal", instance="pi.hole:9100", job="node"}, {__name__="node_hwmon_chip_names", chip="thermal_thermal_zone0", chip_name="acpitz", instance="odroid.home:9100", job="node"}];many-to-many matching not allowed: matching labels must be unique on one side
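
The error message suggests that the panel's query matches node_hwmon_chip_names using only the chip label, so two hosts exposing the same chip value end up in one match group. The following is a minimal Python sketch of that grouping, using the two series from the error above. It is not the dashboard's query or Prometheus's implementation; the helper name duplicate_match_groups and the idea of also matching on instance are assumptions made for illustration.

```python
from collections import defaultdict

# Right-hand-side series, labels copied from the error message above.
chip_names = [
    {"chip": "thermal_thermal_zone0", "chip_name": "cpu_thermal",
     "instance": "pi.hole:9100", "job": "node"},
    {"chip": "thermal_thermal_zone0", "chip_name": "acpitz",
     "instance": "odroid.home:9100", "job": "node"},
]


def duplicate_match_groups(series, on):
    """Group series by the labels listed in `on` (hypothetical helper).

    Returns only the groups that contain more than one series, which is
    the situation the "found duplicate series" error complains about.
    """
    groups = defaultdict(list)
    for labels in series:
        key = tuple((name, labels.get(name, "")) for name in on)
        groups[key].append(labels)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Matching on `chip` alone: both hosts land in the same group.
dups_chip = duplicate_match_groups(chip_names, on=["chip"])
print(dups_chip)

# Assumed remedy: also match on `instance`, so every group holds one series.
dups_both = duplicate_match_groups(chip_names, on=["instance", "chip"])
print(dups_both)
```

With only chip as the matching label, the sketch reports one group containing both hosts. Once instance is included, no group has more than one member. In PromQL terms this would correspond to listing more labels in the on(...) clause of the join, though this thread does not confirm that the dashboard was changed in exactly that way.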

ishioni commented 5 months ago

Same for my NAS and gateway: both have an nvme0 device, and that's enough to trip it up.

artur-borys commented 5 months ago

Huh, it seems the dashboard listed on the Grafana site is not the latest available version. If you import https://github.com/rfmoz/grafana-dashboards/blob/master/prometheus/node-exporter-full.json into Grafana, the issue is resolved.