Supporterino / truenas-graphite-to-prometheus

A graphite exporter mapping file for truenas scale >23.10.1 metrics and some example grafana dashboards
GNU General Public License v3.0

Drive temperatures not being mapped #12

Closed brantje closed 8 months ago

brantje commented 8 months ago

Hi! Thanks for the config, it's really helpful.

However, the mapped drive temperatures don't get exported for me.

My Docker Compose file:

version: '3'
networks:
  monitor-net:
    driver: bridge
  proxy:
    driver: bridge
services:
  graphite-exporter:
    image: prom/graphite-exporter
    mem_limit: 512M
    container_name: graphite-exporter
    ports:
      - 9109:9109/udp
      - 9109:9109
      - 9108:9108
    restart: unless-stopped
    command: --graphite.mapping-config=/tmp/graphite_mapping.conf --log.level=debug
    volumes:
      - ./graphite-exporter/graphite_mapping.conf:/tmp/graphite_mapping.conf
    networks:
      - monitor-net

Container logs upon start:

ts=2024-03-02T11:49:12.364Z caller=main.go:83 level=info msg="Starting graphite_exporter" version_info="(version=0.15.0, branch=HEAD, revision=8e5d197e521c0dac49bbb38b61e6197b3c28f947)"
ts=2024-03-02T11:49:12.364Z caller=main.go:84 level=info build_context="(go=go1.21.4, platform=linux/amd64, user=root@ffb2f351f7cf, date=20231206-12:58:08, tags=netgo)"
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.ram.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.mem.*.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.swap.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk.*.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_ops.*.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_ext.*.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_ext_ops.*.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_backlog.*.backlog
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_busy.*.busy
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_util.*.utilization
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_mops.*.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_ext_mops.*.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_iotime.*.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_ext_iotime.*.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_qops.*.operations
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_await.*.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_ext_await.*.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_avgsz.*.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_ext_avgsz.*.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.disk_svctm.*.svctm
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.io.*
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.intr.interrupts
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.cpu.*.softirq
ts=2024-03-02T11:49:12.366Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.ctxt.switches
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.cputemp.temperatures.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.cpu.core_throttling.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.cpu.cpufreq.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.forks.started
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.processes.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.active_processes.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.uptime.uptime
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.clock_sync_state.state
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.clock_status.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.clock_sync_offset.offset
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.load.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.nfsd.*.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.zfs.*.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.net.*.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.net_speed.*.speed
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.net_duplex.*.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.net_operstate.*.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.net_carrier.*.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.net_mtu.*.mtu
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.net_packets.*.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.net_errors.*.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.net_drops.*.*
ts=2024-03-02T11:49:12.367Z caller=fsm.go:314 level=warn msg="backtracking required because of match. Performance may be degraded" match=truenas.*.system.net.*
ts=2024-03-02T11:49:12.367Z caller=tls_config.go:274 level=info msg="Listening on" address=[::]:9108
ts=2024-03-02T11:49:12.367Z caller=tls_config.go:277 level=info msg="TLS is disabled." http2=false address=[::]:9108

The only thing I noticed is that the rule for mapping the drive temperatures is missing from the startup log above.

Logs from incoming lines:

ts=2024-03-02T11:51:29.422Z caller=collector.go:183 level=debug msg="Processing sample" sample="collector.graphiteSample{OriginalName:\"truenas.truenas.smart_log_smart.disktemp.ZR9053V3.ZR9053V3\", Name:\"truenas truenas_truenas_smart_log_smart_disktemp_ZR9053V3_ZR9053V3\", Value:41, Type:2, Timestamp:time.Date(2024, time.March, 2, 11, 48, 0, 0, time.Local)}"
ts=2024-03-02T11:51:29.422Z caller=collector.go:131 level=debug msg="Incoming line" line="truenas.truenas.smart_log_smart.disktemp.2402E88D465F.2402E88D465F 40.0000000 1709380080"
ts=2024-03-02T11:51:29.422Z caller=collector.go:183 level=debug msg="Processing sample" sample="collector.graphiteSample{OriginalName:\"truenas.truenas.smart_log_smart.disktemp.2402E88D465F.2402E88D465F\", Name:\"Graphite metric truenas_truenas_smart_log_smart_disktemp_2402E88D465F_2402E88D465F\", Value:40, Type:2, Timestamp:time.Date(2024, time.March, 2, 11, 48, 0, 0, time.Local)}"
ts=2024-03-02T11:51:29.422Z caller=collector.go:131 level=debug msg="Incoming line" line="truenas.truenas.smart_log_smart.disktemp.2402E88D4794.2402E88D4794 40.0000000 1709380080"
ts=2024-03-02T11:51:29.422Z caller=collector.go:183 level=debug msg="Processing sample" sample="collector.graphiteSample{OriginalName:\"truenas.truenas.smart_log_smart.disktemp.2402E88D4794.2402E88D4794\", Name:\"Graphite metric truenas_truenas_smart_log_smart_disktemp_2402E88D4794_2402E88D4794\", Value:40, Type:2, Timestamp:time.Date(2024, time.March, 2, 11, 48, 0, 0, time.Local)}"
ts=2024-03-02T11:51:29.422Z caller=collector.go:131 level=debug msg="Incoming line" line="truenas.truenas.smart_log_smart.disktemp.ZR9053TC.ZR9053TC 40.0000000 1709380080"
ts=2024-03-02T11:51:29.422Z caller=collector.go:183 level=debug msg="Processing sample" sample="collector.graphiteSample{OriginalName:\"truenas.truenas.smart_log_smart.disktemp.ZR9053TC.ZR9053TC\", Name:\"truenas truenas_truenas_smart_log_smart_disktemp_ZR9053TC_ZR9053TC\", Value:40, Type:2, Timestamp:time.Date(2024, time.March, 2, 11, 48, 0, 0, time.Local)}"
ts=2024-03-02T11:51:29.422Z caller=collector.go:131 level=debug msg="Incoming line" line="truenas.truenas.smart_log_smart.disktemp.ZR9053QT.ZR9053QT 39.0000000 1709380080"
ts=2024-03-02T11:51:29.422Z caller=collector.go:183 level=debug msg="Processing sample" sample="collector.graphiteSample{OriginalName:\"truenas.truenas.smart_log_smart.disktemp.ZR9053QT.ZR9053QT\", Name:\"truenas truenas_truenas_smart_log_smart_disktemp_ZR9053QT_ZR9053QT\", Value:39, Type:2, Timestamp:time.Date(2024, time.March, 2, 11, 48, 0, 0, time.Local)}"
ts=2024-03-02T11:51:29.422Z caller=collector.go:131 level=debug msg="Incoming line" line="truenas.truenas.smart_log_smart.disktemp.2331E86533AC.2331E86533AC 39.0000000 1709380080"
ts=2024-03-02T11:51:29.423Z caller=collector.go:183 level=debug msg="Processing sample" sample="collector.graphiteSample{OriginalName:\"truenas.truenas.smart_log_smart.disktemp.2331E86533AC.2331E86533AC\", Name:\"Graphite metric truenas_truenas_smart_log_smart_disktemp_2331E86533AC_2331E86533AC\", Value:39, Type:2, Timestamp:time.Date(2024, time.March, 2, 11, 48, 0, 0, time.Local)}"
ts=2024-03-02T11:51:29.423Z caller=collector.go:131 level=debug msg="Incoming line" line="truenas.truenas.smart_log_smart.disktemp.ZR9053PX.ZR9053PX 41.0000000 1709380080"
ts=2024-03-02T11:51:29.423Z caller=collector.go:183 level=debug msg="Processing sample" sample="collector.graphiteSample{OriginalName:\"truenas.truenas.smart_log_smart.disktemp.ZR9053PX.ZR9053PX\", Name:\"truenas truenas_truenas_smart_log_smart_disktemp_ZR9053PX_ZR9053PX\", Value:41, Type:2, Timestamp:time.Date(2024, time.March, 2, 11, 48, 0, 0, time.Local)}"
ts=2024-03-02T11:51:29.423Z caller=collector.go:131 level=debug msg="Incoming line" line="truenas.truenas.smart_log_smart.disktemp.2330E861DBF0.2330E861DBF0 38.0000000 1709380080"
ts=2024-03-02T11:51:29.423Z caller=collector.go:183 level=debug msg="Processing sample" sample="collector.graphiteSample{OriginalName:\"truenas.truenas.smart_log_smart.disktemp.2330E861DBF0.2330E861DBF0\", Name:\"Graphite metric truenas_truenas_smart_log_smart_disktemp_2330E861DBF0_2330E861DBF0\", Value:38, Type:2, Timestamp:time.Date(2024, time.March, 2, 11, 48, 0, 0, time.Local)}"
ts=2024-03-02T11:51:29.423Z caller=collector.go:131 level=debug msg="Incoming line" line="truenas.truenas.smart_log_smart.disktemp.nvme0n1.nvme0n1 39.0000000 1709380080"
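
For anyone comparing their own keys against a mapping rule, it can help to split an incoming line into its parts. The following is a minimal Python sketch, not part of the exporter; the names `GraphiteSample` and `parse_graphite_line` are made up for illustration, and the input is one of the "Incoming line" entries from the log above.

```python
from typing import NamedTuple


class GraphiteSample(NamedTuple):
    path: list[str]  # dot-separated components of the metric name
    value: float
    timestamp: int


def parse_graphite_line(line: str) -> GraphiteSample:
    """Split one plaintext line of the form '<metric.path> <value> <timestamp>'."""
    name, value, timestamp = line.split()
    return GraphiteSample(name.split("."), float(value), int(timestamp))


sample = parse_graphite_line(
    "truenas.truenas.smart_log_smart.disktemp.ZR9053TC.ZR9053TC 40.0000000 1709380080"
)
# The path has six components; the serial appears twice at the end.
print(len(sample.path), sample.path)
```

The number of components matters because each `*` in a mapping rule stands for exactly one of them.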

Supporterino commented 8 months ago

Disk temperature mapping was added in v1.1.0, since it was initially bugged in TrueNAS: https://github.com/Supporterino/truenas-graphite-to-prometheus/releases/tag/v1.1.0. You just need to update the image.

brantje commented 8 months ago

Updated to the latest image:

  graphite-exporter:
    image: ghcr.io/supporterino/truenas-graphite-to-prometheus:latest
    mem_limit: 512M
    container_name: graphite-exporter
    ports:
      - 9109:9109/udp
      - 9109:9109
      - 9108:9108
    restart: unless-stopped
    networks:
      - monitor-net

However, the disk_temperature metric doesn't show up: [image]

But the plain, unmapped metric does get picked up by graphite: [image]
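
One way to check this without a dashboard is to search the exporter's text output for the metric name. Below is a rough Python sketch under stated assumptions: `find_metric_lines` is a hypothetical helper, and `sample_output` is an invented illustration of the Prometheus text format, not actual output from this setup.

```python
def find_metric_lines(exposition: str, metric_name: str) -> list[str]:
    """Return the sample lines for metric_name from Prometheus text-format output."""
    matches = []
    for line in exposition.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip HELP/TYPE comments and blank lines
        # The metric name ends at the first '{' (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name == metric_name:
            matches.append(line)
    return matches


# Invented sample text: one mapped metric and one unmapped metric.
sample_output = """\
# TYPE disk_temperature gauge
disk_temperature{instance="truenas",job="truenas",serial="ZR9053V3"} 41
truenas_truenas_smart_log_smart_disktemp_ZR9053TC_ZR9053TC 40
"""

mapped = find_metric_lines(sample_output, "disk_temperature")
print(mapped)
```

The real text would presumably come from the exporter's HTTP port (9108 in the compose file above) under `/metrics`. An empty result for `disk_temperature` alongside long underscore-joined names suggests that no mapping rule matched.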

brantje commented 8 months ago

Fixed it. For me, I needed to change the match pattern to this:

- match: "truenas.*.smart_log_smart.disktemp.*.*"
  name: "disk_temperature"
  labels:
    job: "truenas"
    instance: "${1}"
    serial: "${2}"

All works fine now :)
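
To see why this rule fits the keys in the log above, here is a small Python sketch that approximates the glob matching: each `*` is treated as one dot-free component and `${N}` in a label refers to the N-th one. This is an approximation for illustration, not the exporter's actual implementation; `glob_to_regex` and `apply_labels` are hypothetical names, and the shorter pattern at the end is an invented counterexample.

```python
import re


def glob_to_regex(glob: str) -> re.Pattern:
    """Approximate a mapping glob: each '*' captures one dot-free component."""
    parts = [re.escape(part) for part in glob.split("*")]
    return re.compile("^" + "([^.]*)".join(parts) + "$")


def apply_labels(glob: str, labels: dict[str, str], metric: str):
    """Return the expanded labels if the metric matches the glob, else None."""
    m = glob_to_regex(glob).match(metric)
    if m is None:
        return None
    # Replace each ${N} placeholder with the N-th captured component.
    return {
        key: re.sub(r"\$\{(\d+)\}", lambda ref: m.group(int(ref.group(1))), value)
        for key, value in labels.items()
    }


labels = {"job": "truenas", "instance": "${1}", "serial": "${2}"}
metric = "truenas.truenas.smart_log_smart.disktemp.ZR9053V3.ZR9053V3"

result = apply_labels("truenas.*.smart_log_smart.disktemp.*.*", labels, metric)
print(result)

# A hypothetical pattern with fewer components does not match the same key.
no_match = apply_labels("truenas.*.disktemp.*", labels, metric)
print(no_match)
```

The third `*` absorbs the repeated serial at the end of the key, so only `${1}` and `${2}` are needed for the labels.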

[image]

Supporterino commented 8 months ago

Interesting, your key differs from mine. I'll add your variant to the main file as well; it will be included in v1.1.1.

Supporterino commented 8 months ago

https://github.com/Supporterino/truenas-graphite-to-prometheus/releases/tag/v1.1.1