Supporterino / truenas-graphite-to-prometheus

A graphite exporter mapping file for truenas scale >23.10.1 metrics and some example grafana dashboards
GNU General Public License v3.0

[Request] Disk Temp & Dataset Free Space #1

Closed SeiyaGame closed 8 months ago

SeiyaGame commented 11 months ago

Hi, thanks for your work, it's really insane! I was wondering whether it would be possible to add the temperature of the disks. I tried on my side without success; the names of the disks are weird: [image] Also, if possible, I'd like to have the free space of my dataset/pool (to see how my storage usage evolves over time).

Thanks 👍

mandrepont commented 11 months ago

I had started working on the disk temp side. From what I can tell, the weird name is the disk serial number. Here is a mapping I think can be used:


################################################
# Smart log mapping
################################################

- match: "truenas.*.smart_log_smart.disktemp.*.*"
  name: "disk_temp"
  labels:
    job: "truenas"
    instance: "${1}"
    disk_serial: "${2}"
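
To make the label references in the mapping above concrete, here is a minimal sketch of how the glob captures end up as labels. It is a simplified emulation and not the exporter's actual code; it assumes each `*` matches exactly one dot-separated component. The sample metric name, including the host `nas01`, the serial `WD-ABC123` and the trailing `temperature` component, is made up for illustration.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Turn a dot-separated glob into a regex; each '*' captures one component."""
    parts = [r"([^.]+)" if part == "*" else re.escape(part)
             for part in pattern.split(".")]
    return re.compile(r"\.".join(parts))


def apply_mapping(metric: str, pattern: str, name: str, labels: dict):
    """Return (name, labels) with ${n} references expanded, or None if no match."""
    match = glob_to_regex(pattern).fullmatch(metric)
    if match is None:
        return None

    def expand(template: str) -> str:
        # Replace every ${n} with the n-th captured component.
        return re.sub(r"\$\{(\d+)\}",
                      lambda ref: match.group(int(ref.group(1))),
                      template)

    return name, {key: expand(value) for key, value in labels.items()}


# Hypothetical graphite metric name, not taken from real TrueNAS output.
sample = "truenas.nas01.smart_log_smart.disktemp.WD-ABC123.temperature"
result = apply_mapping(
    sample,
    "truenas.*.smart_log_smart.disktemp.*.*",
    "disk_temp",
    {"job": "truenas", "instance": "${1}", "disk_serial": "${2}"},
)
print(result)
```

Note that the pattern has three wildcards but only `${1}` and `${2}` are used, so the last component is matched and then dropped.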

Supporterino commented 11 months ago

Hey there, I was looking for disk usage and temp myself. I haven't checked out the smart metrics yet; I need to pull a new set of raw metrics since my test sample was pretty small. Thanks for your findings already. Today was pretty busy for me, but I will get back to working on it tomorrow, take a closer look at your leads and figure something out. Hopefully with good metrics and some graphs on the dashboard.

Supporterino commented 11 months ago

BTW any PRs are totally welcome if you figured it out :)

mandrepont commented 11 months ago

I have been looking through all of the lines I got and don't see any replacement for dataset usage metrics. That is a disappointment if true. It has been helpful to run the exporter in debug mode for a day and search via Loki and Grafana. I'll keep looking, but I don't think it's included in this release.

Supporterino commented 11 months ago

While converting all the lines I looked a lot at the netdata documentation, and netdata is capable of reporting the usage state of ZFS pools. I got word from iXsystems via Reddit that they are looking forward to contributions regarding metrics and exporting them in different formats. So once I am happy with this project and with converting the graphite metrics, I am going to take a look at the middleware code of TrueNAS, since I am also familiar with Python.

mandrepont commented 11 months ago

Looking further into the iXsystems side of things, it looks like we can collect disk space by mount path.

I think it would require adding a new plugin to https://github.com/truenas/middleware/blob/stable/cobia/src/middlewared/middlewared/etc_files/netdata/netdata.conf.mako

This one in particular I believe. https://learn.netdata.cloud/docs/data-collection/linux-systems/disk-space
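
As a rough illustration of the "disk space by mount path" idea, independent of netdata, here is a minimal sketch that reads usage for one mount path and formats it as graphite plaintext lines (`path value timestamp`). It does not touch the TrueNAS netdata configuration and is not how the diskspace plugin works internally. The metric prefix `truenas.nas01.diskspace.root` and the function name are made up for the example.

```python
import shutil
import time


def graphite_lines(prefix: str, mount_path: str, now=None) -> list:
    """Format total/used/free bytes of a mount path as graphite plaintext lines."""
    usage = shutil.disk_usage(mount_path)  # named tuple: total, used, free (bytes)
    timestamp = int(time.time() if now is None else now)
    return [
        f"{prefix}.{field} {getattr(usage, field)} {timestamp}"
        for field in ("total", "used", "free")
    ]


# Hypothetical prefix; "/" is used only because it exists on any Linux system.
for line in graphite_lines("truenas.nas01.diskspace.root", "/"):
    print(line)
```

On a real system the path would be a dataset mount point under /mnt, and the lines would still have to be sent to the graphite exporter by some other means.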

SeiyaGame commented 11 months ago

Looking further into the iXsystems side of things, it looks like we can collect disk space by mount path.

I think it would require adding a new plugin to https://github.com/truenas/middleware/blob/stable/cobia/src/middlewared/middlewared/etc_files/netdata/netdata.conf.mako

This one in particular I believe. https://learn.netdata.cloud/docs/data-collection/linux-systems/disk-space

I've already tried this on my own without success. The documentation is poor; nothing is explained well.

I modified the /etc/netdata/netdata.conf configuration to activate the plugin:

...
[plugins]
        proc = yes
+       diskspace = yes
        cgroups = no
        tc = no
        idlejitter = no
        perf = no
        apps = no
        nfacct = no
        netdata monitoring = no # We want to disable netdata's agent stats
...

I've tried adding each of these section headers to the end of the file, but none of them works, and I don't know how to do it:

[plugin:proc:diskspace:/mnt/*]
#or
[plugin:proc:diskspace:/]
#or
[plugin:proc:diskspace:/mnt]

Supporterino commented 11 months ago

The adjustment of the TrueNAS Netdata and adding new exporters will take place in the new year for me since I am not with my machines and not willing to tinker on them remotely. But thanks for all your input.

The mapping for S.M.A.R.T data and other stuff I find will come soon (tomorrow probably)

Supporterino commented 11 months ago

@mandrepont I found out that I don't even have disk temp in TrueNAS; it must have been that way since the upgrade to 23.10. I'm going to figure that out first, since that is the reason I missed the metrics: I don't have them :D I'll keep you guys posted.

mandrepont commented 10 months ago

Weird, I was seeing them in my logs and now I am not... On TrueNAS Core I got them without even running the smart service, though here it seems to be tied to that service completely.

Supporterino commented 10 months ago

Found an ongoing issue on TrueNAS SCALE: smart metrics are somewhat broken, and so far there is no fix. Since I can't yet find the direct relation to the netdata metrics, I am putting this issue on hold. Once I get smart metrics back on my machine I will add them.

Supporterino commented 8 months ago

Temperatures are implemented in https://github.com/Supporterino/truenas-graphite-to-prometheus/releases/tag/v1.1.0 but only with their serial at the moment. I'm going to try to map them to their disk names later. Usage is still not exported by TrueNAS.
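
For the serial-to-disk-name idea, one possible approach (an assumption, not what the release does) is to build a lookup table from `lsblk` JSON output on the TrueNAS host. A minimal sketch; the function names and the sample JSON are made up for illustration.

```python
import json
import subprocess


def serial_to_name(lsblk_json: str) -> dict:
    """Build a {serial: device name} lookup from `lsblk --json` output."""
    devices = json.loads(lsblk_json).get("blockdevices", [])
    # Skip devices without a serial (e.g. zvols or other virtual devices).
    return {dev["serial"]: dev["name"] for dev in devices if dev.get("serial")}


def read_lsblk() -> str:
    """Run lsblk on the local machine; only works where lsblk is installed."""
    completed = subprocess.run(
        ["lsblk", "--json", "--nodeps", "--output", "NAME,SERIAL"],
        check=True, capture_output=True, text=True,
    )
    return completed.stdout


# Made-up sample in the shape lsblk prints, so the sketch runs anywhere.
sample = ('{"blockdevices": [{"name": "sda", "serial": "WD-ABC123"}, '
          '{"name": "zd0", "serial": null}]}')
print(serial_to_name(sample))
```

The resulting table would still have to be joined with the temperature metric somewhere, for example as a separate info-style metric, which is outside this sketch.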

brantje commented 8 months ago

Query to show the disks in a graph: {__name__=~"truenas_truenas_smart_log_smart_disktemp.+"}

Edit: Would it be possible to have the drive serials as a label? E.g.: truenas_truenas_smart_log_smart_disktemp{sn="xxxxx"}

Supporterino commented 8 months ago

You are on an outdated version. The metric is now named disk_temperature, with the serial in the serial label.

Supporterino commented 8 months ago

Closing the issue now, since disk usage isn't reported by TrueNAS anymore.