munin-monitoring / munin

Main repository for munin master / node / plugins
http://munin-monitoring.org

[hddtemp_smartctl|smart_] use names from `disk-by-id` instead of `sd*` #1472

Open quotengrote opened 2 years ago

quotengrote commented 2 years ago

Hi,

Is it possible to use names from /dev/disk/by-[ui]id instead of the ever-changing names like /dev/sdb?

At the moment those names can change after a reboot and/or a hardware change. After that, the historic data would belong to another device.

It is a problem in both smart_ and hddtemp_smartctl.

Greetings mg

sumpfralle commented 2 years ago

Using stable field identifiers is always a good idea, thus I would appreciate that.

If this affects only contrib plugins, then we should boldly just break the fieldnames once (a core plugin may deserve more thoughts).

Could you prepare a pull request for that?

quotengrote commented 2 years ago

I can, but I just discovered that both plugins come from master instead of contrib.

Is that a problem?

dbalnaves commented 2 years ago

I can understand the reasoning of using uuid as it does not change but I think that any persistent source from /dev/disk would be fine. Personally, I'd like a configurable option to use by-path as a source:

/dev/disk/by-path/ip-192.168.1.15:3260-iscsi-iqn.2000-01.com.synology:bastion-boot-lun-1-part1
/dev/disk/by-path/ip-192.168.1.15:3260-iscsi-iqn.2000-01.com.synology:bastion-boot-lun-1
/dev/disk/by-path/ip-192.168.1.15:3260-iscsi-iqn.2000-01.com.synology:monitor-boot-lun-1-part1
/dev/disk/by-path/ip-192.168.1.15:3260-iscsi-iqn.2000-01.com.synology:monitor-boot-lun-1-part5
/dev/disk/by-path/ip-192.168.1.15:3260-iscsi-iqn.2000-01.com.synology:monitor-boot-lun-1-part2
/dev/disk/by-path/ip-192.168.1.15:3260-iscsi-iqn.2000-01.com.synology:monitor-boot-lun-1
/dev/disk/by-path/ip-192.168.1.15:3260-iscsi-iqn.2000-01.com.synology:mail-boot-lun-2-part1
/dev/disk/by-path/ip-192.168.1.15:3260-iscsi-iqn.2000-01.com.synology:mail-boot-lun-2-part5
/dev/disk/by-path/ip-192.168.1.15:3260-iscsi-iqn.2000-01.com.synology:mail-boot-lun-2-part2
/dev/disk/by-path/ip-192.168.1.15:3260-iscsi-iqn.2000-01.com.synology:mail-boot-lun-2
/dev/disk/by-path/pci-0000:00:1f.2-ata-1
/dev/disk/by-path/pci-0000:00:1f.5-ata-1-part1
/dev/disk/by-path/pci-0000:00:1f.5-ata-1
/dev/disk/by-path/pci-0000:00:1f.2-ata-2-part1
/dev/disk/by-path/pci-0000:00:1f.2-ata-2

In turn, this could then be used to create nice default graph names:

bastion-boot-lun-1
monitor-boot-lun-1-part1
monitor-boot-lun-1-part5
monitor-boot-lun-1-part2
monitor-boot-lun-1
mail-boot-lun-2-part1
mail-boot-lun-2-part5
mail-boot-lun-2-part2
mail-boot-lun-2
1f.2-ata-1
1f.5-ata-1-part1
1f.5-ata-1
1f.2-ata-2-part1
1f.2-ata-2
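
The mapping from by-path names to short graph names shown above could be sketched roughly like this. It is only an illustration: `short_label` is a hypothetical helper name, and the rule (drop everything up to and including the last colon) is inferred from the two lists above rather than taken from any existing plugin.

```shell
#!/bin/sh
# Hypothetical helper: derive a short graph label from a /dev/disk/by-path
# entry by removing everything up to and including the last colon.
short_label() {
    printf '%s\n' "${1##*:}"
}

short_label "ip-192.168.1.15:3260-iscsi-iqn.2000-01.com.synology:mail-boot-lun-2"
short_label "pci-0000:00:1f.2-ata-1"
```

Such labels are not guaranteed to be unique across controllers, so they are probably better suited as human-readable labels than as field names.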

quotengrote commented 2 years ago

You are right, it doesn't have to be by-id, as long as it is persistent.

niclan commented 2 years ago

Hi,

This goal is very praiseworthy. There was a brief discussion about it many years ago and no one wanted to write a patch, as I recall.

/dev/disk/by-path seems not to be very persistent if a disk is moved. The UUID follows the disk.

It would seem that the preference must be configurable and that the default should stay the same as it is today.

/dev/disk/by-path# ls
pci-0000:01:00.1-ata-1        pci-0000:05:00.0-ata-10        pci-0000:08:00.2-ata-3
pci-0000:01:00.1-ata-2        pci-0000:05:00.0-ata-12        pci-0000:08:00.2-ata-4
pci-0000:01:00.1-ata-2-part1  pci-0000:05:00.0-ata-12-part1  pci-0000:08:00.2-ata-4-part1
pci-0000:01:00.1-ata-5        pci-0000:05:00.0-ata-2         pci-0000:08:00.2-ata-4-part2
pci-0000:01:00.1-ata-6        pci-0000:05:00.0-ata-5         pci-0000:08:00.2-ata-4-part3
pci-0000:05:00.0-ata-1        pci-0000:05:00.0-ata-7

by-uuid has another problem, though: so far we have tried to report only on whole disks (and get tripped up by LVM), and UUIDs make it much harder to see what is a disk and what is a partition unless the symlinks are resolved.

/dev/disk/by-uuid# ls
05ad69b3-63ab-4c54-a3b4-49cb2f268a7b  b0affcaa-7670-49a5-908f-6b1584cc9cd9
23f551dc-ee24-4e73-9a82-f25f6114eea4  b4916ac4-fb0f-439a-b7ed-57d6715d070e
4158a356-6dca-44a5-8616-cd976f221505  b8a23980-0b49-4487-9a29-32263d03058c
5c12bfc3-b993-4f08-83c4-d5dfc57d3f2c  c0cc9437-c21f-43f0-b1db-cdd67def7c62
656d2cc7-40bd-4e5b-ab5c-201e2030b758  c547cb02-b830-4662-898b-254b97bfbf05
6f54f3e9-5dc0-4dac-9cbc-7535a529faef  c7215a6f-fea0-48f9-8310-99d237cd9515
7d9d00eb-30f7-462a-8a69-1766079170e7  cae08812-0cdb-4dad-a97a-9c83b21e9b68
7DD7-4A66                             e7d55cb1-0dd4-4a1e-83bc-9b3f2e66ace3
9d848545-71ee-4493-b315-a8c8dc96d931  f35e5394-d934-489c-9c89-35f1c54b9e7a

By-id on my server has names like this:

dm-name-misc--slidescan-slidescan
dm-uuid-LVM-9SzWzr20wHxrKAgwyeaPpQD4YZ6WzcaKynjfJod47J9Zd0vQjrpgYFnBPrmc84nX
dm-uuid-LVM-FhZ0xcITC1swLGCSfl1bljFLN2841AbepgAPtybTLWgtojwVkgB2UdllwdVGz6IX
lvm-pv-uuid-3iBUyL-UvsK-yoUd-x9YD-3JXL-HdIe-88Yo4I
lvm-pv-uuid-BNLxDf-c9TB-CKxz-Rd7m-Uh3l-7ATC-67cE8A
lvm-pv-uuid-CbUYkM-9WY0-vNem-zvYj-QXgK-P4J8-7HOsbQ
scsi-0ATA_ST8000AS0002-1NA_Z840P4QR
scsi-0ATA_ST8000AS0002-1NA_Z84122PP
scsi-0ATA_TOSHIBA_HDWG180_4130A0DWFAUG
scsi-0ATA_TOSHIBA_HDWG180_4130A0DWFAUG-part1
wwn-0x5000039ad8c80c74
wwn-0x5000039ad8c80c74-part1
wwn-0x5000039ad8c80c74-part2

Which makes it quite easy to make heuristics about which devices are disks, partitions, LVM and so on. And from what I can tell, the names identify disks in a way that is independent of hardware paths. ... In part, I see that the "scsi-0ATA"-like bits will have to be chopped off since they denote hardware paths. ... But I have to guess they're there because the manufacturer/model/serial, which makes up the rest, might not be all that reliable?
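
A heuristic of that kind might look like the sketch below. The function name `classify_by_id` and the category words are made up for illustration, and the patterns only cover the name shapes that appear in the listing above; other hosts may well show prefixes that end up as "unknown".

```shell
#!/bin/sh
# Hypothetical classifier for entries of /dev/disk/by-id, based only on the
# name shapes seen in this thread. The partition check must come first.
classify_by_id() {
    case "$1" in
        *-part[0-9]*)              echo partition ;;
        dm-name-*|dm-uuid-*)       echo device-mapper ;;
        lvm-pv-uuid-*)             echo lvm-pv ;;
        wwn-*|scsi-*|ata-*|nvme-*) echo disk ;;
        *)                         echo unknown ;;
    esac
}

classify_by_id "wwn-0x5000039ad8c80c74"
classify_by_id "scsi-0ATA_TOSHIBA_HDWG180_4130A0DWFAUG-part1"
```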

niclan commented 2 years ago

Hmm. Also notable is that on my server "by-uuid" only has 18 entries mostly pointing to "dm-*" devices. While "by-id" has 140 entries. It seems we'll miss quite a lot of disks by using "by-uuid" on random hosts.

quotengrote commented 2 years ago

Is there a need to distinguish between disks, LVM, and such? An LVM volume or a partition doesn't have a temperature or SMART values. Would it not be enough to just get the disks? Or did I miss something here?

sumpfralle commented 2 years ago

> Hmm. Also notable is that on my server "by-uuid" only has 18 entries mostly pointing to "dm-*" devices. While "by-id" has 140 entries. It seems we'll miss quite a lot of disks by using "by-uuid" on random hosts.

Here on my laptop I see that the following disks are listed only in /dev/disk/by-id (and missing in /dev/disk/by-uuid):

The only disks I noticed to be missing in /dev/disk/by-id are loop devices.

Thus I guess we could use the content of by-uuid for the complete list of relevant devices (the munin field name can be based on their UUID). The tricky part would probably be to determine a human-readable name for these devices (as a munin field label).
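
Turning a stable identifier into a munin field name could be sketched as follows. This assumes the usual munin rule that field names consist of letters, digits and underscores and do not start with a digit; `disk_fieldname` is a hypothetical name, not an existing munin helper.

```shell
#!/bin/sh
# Hypothetical helper: make an identifier safe for use as a munin field name.
# First replace a leading character that is neither a letter nor "_",
# then replace every remaining character outside [A-Za-z0-9_].
disk_fieldname() {
    printf '%s\n' "$1" | sed -e 's/^[^A-Za-z_]/_/' -e 's/[^A-Za-z0-9_]/_/g'
}

disk_fieldname "wwn-0x5000039ad8c80c74"
```

The original identifier can still be used unchanged as the field label, so the graph stays readable while the field name stays stable.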

niclan commented 2 years ago

I should note that the by-uuid directory on my server points to two partitions on its boot disk and the rest point to dm devices that are LVM logical volumes. This is because I tend to put the LVM PV label on the main device of the disk, i.e. not do any partitioning, so the disk itself has no UUID. Except for the boot disk.

dbalnaves commented 2 years ago

I think I overlooked this was specifically for smartctl; what about something like this?

$ for i in `ls -d /sys/devices/pci*/*/*/*/*/*/model | sed -e 's/model$//'`; do echo `ls $i/block` `cat $i/model`; done
sda WDC WD20EARX-00P
sr0 DVD-RAM GH80N
sdb WDC WD20EARS-00M
sdc SAMSUNG HD103SJ

niclan commented 2 years ago

To me it looks like "by-id" is the only place to find disk devices reliably in /proc since not all physical devices will have a uuid.

My laptop:

# for i in `ls -d /sys/devices/pci*/*/*/*/*/*/model | sed -e 's/model$//'`; do echo `ls $i/block` `cat $i/model`; done
ls: cannot access '/sys/devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.0/media0//block': No such file or directory
Integrated Camera: Integrated C

My server:

# for i in `ls -d /sys/devices/pci*/*/*/*/*/*/model | sed -e 's/model$//'`; do echo `ls $i/block` `cat $i/model`; done
ls: cannot access '/sys/devices/pci0000:00/0000:00:03.1/0000:06:00.1/driver/module/parameters//block': No such file or directory
(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null)
ls: cannot access '/sys/devices/pci0000:00/0000:00:08.1/0000:08:00.3/driver/module/parameters//block': No such file or directory
(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null)
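
The errors above come from the wildcard also matching non-disk "model" files. A narrower sketch, assuming that whole-disk block devices expose a device/model attribute below /sys/block (true on common SATA/SCSI setups, but not verified for every kernel or bus), could look like this. The function name and the optional root argument are hypothetical; the argument exists only so the function can be tried against a fake tree.

```shell
#!/bin/sh
# Hypothetical helper: print "<kernel name> <model>" for every block device
# that has a readable device/model attribute. $1 is an optional sysfs root.
list_block_models() {
    root="${1:-/sys}"
    for model_file in "$root"/block/*/device/model; do
        # Skip the unexpanded pattern (no match) and unreadable files.
        [ -r "$model_file" ] || continue
        dev=${model_file#"$root"/block/}
        dev=${dev%%/*}
        printf '%s %s\n' "$dev" "$(cat "$model_file")"
    done
}
```

Calling `list_block_models` without arguments reads the live /sys tree. Optical drives such as sr0 would still be listed and would need a separate filter.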

There is also smartctl --scan (but doing an actual SCSI bus scan on an old-style SCSI bus was not terribly fast, could lock up if a device was faulty, and was very slow if there was a scanner on the bus) and then smartctl -i $device. For example:

# smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
/dev/sdd -d scsi # /dev/sdd, SCSI device
/dev/sde -d scsi # /dev/sde, SCSI device
/dev/sdf -d scsi # /dev/sdf, SCSI device
/dev/sdg -d scsi # /dev/sdg, SCSI device
/dev/sdh -d scsi # /dev/sdh, SCSI device
/dev/sdi -d scsi # /dev/sdi, SCSI device
/dev/sdj -d scsi # /dev/sdj, SCSI device
/dev/sdk -d scsi # /dev/sdk, SCSI device
#  smartctl -i /dev/sdj
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-110-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD80EFZX-68UW8N0
Serial Number:    R6GWJUHY
LU WWN Device Id: 5 000cca 263cc852b
Firmware Version: 83.H0A83
User Capacity:    8 001 563 222 016 bytes [8,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue May 31 09:36:20 2022 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

All the (rotating) disks in my server have the "LU WWN Device Id" (a.k.a. "World Wide Name" from SAN) which should be unique like MAC addresses on ethernet. But the NVME on my laptop:

# smartctl -i /dev/nvme0
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-33-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       SAMSUNG MZVLW256HEHP-000L7
Serial Number:                      S35ENX0J441783
Firmware Version:                   4L7QCXB7
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 256 060 514 304 [256 GB]
Unallocated NVM Capacity:           0
Controller ID:                      2
NVMe Version:                       1.2
Number of Namespaces:               1
Namespace 1 Size/Capacity:          256 060 514 304 [256 GB]
Namespace 1 Utilization:            199 481 708 544 [199 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 b471b73e77
Local Time is:                      Tue May 31 09:37:02 2022 CEST
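
The "LU WWN Device Id" line from the first output can be turned into the wwn-0x... form used under /dev/disk/by-id by removing the spaces. A minimal sketch, with `wwn_from_smartctl` as a hypothetical function name, reading `smartctl -i` output on standard input:

```shell
#!/bin/sh
# Hypothetical helper: read "smartctl -i" output on stdin and print the
# by-id style WWN name, or nothing if no "LU WWN Device Id" line exists.
wwn_from_smartctl() {
    awk -F': *' '/^LU WWN Device Id:/ { gsub(/ /, "", $2); print "wwn-0x" $2 }'
}
```

Fed with the /dev/sdj output above, this prints wwn-0x5000cca263cc852b. For the NVMe output it prints nothing, since that output has no such line.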

dbalnaves commented 2 years ago

I also have PV straight on the disk (without a partition), and I've noticed they don't show in by-uuid either:

# ls /dev/disk/by-uuid/ -la
total 0
drwxr-xr-x 2 root root 180 May 24 17:02 .
drwxr-xr-x 7 root root 140 May 24 17:02 ..
lrwxrwxrwx 1 root root  10 May 24 17:02 284a1696-2fe9-4516-bda4-c15abf434492 -> ../../dm-1
lrwxrwxrwx 1 root root  10 May 24 17:02 2c66a4e9-8285-4f14-bd0a-f95116b90733 -> ../../dm-4
lrwxrwxrwx 1 root root  10 May 24 17:02 476f3a27-e929-4ed8-904d-27e12508471f -> ../../dm-3
lrwxrwxrwx 1 root root  10 May 24 17:02 67221888-1cfd-4973-ad81-830f2e3fd95f -> ../../sde1
lrwxrwxrwx 1 root root  10 May 24 17:02 975ecaa1-e149-4282-aa3d-95f6843cc408 -> ../../dm-0
lrwxrwxrwx 1 root root  10 May 24 17:02 b8201c4c-121b-455b-8ab7-62577bb8db10 -> ../../dm-2
lrwxrwxrwx 1 root root  10 May 24 17:02 e76873b1-af41-4afe-8f88-37a530828ff9 -> ../../sdd1

However, I have noticed they do show in by-id:

# ls /dev/disk/by-id/ -la | grep -v [0-9]$
drwxr-xr-x 2 root root 1260 May 24 17:02 .
drwxr-xr-x 7 root root  140 May 24 17:02 ..
lrwxrwxrwx 1 root root    9 May 24 18:01 ata-SAMSUNG_HD103SJ_S26BJ9AB100242 -> ../../sdc
lrwxrwxrwx 1 root root    9 May 24 17:02 ata-WDC_WD20EARS-00MVWB0_WD-WMAZA3883998 -> ../../sdb
lrwxrwxrwx 1 root root    9 May 24 17:02 ata-WDC_WD20EARX-00PASB0_WD-WCAZAD862886 -> ../../sda
lrwxrwxrwx 1 root root    9 May 24 17:02 lvm-pv-uuid-qJvNPO-tNY1-lMnP-rIyS-8fWg-KHzB-iPCo5r -> ../../sda
lrwxrwxrwx 1 root root    9 May 24 17:02 scsi-36001405330ce88edc9b7d41dcdb8aedf -> ../../sdf
lrwxrwxrwx 1 root root    9 May 24 17:02 scsi-360014053ceaa78fdae54d31dfd87a1d1 -> ../../sde
lrwxrwxrwx 1 root root    9 May 24 17:02 scsi-36001405aae38decd86b4d3dc2dacd4d5 -> ../../sdd
lrwxrwxrwx 1 root root    9 May 24 17:02 wwn-0x50014ee206a1d229 -> ../../sda
lrwxrwxrwx 1 root root    9 May 24 17:02 wwn-0x50014ee6ab828a38 -> ../../sdb
lrwxrwxrwx 1 root root    9 May 24 18:01 wwn-0x50024e92042a11ad -> ../../sdc
lrwxrwxrwx 1 root root    9 May 24 17:02 wwn-0x6001405330ce88edc9b7d41dcdb8aedf -> ../../sdf
lrwxrwxrwx 1 root root    9 May 24 17:02 wwn-0x60014053ceaa78fdae54d31dfd87a1d1 -> ../../sde
lrwxrwxrwx 1 root root    9 May 24 17:02 wwn-0x6001405aae38decd86b4d3dc2dacd4d5 -> ../../sdd

The thing I don't like is that iSCSI shows up as scsi, which can be confused with physical disks. I also think that pointing to a partition is conceptually undesirable.

niclan commented 2 years ago

The problem with iSCSI (and possibly ATA-over-Ethernet) we would have no matter what id scheme is selected, I think. How does smartctl react to iSCSI devices? (I don't have iSCSI anywhere, neither at work nor at home.)

In by-id on my server I see this:

scsi-SATA_WDC_WD80EFZX-68U_R6GWJUHY
scsi-SATA_WDC_WD80EFZX-68U_R6GWJUHY-part1
wwn-0x5000039ad8c80c74
wwn-0x5000039ad8c80c74-part1

so heuristics to eliminate partitions seem rather easy.

dbalnaves commented 2 years ago

I use iSCSI to back resilient libvirt disks (which I've encountered professionally with NetApp and virtual clusters). While I do not endorse Synology in any way (DSM is reckless and terrible, I'm a sucker for their HW), I'll help answer that:

# smartctl -i /dev/disk/by-path/ip-192.168.1.15:3260-iscsi-iqn.2000-01.com.synology:bastion-boot-lun-1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-96-lowlatency] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SYNOLOGY
Product:              iSCSI Storage
Revision:             4.0
Compliance:           SPC-3
User Capacity:        107,374,182,400 bytes [107 GB]
Logical block size:   512 bytes
LU is thin provisioned, LBPRZ=0
Logical Unit id:      0x6001405330ce88edc9b7d41dcdb8aedf
Serial number:        330ce88e-c9b7-41dc-b8ae-f282ec11a38e
Device type:          disk
Transport protocol:   iSCSI
Local Time is:        Tue May 31 22:49:00 2022 AEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

I'd also be very curious if anyone has ATAoE output to share!

I think the aim here should be to target local physical disks - most aggregated infrastructure has its own methods for handling SMART. I think /sys could work, and would prove authoritative, if it can be standardized across kernel versions and architectures.

As a side note on LVM, not all SATA ports are equal. To track down performance problems, I use this in my lvm.conf:

global_filter = [ "a|pci-0000:00:1f|", "r|.*|" ]

This in turn produces elegant output like:

# pvs
  PV                                             VG      Fmt  Attr PSize    PFree
  /dev/disk/by-path/pci-0000:00:1f.2-ata-1       vmdisks lvm2 a--    <1.82t 1.35t
  /dev/disk/by-path/pci-0000:00:1f.2-ata-2-part1 vmhost  lvm2 a--    <1.82t    0
  /dev/disk/by-path/pci-0000:00:1f.5-ata-1-part1 vmhost  lvm2 a--  <931.51g    0

LVM already knows to target /dev/disk, but in line with my opinion I should have made it by-id to be more useful with the case off.

sumpfralle commented 2 years ago

What a nice and lively discussion about plugin details :)

@niclan:

> All the (rotating) disks in my server have the "LU WWN Device Id" (a.k.a. "World Wide Name" from SAN) which should be unique like MAC addresses on ethernet. But the NVME on my laptop: [..]

I never stumbled upon the WWN, but it feels like a suitable approach for this plugin to me. Could you check whether your NVME (missing the WWN in smartctl's output) is listed below /dev/disk/by-id with a wwn- prefix?

If yes, then maybe something in line with the following filter would be a good start for the plugin?

ls /dev/disk/by-id/ | grep "^wwn-0x" | grep -v -- "-part[0-9]*$"
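
Building on that filter, a plugin would also need the kernel device behind each name. The following sketch assumes `readlink -f` is available (GNU coreutils and busybox provide it); `list_wwn_disks` and its optional directory argument are hypothetical and only exist so the function can be tried outside /dev.

```shell
#!/bin/sh
# Hypothetical helper: print "<wwn name> <kernel device>" for every
# whole-disk wwn-0x* symlink in the given directory.
list_wwn_disks() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/wwn-0x*; do
        # Skip the unexpanded pattern and dangling symlinks.
        [ -e "$link" ] || continue
        name=${link##*/}
        case "$name" in
            *-part[0-9]*) continue ;;   # partitions are not wanted
        esac
        printf '%s %s\n' "$name" "$(basename "$(readlink -f "$link")")"
    done
}
```

The stable name would serve as the basis for the field name, while the resolved device (for example sda) is what would be passed to smartctl at fetch time.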

dbalnaves commented 2 years ago

I guess the only reservation I have about using the WWN from by-id is that it still picks up iSCSI devices. Maybe in some cases with iSCSI (and possibly ATAoE) this could pay off, but in the majority of cases, where LUNs are an aggregation of disks, I can't see SMART providing anything useful.

niclan commented 2 years ago

All of these will pick up iSCSI devices except possibly the by-path scheme, which might have node names like ip-192.168.1.15 as shown above. But the path identifies the access path, not the device. I don't have any iSCSI experience, so I'll be tentative here: iSCSI might also be uncovered by a smartctl check, which above shows the device as Product: iSCSI Storage and Transport protocol: iSCSI. Also interesting is the Temperature Warning: Disabled or Not Supported line.
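
Such a smartctl check could be sketched as below. It relies only on the "Transport protocol" line seen in the Synology output above, so it is a tentative heuristic: other iSCSI stacks may not print that line at all. `is_iscsi` is a hypothetical function name.

```shell
#!/bin/sh
# Hypothetical helper: read "smartctl -i" output on stdin and succeed
# (exit status 0) if it reports iSCSI as the transport protocol.
is_iscsi() {
    grep -q '^Transport protocol:[[:space:]]*iSCSI'
}
```

A plugin could then run something like `smartctl -i "$dev" | is_iscsi` and skip the device on success.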

The plugin can persist some information on each device node with the provided module SDK in perl in Munin::Plugin (not in the shell API I see, and munin does not provide python or ruby plugin SDKs, though they're not complex) to consistently eliminate disks it thinks are iSCSI or have temperature warnings disabled or unsupported.
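
Even without an SDK, a shell plugin could keep a simple list of excluded devices in a state file. The sketch below assumes that munin-node exports a state directory in the MUNIN_PLUGSTATE environment variable and falls back to /tmp otherwise; the file name and all function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical state handling: one excluded device identifier per line.
excluded_state_file() {
    printf '%s/smart_wwn.excluded\n' "${MUNIN_PLUGSTATE:-/tmp}"
}

# Succeeds if the identifier in $1 has been recorded as excluded.
is_excluded() {
    [ -f "$(excluded_state_file)" ] && grep -qxF -- "$1" "$(excluded_state_file)"
}

# Records the identifier in $1 unless it is already present.
remember_excluded() {
    is_excluded "$1" || printf '%s\n' "$1" >> "$(excluded_state_file)"
}
```

This way the comparatively slow smartctl check only has to run for devices that have not been seen before.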

I see that NVME is given a wwn device node. And also all the rotating disks on my large and small server have wwn device nodes. My oldest disk is a 500GB disk which must be around 12 years old now.

But I think this scheme should be opt-in, with the present scheme kept as the default. Or write a new plugin with a new basename which deprecates the old one. That way we can keep the old plugin for old installs but make it not auto-configure on new installs. Then the new plugin would gradually replace the old one.

I've dreamt of providing more disk information via the .extinfo field, if smartctl -i is run on all newly discovered devices some of that might be easily persisted by the plugin and provided in .extinfo.
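
As a rough illustration of that idea, the model and serial number lines from `smartctl -i` could be folded into an .extinfo line during config. The function name `extinfo_line` and the output wording are assumptions; only the smartctl field labels are taken from the outputs shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read "smartctl -i" output on stdin and print a munin
# "<field>.extinfo ..." line for the field name given as $1.
extinfo_line() {
    awk -F': *' -v field="$1" '
        /^(Device Model|Model Number|Product):/ { model = $2 }
        /^Serial [Nn]umber:/                    { serial = $2 }
        END {
            if (model != "")
                printf "%s.extinfo %s (serial %s)\n", field, model, serial
        }
    '
}
```

For the /dev/sdj output earlier in the thread, called with the field name wwn_0x5000cca263cc852b, this yields "wwn_0x5000cca263cc852b.extinfo WDC WD80EFZX-68UW8N0 (serial R6GWJUHY)".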

I've asked a friend for smartctl -i for some iSCSI devices he has.

niclan commented 2 years ago

Here is smartctl output for devices from a different iSCSI stack:

$ sudo smartctl -i /dev/sdk
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-693.11.6.el7.x86_64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               DGC
Product:              VRAID
Revision:             5003
Compliance:           SPC-4
User Capacity:        53,687,091,200 bytes [53.6 GB]
Logical block size:   512 bytes
LU is thin provisioned, LBPRZ=1
Logical Unit id:      0x6006016064414800bae22f5e762cdf11
Serial number:        CKM00180600562
Device type:          disk
Local Time is:        Thu Jun  2 11:06:18 2022 CEST
SMART support is:     Unavailable - device lacks SMART capability.

$ sudo smartctl -i /dev/sdz
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-693.11.6.el7.x86_64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               DGC
Product:              VRAID
Revision:             5003
Compliance:           SPC-4
User Capacity:        53,687,091,200 bytes [53.6 GB]
Logical block size:   512 bytes
LU is thin provisioned, LBPRZ=1
Logical Unit id:      0x6006016064414800c8862a5e09026916
Serial number:        CKM00180600562
Device type:          disk
Local Time is:        Thu Jun  2 11:06:23 2022 CEST
SMART support is:     Unavailable - device lacks SMART capability.

Not easy to see that this is iSCSI but the smart support line excludes the devices neatly.
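
Excluding devices by that line could be sketched like this. `has_smart` is a hypothetical name, and the pattern is based only on the "SMART support is:" lines shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read "smartctl -i" output on stdin and succeed
# (exit status 0) only if SMART support is reported as available.
has_smart() {
    grep -q '^SMART support is:[[:space:]]*Available'
}
```

With the two DGC VRAID outputs above the check fails, so those devices would be skipped, while the physical disks shown earlier in the thread pass.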

quotengrote commented 2 years ago

What are the next steps? How should we proceed?

niclan commented 2 years ago

I hope someone-not-me will write a new version. I can test it on all my hosts and some at work and I can debug and write perl, python, ruby, shell and probably manage debugging of a few other languages.

quotengrote commented 2 years ago

I can try to write a patch.

quotengrote commented 2 years ago

First try, without testing...

https://github.com/munin-monitoring/munin/compare/master...quotengrote:not_sd

niclan commented 2 years ago

Grep that way is an external command and it's not connected to the readdir call. This should work better:

@drivesSCSI = grep(!/^\.$|^\.\.$|wwn|lvm|dm-|-part/, readdir SCSI);

but the shell code showed previously in https://github.com/munin-monitoring/munin/issues/1472#issuecomment-1143482986 is more like this:

@drivesSCSI = grep { /^wwn/ && !/-part/ } readdir SCSI;

But since these are main plugins and not contrib we need a soft way to preserve the old functionality. And introduce the new functionality in new installs. This can be done in steps:

  1. Mark the old plugins #%# family=manual. This way we can continue distributing the old plugins and preserve people's histories.
  2. Make new plugins with the new label functionality (and maybe some more new features), and mark them as auto. New installs will use these automatically. Upgraded installs that run the autoinstall process will also get them (in addition to the old ones).
  3. When munin 3 (or 2.higher) comes around, remove the old plugins.
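
The second step could start from a skeleton like the one below. It is a sketch only: the graph attributes and the `by_id_dir` override are assumptions, and the actual value fetching via smartctl is left out. The two magic marker comments follow the convention mentioned in step 1.

```shell
#!/bin/sh
# Hypothetical skeleton for a new plugin that auto-configures on new installs.
#%# family=auto
#%# capabilities=autoconf

# Print whole-disk wwn-* names; by_id_dir is a hypothetical override.
wwn_names() {
    for link in "${by_id_dir:-/dev/disk/by-id}"/wwn-0x*; do
        [ -e "$link" ] || continue
        case "${link##*/}" in *-part[0-9]*) continue ;; esac
        printf '%s\n' "${link##*/}"
    done
}

plugin_main() {
    case "$1" in
        autoconf)
            if [ -n "$(wwn_names)" ]; then
                echo yes
            else
                echo "no (no wwn-* entries found)"
            fi
            ;;
        config)
            echo "graph_title Disk temperature"
            echo "graph_vlabel degrees Celsius"
            echo "graph_category sensors"
            wwn_names | while read -r name; do
                field=$(printf '%s' "$name" | sed 's/[^A-Za-z0-9_]/_/g')
                echo "$field.label $name"
            done
            ;;
        *)
            # Fetching values (e.g. via smartctl) is intentionally omitted.
            ;;
    esac
}

plugin_main "${1:-}"
```

The old plugin would keep its name and only change its marker to family=manual, so existing symlinks and histories stay untouched.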

Above there was some lobbying for using by-path as well; this can quite easily be supported in a new plugin as an option.
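
Such an option could be a single environment variable set through the plugin configuration. In this sketch the variable name `id_scheme` and the function `disk_link_dir` are hypothetical; with munin's usual configuration mechanism the variable would be set with a line like "env.id_scheme by-path" in the plugin's configuration section.

```shell
#!/bin/sh
# Hypothetical option handling: choose the /dev/disk subdirectory to scan.
# Defaults to by-id; only by-id and by-path are accepted in this sketch.
disk_link_dir() {
    scheme="${id_scheme:-by-id}"
    case "$scheme" in
        by-id|by-path)
            printf '/dev/disk/%s\n' "$scheme"
            ;;
        *)
            echo "unsupported id_scheme: $scheme" >&2
            return 1
            ;;
    esac
}
```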

But I rant and dream and I'm not quite up to this myself right now.

niclan commented 2 years ago

That returns an array like this:

$VAR1 = [
          'wwn-0x5000cca3b7c60f48',
          'wwn-0x5000039ad8c80c74',
          'wwn-0x5000cca263cc852b',
          'wwn-0x50014ee2b2ed092c',
          'wwn-0x50014ee2b4b26da9',
          'wwn-0x50014ee25f5cbae4',
          'wwn-0x5000c500a367a3ef',
          'wwn-0x5000c500924e06ce',
          'wwn-0x5000039b38d17d62',
          'wwn-0x5000cca099c238d5',
          'wwn-0x5000039b38d177a0'
        ];

Jorilx commented 12 hours ago

@niclan Are there updates on this issue? Can I help?