miskcoo / ugreen_leds_controller

An LED controller of UGREEN DX4600 Pro NAS, compatible with UGREEN DXP4800/6800/8800 series.
152 stars 22 forks source link

Added SMART monitoring for disks #4

Closed yooooki closed 4 months ago

yooooki commented 4 months ago

Added a feature to detect the disk failure based on smartmon (smartctl -H). Turns off the led for empty disk slots, instead of turning red.

meyergru commented 4 months ago

There is a problem: I recently found that for my device (a DXP8800), the mapping of sda..sdh to the physical slots 1..8 is NOT 1:1, but sometimes random.

See my improved shell script in my fork on how the correct mapping can be done….

yooooki commented 4 months ago

@meyergru According to https://unix.stackexchange.com/questions/658998/how-the-h-c-t-l-numbers-read-from-in-lsscsi , H in HCTL for sata devices can change after reboot. It's also not clear whether 0:0:0:0 is the first or the 4-th or the last drive, and whether this is consistent for all UGreen devices (e.g. 600 series and 800 series, plus and non-plus) since they may use different pcie to sata adapter. A better solution might be mapping led to serial number of sata devices, but that would require install the disks once a time and manually configure them in the script, which is
The original script guarantees each led is mapped to different physical drives, and I think that's enough for casual use.

meyergru commented 4 months ago

No, as I said, it does not. It is actually misleading w/r to the Position of a defective Drive. As such, it is doing more harm than good.

Iff HCTL should be changing - and I cannot Imagine how it should - my approach would not work either, but using sda..sdh is VERIFIED to not work as it should, so the problem persists.

miskcoo commented 4 months ago

Hi @yooooki, I've also encountered the same issue today, and with the new script using HCTL, the disks are correctly mapped to the LEDs in my 4600Pro. Maybe it is better to use HCTL, which is now confirmed to work under 4600Pro and 8800Plus. For other models, I do not know if these I2C commands are compatible for controlling LEDs, so maybe it is too early to consider the HCTL changing issue.

BTW, for this PR, I think you should move the counter cnt outside the for loop, and it would be good to remove the two echos of disk failure messages in the loop as their outputs will flood the log.

yooooki commented 4 months ago

Since you both confirmed HCTL is more stable, I'll follow it. What I can find online is that every one says /dev/sdX is almost as unreliable as HCTL, and the "real" solution would be by-id or by-uuid, which I think is a little bit anonying for casual users as multiple reboots must be performed if those id are not listed on the package.