Closed b-reich closed 5 months ago
Can you check that whether disk6
appears in /sys/class/leds
? If not, the issue is in the module.
Regarding the script, it is unfortunate to hear that the hctl mapping does not work on your model. For the disk6
issue, it may be because the script detects and creates the disk map at startup, so it can only handle disk removal. I may rewrite it to better handle the disk insertion event when I find some time.
If the issue lies in the problem mentioned above, a workaround could be adding a udev rule to restart the systemd service / the script when a disk removal or insertion event occurs.
maybe mapping by/dev/disk/by-path/
is a thing?
@miskcoo the kernel modules itself works.
Disk6 doenst work only with serial
and hctl
is totaly random everytime
maybe mapping by
/dev/disk/by-path/
is a thing?
That seems work and is equivalent to HTCL in my model. How about yours? Would it be stable? I guess using the alphabetical order of these paths would be a possible solution.
# /dev/disks/by-path
lrwxrwxrwx 1 root root 9 Jun 9 16:41 pci-0000:00:17.0-ata-1 -> ../../sda
lrwxrwxrwx 1 root root 9 Jun 9 16:41 pci-0000:00:17.0-ata-1.0 -> ../../sda
lrwxrwxrwx 1 root root 9 Jun 10 18:59 pci-0000:00:17.0-ata-2 -> ../../sdc
lrwxrwxrwx 1 root root 9 Jun 10 18:59 pci-0000:00:17.0-ata-2.0 -> ../../sdc
lrwxrwxrwx 1 root root 9 Jun 9 16:20 pci-0000:01:00.0-ata-1 -> ../../sdd
lrwxrwxrwx 1 root root 9 Jun 9 16:20 pci-0000:01:00.0-ata-1.0 -> ../../sdd
lrwxrwxrwx 1 root root 9 Jun 15 15:23 pci-0000:01:00.0-ata-2 -> ../../sdb
lrwxrwxrwx 1 root root 9 Jun 15 15:23 pci-0000:01:00.0-ata-2.0 -> ../../sdb
# output of lsblk
NAME HCTL
sda 0:0:0:0
sdb 3:0:0:0
sdc 1:0:0:0
sdd 2:0:0:0
For the disk6 issue in the serial mapping mode, can you apply this patch and run the script ugreen-diskiomon
manually to observe the output?
diff --git a/scripts/ugreen-diskiomon b/scripts/ugreen-diskiomon
index 1c00117..76152ec 100755
--- a/scripts/ugreen-diskiomon
+++ b/scripts/ugreen-diskiomon
@@ -43,6 +43,7 @@ done <<< "$(lsblk -S -o name,${mapping_method} | tail -n +2)"
for i in "${!led_map[@]}"; do
led=${led_map[i]}
if [[ -d /sys/class/leds/$led ]]; then
+ echo "set ${led} to the oneshot mode"
echo oneshot > /sys/class/leds/$led/trigger
echo 1 > /sys/class/leds/$led/invert
echo 100 > /sys/class/leds/$led/delay_on
@@ -57,6 +58,7 @@ for i in "${!led_map[@]}"; do
devices[$led]=${dev}
else
# turn off the led if no disk installed on this slot
+ echo "${dev} not found, turn off ${led}"
echo 0 > /sys/class/leds/$led/brightness
echo none > /sys/class/leds/$led/trigger
fi
root@r5d4:~# ls -lah /dev/disk/by-path/
total 0
drwxr-xr-x 2 root root 440 Jun 14 21:13 .
drwxr-xr-x 8 root root 160 Jun 14 21:02 ..
lrwxrwxrwx 1 root root 9 Jun 14 21:13 pci-0000:00:17.0-ata-1 -> ../../sdc
lrwxrwxrwx 1 root root 9 Jun 14 21:13 pci-0000:00:17.0-ata-1.0 -> ../../sdc
lrwxrwxrwx 1 root root 9 Jun 14 21:13 pci-0000:00:17.0-ata-2 -> ../../sda
lrwxrwxrwx 1 root root 9 Jun 14 21:13 pci-0000:00:17.0-ata-2.0 -> ../../sda
lrwxrwxrwx 1 root root 13 Jun 14 21:13 pci-0000:01:00.0-nvme-1 -> ../../nvme0n1
lrwxrwxrwx 1 root root 15 Jun 14 21:13 pci-0000:01:00.0-nvme-1-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root 15 Jun 14 21:13 pci-0000:01:00.0-nvme-1-part2 -> ../../nvme0n1p2
lrwxrwxrwx 1 root root 15 Jun 14 21:13 pci-0000:01:00.0-nvme-1-part3 -> ../../nvme0n1p3
lrwxrwxrwx 1 root root 13 Jun 14 21:13 pci-0000:56:00.0-nvme-1 -> ../../nvme1n1
lrwxrwxrwx 1 root root 15 Jun 14 21:13 pci-0000:56:00.0-nvme-1-part1 -> ../../nvme1n1p1
lrwxrwxrwx 1 root root 15 Jun 14 21:13 pci-0000:56:00.0-nvme-1-part2 -> ../../nvme1n1p2
lrwxrwxrwx 1 root root 15 Jun 14 21:13 pci-0000:56:00.0-nvme-1-part3 -> ../../nvme1n1p3
lrwxrwxrwx 1 root root 9 Jun 14 21:13 pci-0000:59:00.0-ata-1 -> ../../sdf
lrwxrwxrwx 1 root root 9 Jun 14 21:13 pci-0000:59:00.0-ata-1.0 -> ../../sdf
lrwxrwxrwx 1 root root 9 Jun 14 21:13 pci-0000:59:00.0-ata-2 -> ../../sdb
lrwxrwxrwx 1 root root 9 Jun 14 21:13 pci-0000:59:00.0-ata-2.0 -> ../../sdb
lrwxrwxrwx 1 root root 9 Jun 14 21:13 pci-0000:59:00.0-ata-3 -> ../../sdd
lrwxrwxrwx 1 root root 9 Jun 14 21:13 pci-0000:59:00.0-ata-3.0 -> ../../sdd
lrwxrwxrwx 1 root root 9 Jun 14 21:13 pci-0000:59:00.0-ata-4 -> ../../sde
lrwxrwxrwx 1 root root 9 Jun 14 21:13 pci-0000:59:00.0-ata-4.0 -> ../../sde
root@r5d4:~# lsblk -S -o name,hctl
NAME HCTL
sda 1:0:0:0
sdb 3:0:0:0
sdc 0:0:0:0
sdd 4:0:0:0
sde 5:0:0:0
sdf 2:0:0:0
It seems they are also the same?
not sure, if i understand the arch wiki correct https://wiki.archlinux.org/title/persistent_block_device_naming#by-id_and_by-path it should be the correct mapping between the leds and disks.
I have some spare time later, I can give it a try
I do not see anything that has not been said before. The mapping of sdX to physical devices (as denoted by /dev/disk/by-path or HCTL) is indeed random - that is why we do not rely on sdX enumeration for locating the disks.
In your cited example, we have:
0:0:0:0 = pci-0000:00:17.0-ata-1 = sdc = disk1
1:0:0:0 = pci-0000:00:17.0-ata-2 = sda = disk2
2:0:0:0 = pci-0000:59:00.0-ata-1 = sdf = disk3
3:0:0:0 = pci-0000:59:00.0-ata-2 = sdb = disk4
4:0:0:0 = pci-0000:59:00.0-ata-3 = sdd = disk5
5:0:0:0 = pci-0000:59:00.0-ata-4 = sde = disk6
So what is wrong, actually with using HCTL for location?
What I care more about is what happens when you remove a disk in the middle, say, disk4. If in that case, the list above was missing 3:0:0:0, everything is fine. If only 5:0:0:0 was missing, we would be back to square one.
Your serial of disk 6 seems 1 letter longer than other disks. Is that a typo in your post or in your script?
Your serial of disk 6 seems 1 letter longer than other disks. Is that a typo in your post or in your script?
u are right...
there was a 5
from placeholder5
left
damit
I do not see anything that has not been said before. The mapping of sdX to physical devices (as denoted by /dev/disk/by-path or HCTL) is indeed random - that is why we do not rely on sdX enumeration for locating the disks.
In your cited example, we have:
0:0:0:0 = pci-0000:00:17.0-ata-1 = sdc = disk1 1:0:0:0 = pci-0000:00:17.0-ata-2 = sda = disk2 2:0:0:0 = pci-0000:59:00.0-ata-1 = sdf = disk3 3:0:0:0 = pci-0000:59:00.0-ata-2 = sdb = disk4 4:0:0:0 = pci-0000:59:00.0-ata-3 = sdd = disk5 5:0:0:0 = pci-0000:59:00.0-ata-4 = sde = disk6
So what is wrong, actually with using HCTL for location?
What I care more about is what happens when you remove a disk in the middle, say, disk4. If in that case, the list above was missing 3:0:0:0, everything is fine. If only 5:0:0:0 was missing, we would be back to square one.
Disk6 is actual sda at the moment. sda == Disk6 sdb == Disk2 sdc == Disk5 sdd == Disk3 sde == Disk4 sdf == Disk1
serial seems to work now
As I said: sdX is random with every reboot - what would be more interesting is if the slot numbers did not correlate with the HCTL order or are - as you claim - random, too (which I doubt).
Iff that is the case, we would have to reorder the HCTL values in hctl_map to fix it - BUT I have just verified for my DXP8800, the HCTL order does in fact correlate with the order of the disk slots and the by-path correlates with HCTL:
# /dev/disk/by-path
lrwxrwxrwx 1 root root 9 Jun 15 14:32 pci-0000:59:00.0-ata-1 -> ../../sda
lrwxrwxrwx 1 root root 9 Jun 15 14:32 pci-0000:59:00.0-ata-2 -> ../../sdc
lrwxrwxrwx 1 root root 9 Jun 15 14:32 pci-0000:59:00.0-ata-3 -> ../../sdb
lrwxrwxrwx 1 root root 9 Jun 15 14:32 pci-0000:59:00.0-ata-4 -> ../../sdd
lrwxrwxrwx 1 root root 9 Jun 15 14:32 pci-0000:5a:00.0-ata-1 -> ../../sde
lrwxrwxrwx 1 root root 9 Jun 15 14:32 pci-0000:5a:00.0-ata-2 -> ../../sdf
lrwxrwxrwx 1 root root 9 Jun 15 14:32 pci-0000:5a:00.0-ata-3 -> ../../sdg
lrwxrwxrwx 1 root root 9 Jun 15 14:32 pci-0000:5a:00.0-ata-4 -> ../../sdh
# lsblk -S -x hctl -o name,hctl,serial
NAME HCTL SERIAL
sda 0:0:0:0 XXKERENY
sdc 1:0:0:0 XXKG2BXY
sdb 2:0:0:0 XXGMU62K
sdd 3:0:0:0 XXKJEZBY
sde 4:0:0:0 XXKJHBYY
sdf 5:0:0:0 XXGT2Z9L
sdg 6:0:0:0 XXKH3SGZ
sdh 7:0:0:0 XXJDB19L
As you can see, sdb and sdc are out of order (i.e. they can be random), but HCTL is not - plus, I have verified the slot order by looking at each serial. I think @miskcoo did the same for his DXP4800.
In order to check if the DXP6800 is special, please show that HCTL order does not fit your slot order by showing us the output of "lsblk -S -x hctl -o name,hctl,serial" (we assume that the verified slot order of your serial numbers in the initial post is 1-6).
@miskcoo: You should probably change the comment:
# serial number, $> lsblk -S -o name,serial
to:
# serial number, $> lsblk -S -x hctl -o hctl,serial,name
Otherwise, people will just execute that and list the serial numbers in the random sdX order (which will be wrong in almost all cases).
I am fine with the serial implementation at the moment. I am currently trying to find out how UGOS solved it. Running it in a VM.
Also taking an eye on /dev/disk/by-path
. Just because it's out of order doesn't mean it's random. But admittedly it looks very much like it. And unfortunately probably true.
Thanks for your work and support! :)
I can check hctl after a fresh reboot. Running badblocks at the moment (24h left)
hctl seems to work, maybe the readme should include a step to configure the mapping?
#1st reboot
root@r5d4:~# lsblk -S -x hctl -o hctl,serial,name
HCTL SERIAL NAME
0:0:0:0 ZR5DH0xx sda
1:0:0:0 ZR5F8Gxx sdb
2:0:0:0 ZR5DG9xx sdf
3:0:0:0 ZR5F8Fxx sdc
4:0:0:0 ZR5EW2xx sdd
5:0:0:0 ZR5F8Axx sde
#2nd reboot
HCTL SERIAL NAME
0:0:0:0 ZR5DH0xx sdc
1:0:0:0 ZR5F8Gxx sdb
2:0:0:0 ZR5DG9xx sda
3:0:0:0 ZR5F8Fxx sde
4:0:0:0 ZR5EW2xx sdd
5:0:0:0 ZR5F8Axx sdf
Correct order is
ZR5DG9xx
ZR5F8Fxx
ZR5EW2xx
ZR5F8Axx
ZR5DH0xx
ZR5F8Gxx
Think we can close this issue.
O.K., so HCTL is predictable, but in the case of your DXP6800, the mapping between HCTL and slots is:
0:0:0:0 -> disk5 1:0:0:0 -> disk6 2:0:0:0 -> disk1 3:0:0:0 -> disk2 4:0:0:0 -> disk3 5:0:0:0 -> disk4
Which is unexpected and makes HCTL mapping for that device useless as-is.
Whereas with the DXP4800 it is (expected):
0:0:0:0 -> disk1 1:0:0:0 -> disk2 2:0:0:0 -> disk3 3:0:0:0 -> disk4
and with the DXP8800 it is:
0:0:0:0 -> disk1 1:0:0:0 -> disk2 2:0:0:0 -> disk3 3:0:0:0 -> disk4 4:0:0:0 -> disk5 5:0:0:0 -> disk6 6:0:0:0 -> disk7 7:0:0:0 -> disk8
So, the issue can be closed, but:
Maybe, my dropbear-initramfs mess this up (my root volume is encrypted). But thats just a random idea.
Nope, most probably not. This has to do with how the SATA controllers are connected to the PCIe bus and then to the disk slots. There are up to 4 SATA channels on each controller chip, but yours is not in this configuration:
first controller -> disk1-4 second controller -> disk5-6
but:
first controller -> disk5-6 second controller -> disk1-4
which results in the HCTL order we see.
For the DXP8800, the configuration is:
first controller -> disk1-4 second controller -> disk5-8
Interesting. I think I should mention this in the document or script.
I tried the
hctl
andserial
mapping methodshctl: changes by everytime I reboot or add/remove a disk serial: Disk6 is off all the time, and I think there is bug when one disk is missing. I test it more later.
running on Debian 12, Linux 6.1.0-21 using dkms