intel / ledmon

Enclosure LED Utilities
GNU General Public License v2.0

No function with intel board and intel NVME #238

Closed OSF-BA closed 1 month ago

OSF-BA commented 2 months ago

Description

No function with an Intel board and Intel NVMe drives starting with kernel 6.7. With kernel 6.6 everything still worked.

Steps to reproduce bug

ledctl --log-level=all locate=/dev/nvme0n1

Expected behavior

?

Actual behavior

No function.

Environment

OS: Proxmox V8.2
Motherboard: Intel S2600WFT
NVMe: INTEL SSDPE2KX020T7

Ledmon version

1.0

Ledmon logs

No response

Ledctl logs

ledctl: IPMI Error: c1
ledctl: Unable to determine Dell Server type
ledctl: IPMI Error: c1
ledctl: Unable to determine Dell Server type
ledctl: 10001:01:00 before: 0x3
ledctl: 10001:01:00 after: 0x3
ledctl: 10001:02:00 before: 0x3
ledctl: 10001:02:00 after: 0x3
ledctl: 10001:01:00 before: 0x3
ledctl: 10001:01:00 after: 0x3
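Every `before`/`after` pair in the log above shows the same value. As a minimal sketch, assuming each pair is the value ledctl read before and after it tried to update the slot (an interpretation that the thread does not confirm), unchanged pairs could be flagged like this. The `unchanged_pairs` helper and the `PAIR` pattern are hypothetical names, not part of ledmon.

```python
import re

# Log text copied from the report; the IPMI lines are omitted.
LOG = (
    "ledctl: 10001:01:00 before: 0x3 ledctl: 10001:01:00 after: 0x3 "
    "ledctl: 10001:02:00 before: 0x3 ledctl: 10001:02:00 after: 0x3 "
    "ledctl: 10001:01:00 before: 0x3 ledctl: 10001:01:00 after: 0x3"
)

# Matches a "before" entry followed directly by the "after" entry for the
# same address. Whitespace between the two may be a space or a newline.
PAIR = re.compile(
    r"ledctl: (?P<addr>\S+) before: (?P<before>0x[0-9a-fA-F]+)\s+"
    r"ledctl: (?P=addr) after: (?P<after>0x[0-9a-fA-F]+)"
)


def unchanged_pairs(log: str) -> list[tuple[str, int]]:
    """Return (address, value) for each pair whose value did not change."""
    result = []
    for match in PAIR.finditer(log):
        before = int(match.group("before"), 16)
        after = int(match.group("after"), 16)
        if before == after:
            result.append((match.group("addr"), before))
    return result


if __name__ == "__main__":
    for addr, value in unchanged_pairs(LOG):
        print(f"{addr}: value stayed at {value:#x}")
```

An unchanged value does not by itself prove the LED failed to switch, so treat the result as a hint to compare against a known-good kernel.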

Ledmon supported controllers

/sys/devices/pci0000:00/0000:00:17.0 (AHCI)
/sys/devices/pci0000:00/0000:00:11.5 (AHCI)
/sys/devices/pci0000:16/0000:16:05.5 (VMD)
/sys/devices/pci0000:b2/0000:b2:05.5 (VMD)

Additional information

No response

bkucman commented 1 month ago

Hi @OSF-BA,

Thanks for your report; the information about kernel versions was very useful. At this point, in my opinion, this is not a Ledmon/ledctl issue.

After bisecting the kernel, I found the patch that causes the described issue (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.7-rc1&id=abaaac4845a0d6f39f83cbaba4c3b46ba5f93170). This patch was added to the kernel between v6.6 and v6.7-rc1, but individual OS vendors may have backported it to the kernels of systems currently in development; we detected this situation in RHEL 9.5.

I'm not familiar with the Proxmox distribution. If this issue occurs in the kernel that the Proxmox vendor ships with the OS, it would be best to report it to them so that they can revert the patch, as this is a kernel issue.

Reverting this patch from the kernel fixes the problem in my setup. If you could confirm this on your hardware, I would be very grateful. I tested it on kernel 6.10.

As a next step, I'm going to report this to the linux-pci kernel mailing list to find the root cause and fix the problem in the upstream kernel.

Thanks, Blazej

bkucman commented 1 month ago

linux-pci kernel mailing list thread: https://lore.kernel.org/linux-pci/20240719122253.00004b0e@linux.intel.com/T/#u

OSF-BA commented 1 month ago

Hello Blazej! Thanks for the fast information and help!

Greetings from Austria! Christian

bkucman commented 1 month ago

Hi @OSF-BA

I sent a fix to the linux-pci kernel mailing list and it has been accepted. It is planned for inclusion in kernel 6.11, so OS vendors will be able to use it to fix the issue in their own kernels once it is merged. Fix: https://lore.kernel.org/all/20240725215945.GA855755@bhelgaas/t/#u

Are you ok with closing this issue?

Regards, Blazej
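The version window described in this thread (regression introduced between v6.6 and v6.7-rc1, fix planned for 6.11) can be turned into a rough first check. This is a minimal sketch with a hypothetical helper, `in_upstream_affected_range`. It only compares the upstream major.minor numbers, so it cannot detect vendor backports of either the patch or the fix (as seen with RHEL 9.5), and it assumes the fix does land in 6.11 as planned.

```python
import platform
import re

# Assumptions taken from the thread, not verified against any kernel tree:
# the regression is present from 6.7 on and the fix is planned for 6.11.
FIRST_AFFECTED = (6, 7)
FIRST_FIXED = (6, 11)


def in_upstream_affected_range(release: str) -> bool:
    """Return True if the major.minor of a kernel release string falls in
    the window [6.7, 6.11). Vendor backports are not taken into account."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    version = (int(match.group(1)), int(match.group(2)))
    return FIRST_AFFECTED <= version < FIRST_FIXED


if __name__ == "__main__":
    release = platform.release()
    if in_upstream_affected_range(release):
        print(f"{release}: inside the upstream window discussed in this issue")
    else:
        print(f"{release}: outside the upstream window (backports still possible)")
```

A result of `False` is therefore not a guarantee: a vendor kernel with an older version number may still carry the backported patch.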

OSF-BA commented 1 month ago

Hello @bkucman

Thanks for your help, please close the issue.

Greetings Christian