vmatare / thinkfan

The minimalist fan control program
GNU General Public License v3.0
541 stars 62 forks source link

Found multiple hwmons with this name "nvme" #156

Closed koutheir closed 2 years ago

koutheir commented 2 years ago

I configured my thinkfan.yaml to take into account the NVME temperatures like follows:

sensors:
  - hwmon: /sys/class/hwmon
    name: nvme
    indices: [2, 3]
    correction: [0, 0]

But running thinkfan -nv issues this error:

ERROR: /etc/thinkfan.conf:37:
    name: nvme
          ^
Found multiple hwmons with this name:  /sys/class/hwmon/hwmon2 /sys/class/hwmon/hwmon1.

And it is true that both /sys/class/hwmon/hwmon1/name and /sys/class/hwmon/hwmon2/name have the same contents, even though they correspond to the two distinct NVME slots on the mother board.

How can I solve this issue?

koutheir commented 2 years ago

How can I solve this issue?

I know my question is vague, but it's because I lack information on how this could be solved correctly. I'm ready to create a pull request if we can agree on some approach to fix it.

Should we, for example, filter hwmons by something more than names? Or may be accept multiple hwmons having the same name and handle this situation as expected instead of an error? Or maybe perform special behavior for some file system elements?

koutheir commented 2 years ago

I think a reliable solution to this issue (and I suspect other issues too), would be to use libsensors as a source of temperatures. Using libsensors, my two NVME slots are identified as nvme-pci-0200 and nvme-pci-5500.

vmatare commented 2 years ago

Yeah, I suspected this would come up eventually. I'd be super happy if you were to submit a pull request implementing sensor discovery via libsensors. My only condition is that libsensors should be an optional build-time dependency ;-)

vmatare commented 2 years ago

The config syntax could be something like this:

sensors:
  - chip: nct6795-isa-0a20
    ids: [SYSTIN, CPUTIN, "SMBUSMASTER 0"]

where SYSTIN etc. would be the libsensors names of the individual temperature inputs for the nct6795-isa-0a20

vmatare commented 2 years ago

Oh sorry, to answer your first question: Your only option to sort-of disambiguate the two would be to specify the full path to the temperature file under the hwmon: key. Of course that way you're depending on the driver load order (because of the index in the hwmon1 and hwmon2 directories), but maybe you're lucky and it's deterministic. At least it should depend on one kernel module only (so no race condition there), and that single module should have a fixed order of probing devices...

koutheir commented 2 years ago

but maybe you're lucky and it's deterministic.

Given my observations, it's not. It changes every reboot.

I'd be super happy if you were to submit a pull request implementing sensor discovery via libsensors.

I'll try to do so.

vmatare commented 2 years ago

Implemented in master and slated for 2.0