petersulyok / smfc

Super Micro Fan Control
GNU General Public License v3.0

[Feature request] Add support to Supermicro M12SWA-TF #22

Open pktiuk opened 1 year ago

pktiuk commented 1 year ago

I see that only X10/X11 motherboards are listed as supported.
Would it take a lot of effort to add support for the M12SWA-TF motherboard?

petersulyok commented 1 year ago

There is some compatibility feedback here from @staaled, who managed to configure smfc on a Supermicro H13SSL-NT motherboard.

@staaled: How did you configure the CPU zone for AMD in order to read the temperature properly? Could you please share your config?

staaled commented 1 year ago

Sorry for the late response @petersulyok

Well... to expand a little on what I wrote at https://github.com/petersulyok/smfc/issues/19#issuecomment-1593015583

I use the k10temp kernel module for AMD CPUs instead of coretemp for Intel CPUs, and the rest is guesswork.


When running sensors (from lm-sensors):

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +39.0 C  
Tccd1:        +33.0 C  
Tccd2:        +32.9 C  
Tccd3:        +34.8 C  
Tccd4:        +34.4 C  

So I just did a quick and dirty search for a hwmon temp1_label file containing Tctl in sysfs: /sys/bus/pci/drivers/k10temp/0000:00:18.3/hwmon/hwmon13/temp1_label

and chucked this into smfc.conf under the CPU zone section: hwmon_path=/sys/bus/pci/drivers/k10temp/0000*/hwmon/hwmon*/temp1_input

When running smfc this seems to expand properly and it matches the temperature reading from sensors and ipmi:

# systemctl status smfc -n 50 | grep hwmon
Jun 28 06:28:01 localhost smfc.service[9451]:    hwmon_path = ['/sys/bus/pci/drivers/k10temp/0000:00:18.3/hwmon/hwmon13/temp1_input']
# 
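The label search described above can be sketched in Python. This is only an illustration, not part of smfc: the function name `find_labelled_inputs` is hypothetical, and it assumes the k10temp layout shown in this comment (a `temp*_label` file sitting next to each `temp*_input` file).

```python
import glob
import os


def find_labelled_inputs(driver_dir, label="Tctl"):
    """Return temp*_input paths whose sibling temp*_label file equals `label`."""
    pattern = os.path.join(driver_dir, "0000*", "hwmon", "hwmon*", "temp*_label")
    matches = []
    for label_path in sorted(glob.glob(pattern)):
        with open(label_path, encoding="ascii") as f:
            if f.read().strip() == label:
                # e.g. .../temp1_label -> .../temp1_input
                matches.append(label_path[: -len("_label")] + "_input")
    return matches


if __name__ == "__main__":
    # Path taken from the comment above; adjust if your driver lives elsewhere.
    for path in find_labelled_inputs("/sys/bus/pci/drivers/k10temp"):
        print(path)
```

Any path it prints can be compared with what smfc logs for `hwmon_path` after expanding the wildcards.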

Please note this has only been tested on a single-socket EPYC Zen 4 (Genoa) CPU running a 6.2.0 kernel.

One observation I made is that hwmon13 is NOT stable and may vary between reboots, component changes, etc., so I wouldn't recommend using anything like hwmon_path=/sys/class/hwmon/hwmon13/temp1_input
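Since the hwmon index can change, another way to avoid hard-coding it is to look the device up by the driver name that each hwmon directory exposes in its `name` file. A minimal sketch, assuming the standard `/sys/class/hwmon/hwmon*/name` layout; `hwmon_dirs_by_name` is a hypothetical helper, not an smfc function.

```python
import glob
import os


def hwmon_dirs_by_name(name, class_root="/sys/class/hwmon"):
    """Return the hwmon directories whose `name` file equals `name`."""
    found = []
    for name_file in sorted(glob.glob(os.path.join(class_root, "hwmon*", "name"))):
        with open(name_file, encoding="ascii") as f:
            if f.read().strip() == name:
                found.append(os.path.dirname(name_file))
    return found


if __name__ == "__main__":
    # Prints whichever hwmonN currently belongs to k10temp on this boot.
    for directory in hwmon_dirs_by_name("k10temp"):
        print(directory)
```

The wildcard path in smfc.conf achieves the same goal without extra code; this is only useful for checking by hand which index is in use.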

Full config I'm experimenting with now:

[Ipmi]
command=/usr/bin/ipmitool 
fan_mode_delay=10
fan_level_delay=5
swapped_zones=1

[CPU zone]
enabled=1
count=1
temp_calc=1
steps=6
sensitivity=3.0
polling=2
min_temp=35.0
max_temp=70.0
min_level=10
max_level=100
hwmon_path=/sys/bus/pci/drivers/k10temp/0000*/hwmon/hwmon*/temp1_input

[HD zone]
enabled=0

FWIW I replaced my chassis fans with Noctua NF-A9x14s and disabled the HD zone because I want those silent puppies running at full speed (~2100 RPM). The stock fan on the Dynatron J12 CPU cooler is a little 80mm monster that does 8000 RPM at full tilt and makes me wonder if I can use it as a siren for the burglar alarm... The min_level setting in the above config may not be very safe though.
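To get a feel for what these numbers mean, here is a rough sketch of a stepwise linear mapping from temperature to fan level using the values from the config above. It is an assumption about how such a mapping could work, not a copy of smfc's actual algorithm; `fan_level` is a hypothetical name, and `sensitivity` and `polling` are not modelled.

```python
def fan_level(temp, min_temp=35.0, max_temp=70.0,
              min_level=10, max_level=100, steps=6):
    """Map a temperature (Celsius) to a fan level (percent) in discrete steps."""
    if temp <= min_temp:
        return min_level
    if temp >= max_temp:
        return max_level
    # Nearest step index in the range 0..steps.
    step = round((temp - min_temp) * steps / (max_temp - min_temp))
    return int(round(min_level + step * (max_level - min_level) / steps))


if __name__ == "__main__":
    for t in (30.0, 39.0, 52.5, 70.0):
        print(t, fan_level(t))
```

Under this assumed mapping, anything at or below min_temp stays at min_level, which is why a low min_level deserves some care.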

petersulyok commented 1 year ago

@staaled thanks for sharing this!

I'm planning to add support for AMD CPUs to smfc as well, and I have some further questions:

I really appreciate your help.

staaled commented 1 year ago

We should perhaps create a separate issue for this, however just a quick response to your questions @petersulyok:

As a quick sidenote, fan_measurement.sh needs a much longer delay between changing fan levels to pick up the actual change, in the range of 10-15 seconds; otherwise, in my case, the fans are still speeding up or slowing down when the measurement is taken. A nice feature would also be to detect when it trips the lowct point and the fans spin up to 100% automatically.
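One way to avoid guessing a fixed delay would be to poll until the reading settles. The sketch below is illustrative only and is not how fan_measurement.sh works: `wait_for_stable_rpm` and the `read_rpm` callable are hypothetical, and the tolerance, interval and timeout defaults are arbitrary.

```python
import time


def wait_for_stable_rpm(read_rpm, tolerance=50, interval=1.0, timeout=20.0,
                        sleep=time.sleep):
    """Poll read_rpm() until two consecutive readings differ by <= tolerance.

    Returns the last reading, or None if the fan did not settle within
    `timeout` seconds.
    """
    previous = read_rpm()
    waited = 0.0
    while waited < timeout:
        sleep(interval)
        waited += interval
        current = read_rpm()
        if abs(current - previous) <= tolerance:
            return current
        previous = current
    return None
```

`read_rpm` would be whatever function returns the current fan speed on your system, for example a wrapper that parses the BMC's sensor output.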

staaled commented 1 year ago

@petersulyok :

So a friend of mine has a dual socket SuperMicro H12 motherboard with 2x EPYC 7551, 5.4 kernel, that outputs:

ls -al /sys/module/k10temp/drivers/pci:k10temp/
total 0
drwxr-xr-x  2 root root    0 Feb 12 20:08 .
drwxr-xr-x 30 root root    0 Feb 12 20:08 ..
lrwxrwxrwx  1 root root    0 Jul  3 12:08 0000:00:18.3 -> ../../../../devices/pci0000:00/0000:00:18.3
lrwxrwxrwx  1 root root    0 Jul  3 12:08 0000:00:19.3 -> ../../../../devices/pci0000:00/0000:00:19.3
lrwxrwxrwx  1 root root    0 Jul  3 12:08 0000:00:1a.3 -> ../../../../devices/pci0000:00/0000:00:1a.3
lrwxrwxrwx  1 root root    0 Jul  3 12:08 0000:00:1b.3 -> ../../../../devices/pci0000:00/0000:00:1b.3
lrwxrwxrwx  1 root root    0 Jul  3 12:08 0000:00:1c.3 -> ../../../../devices/pci0000:00/0000:00:1c.3
lrwxrwxrwx  1 root root    0 Jul  3 12:08 0000:00:1d.3 -> ../../../../devices/pci0000:00/0000:00:1d.3
lrwxrwxrwx  1 root root    0 Jul  3 12:08 0000:00:1e.3 -> ../../../../devices/pci0000:00/0000:00:1e.3
lrwxrwxrwx  1 root root    0 Jul  3 12:08 0000:00:1f.3 -> ../../../../devices/pci0000:00/0000:00:1f.3
--w-------  1 root root 4096 Jul  3 12:08 bind
lrwxrwxrwx  1 root root    0 Jul  3 12:08 module -> ../../../../module/k10temp
--w-------  1 root root 4096 Jul  3 12:08 new_id
--w-------  1 root root 4096 Jul  3 12:08 remove_id
--w-------  1 root root 4096 Feb 12 20:08 uevent
--w-------  1 root root 4096 Jul  3 12:08 unbind

Apparently they all have a Tctl sensor, but no Tccd sensors.
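With several k10temp devices, the wildcard path would match more than one `temp1_input` file, so the readings have to be combined somehow. A small sketch of that idea follows, assuming the usual hwmon convention that `temp*_input` holds millidegrees Celsius; `read_celsius`, `read_matching` and `aggregate` are hypothetical helpers, and the mode names are not smfc's `temp_calc` values.

```python
import glob


def read_celsius(path):
    """Read one hwmon temp*_input file (millidegrees Celsius) as degrees."""
    with open(path, encoding="ascii") as f:
        return int(f.read().strip()) / 1000.0


def read_matching(pattern):
    """Read every temp*_input file matched by a wildcard pattern."""
    return [read_celsius(p) for p in sorted(glob.glob(pattern))]


def aggregate(temps, mode="max"):
    """Combine several readings into one value: 'min', 'avg' or 'max'."""
    if not temps:
        raise ValueError("no temperature readings")
    if mode == "min":
        return min(temps)
    if mode == "avg":
        return sum(temps) / len(temps)
    if mode == "max":
        return max(temps)
    raise ValueError("unknown mode: %r" % mode)


if __name__ == "__main__":
    pattern = "/sys/bus/pci/drivers/k10temp/0000*/hwmon/hwmon*/temp1_input"
    print(aggregate(read_matching(pattern), "max"))
```

Using the maximum is the conservative choice when the sockets share a fan zone, since the hottest device then drives the fan level.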


I have another single socket EPYC 7302 on a SuperMicro H12SSL-NT board running 5.15 kernel:

ls -al /sys/module/k10temp/drivers/pci:k10temp/
total 0
drwxr-xr-x  2 root root    0 May  1  2022 .
drwxr-xr-x 34 root root    0 May  1  2022 ..
lrwxrwxrwx  1 root root    0 Jul  3 13:16 0000:00:18.3 -> ../../../../devices/pci0000:00/0000:00:18.3
--w-------  1 root root 4096 Jul  3 13:16 bind
lrwxrwxrwx  1 root root    0 Jul  3 13:16 module -> ../../../../module/k10temp
--w-------  1 root root 4096 Jul  3 13:16 new_id
--w-------  1 root root 4096 Jul  3 13:16 remove_id
--w-------  1 root root 4096 May  1  2022 uevent
--w-------  1 root root 4096 Jul  3 13:16 unbind

With standard sensors output:

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +42.5 C  
Tccd1:        +39.8 C  
Tccd3:        +40.0 C  
Tccd5:        +42.2 C  
Tccd7:        +39.8 C  

petersulyok commented 1 year ago

Hi @pktiuk, did you manage to set up your system based on the sample here? I would appreciate hearing your feedback.

pktiuk commented 1 year ago

I haven't done this yet.
Unfortunately, I don't have much time this month to set this up, but I will keep testing it in mind.

petersulyok commented 1 year ago

Let me know if you need any further help. The documentation of the latest v3.0.0 version contains recommendations for AMD users.