petersulyok / smfc

Super Micro Fan Control
GNU General Public License v3.0

X13SAE-F compatibility #33

Closed petersulyok closed 6 months ago

petersulyok commented 7 months ago

@hcgonzalezpr reported an issue here about compatibility of the X13SAE-F motherboard. This issue was created to hold the discussion and investigation.

petersulyok commented 7 months ago

Hi @hcgonzalezpr, do you know how many zones you have and how the fans are assigned to them?

hcgonzalezpr commented 7 months ago

Yes, just 2 zones. I used set_ipmi_fan_level.sh to identify the fan zone assignment.

Screenshot Color Zone | Fans         | Zone ID
Yellow                | CPU_FAN[1-2] | 0
Red                   | SYS_FAN[1-3] | 1
Purple                | None         | None (static AIO pump, no PWM)

[Image: mb_zones]

hcgonzalezpr commented 7 months ago

Here is the output from ipmitool sensor | grep FAN. The Lower Critical threshold ends up at the closest available value.

Output:

CPU_FAN1 | 980.000 | RPM | ok | na | 140.000 | na | na | na | na
CPU_FAN2 | 980.000 | RPM | ok | na | 140.000 | na | na | na | na
SYS_FAN1 | 840.000 | RPM | ok | na | 140.000 | na | na | na | na
SYS_FAN2 | 840.000 | RPM | ok | na | 140.000 | na | na | na | na
SYS_FAN3 | 840.000 | RPM | ok | na | 140.000 | na | na | na | na
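
For readers unfamiliar with the column layout, the rows above can be labeled with a small filter. This is only a sketch: the function name fan_thresholds is made up for illustration, and it assumes the usual ipmitool sensor column order (name, reading, unit, status, then lnr, lcr, lnc, unc, ucr, unr).

```bash
#!/usr/bin/env bash

# Label the threshold columns of `ipmitool sensor` rows read from stdin.
# Assumed column order: name | reading | unit | status | lnr | lcr | lnc | unc | ucr | unr
# (fan_thresholds is a hypothetical helper name, not part of smfc.)
fan_thresholds() {
    awk -F'|' '
        $1 ~ /FAN/ {
            # Trim the padding ipmitool adds around each column.
            for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            printf "%s reading=%s lnr=%s lcr=%s lnc=%s unc=%s ucr=%s unr=%s\n",
                   $1, $2, $5, $6, $7, $8, $9, $10
        }'
}

# Typical use (needs root and a BMC): ipmitool sensor | fan_thresholds
```

With the rows posted here, only the lcr column carries a value (140.000); every other threshold is na.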

hcgonzalezpr commented 7 months ago

Drive configuration: 6 HDDs + 2 NVMe drives

Output from the lm-sensors sensors command:

drivetemp-scsi-3-0
Adapter: SCSI adapter
temp1:        +26.0°C  (low  = +10.0°C, high = +40.0°C)
                       (crit low =  +5.0°C, crit = +60.0°C)
                       (lowest = +25.0°C, highest = +27.0°C)

drivetemp-scsi-1-0
Adapter: SCSI adapter
temp1:        +29.0°C  (low  = +10.0°C, high = +40.0°C)
                       (crit low =  +5.0°C, crit = +60.0°C)
                       (lowest = +28.0°C, highest = +30.0°C)

drivetemp-scsi-5-0
Adapter: SCSI adapter
temp1:        +29.0°C  (low  =  +0.0°C, high = +65.0°C)
                       (crit low = -40.0°C, crit = +70.0°C)
                       (lowest = +28.0°C, highest = +30.0°C)

nct6798-isa-0a30
Adapter: ISA adapter
in0:                     1.01 V  (min =  +0.00 V, max =  +1.74 V)
in1:                     1.25 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                     3.36 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                     3.31 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                     1.84 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                     1.06 V  (min =  +0.00 V, max =  +0.00 V)
in6:                     1.06 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                     3.36 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                     2.88 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                     1.06 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                    0.00 V  (min =  +0.00 V, max =  +0.00 V)
in11:                    1.81 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                    1.05 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                    1.11 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                    1.81 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                     0 RPM  (min =    0 RPM)
fan2:                     0 RPM  (min =    0 RPM)
fan3:                     0 RPM  (min =    0 RPM)
fan4:                     0 RPM  (min =    0 RPM)
fan5:                     0 RPM  (min =    0 RPM)
fan7:                     0 RPM  (min =    0 RPM)
SYSTIN:                 +29.0°C  (high = +80.0°C, hyst = +75.0°C)
                                 (crit = +100.0°C)  sensor = thermistor
CPUTIN:                +127.5°C  (high = +80.0°C, hyst = +75.0°C)  ALARM
                                 (crit = +100.0°C)  sensor = CPU diode
AUXTIN0:                +23.0°C  (high = +80.0°C, hyst = +75.0°C)
                                 (crit = +100.0°C)  sensor = thermistor
AUXTIN1:               -128.0°C  (high = +80.0°C, hyst = +75.0°C)
                                 (crit = +100.0°C)  sensor = thermistor
AUXTIN2:                -24.0°C  (high = +80.0°C, hyst = +75.0°C)
                                 (crit = +100.0°C)  sensor = thermistor
AUXTIN3:                -23.0°C  (high = +80.0°C, hyst = +75.0°C)
                                 (crit = +100.0°C)  sensor = thermistor
AUXTIN4:                +22.0°C  (high = +80.0°C, hyst = +75.0°C)
                                 (crit = +100.0°C)
PCH_CHIP_CPU_MAX_TEMP:   +0.0°C
PCH_CHIP_TEMP:           +0.0°C
PCH_CPU_TEMP:            +0.0°C
PCH_MCH_TEMP:            +0.0°C
Agent0 Dimm0:            +0.0°C
intrusion0:            ALARM
intrusion1:            ALARM
beep_enable:           disabled

nvme-pci-0200
Adapter: PCI adapter
Composite:    +25.9°C  (low  =  -0.1°C, high = +76.8°C)
                       (crit = +79.8°C)

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +27.8°C

drivetemp-scsi-4-0
Adapter: SCSI adapter
temp1:        +29.0°C  (low  =  +0.0°C, high = +65.0°C)
                       (crit low = -40.0°C, crit = +70.0°C)
                       (lowest = +28.0°C, highest = +30.0°C)

drivetemp-scsi-2-0
Adapter: SCSI adapter
temp1:        +28.0°C  (low  = +10.0°C, high = +40.0°C)
                       (crit low =  +5.0°C, crit = +60.0°C)
                       (lowest = +27.0°C, highest = +29.0°C)

drivetemp-scsi-0-0
Adapter: SCSI adapter
temp1:        +28.0°C  (low  = +10.0°C, high = +40.0°C)
                       (crit low =  +5.0°C, crit = +60.0°C)
                       (lowest = +27.0°C, highest = +29.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +35.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:        +25.0°C  (high = +80.0°C, crit = +100.0°C)
Core 4:        +25.0°C  (high = +80.0°C, crit = +100.0°C)
Core 8:        +28.0°C  (high = +80.0°C, crit = +100.0°C)
Core 12:       +25.0°C  (high = +80.0°C, crit = +100.0°C)
Core 16:       +36.0°C  (high = +80.0°C, crit = +100.0°C)
Core 20:       +24.0°C  (high = +80.0°C, crit = +100.0°C)
Core 24:       +30.0°C  (high = +80.0°C, crit = +100.0°C)
Core 28:       +25.0°C  (high = +80.0°C, crit = +100.0°C)
Core 32:       +29.0°C  (high = +80.0°C, crit = +100.0°C)
Core 33:       +29.0°C  (high = +80.0°C, crit = +100.0°C)
Core 34:       +29.0°C  (high = +80.0°C, crit = +100.0°C)
Core 35:       +29.0°C  (high = +80.0°C, crit = +100.0°C)
Core 36:       +27.0°C  (high = +80.0°C, crit = +100.0°C)
Core 37:       +27.0°C  (high = +80.0°C, crit = +100.0°C)
Core 38:       +27.0°C  (high = +80.0°C, crit = +100.0°C)
Core 39:       +27.0°C  (high = +80.0°C, crit = +100.0°C)
Core 40:       +26.0°C  (high = +80.0°C, crit = +100.0°C)
Core 41:       +26.0°C  (high = +80.0°C, crit = +100.0°C)
Core 42:       +26.0°C  (high = +80.0°C, crit = +100.0°C)
Core 43:       +26.0°C  (high = +80.0°C, crit = +100.0°C)

nvme-pci-0400
Adapter: PCI adapter
Composite:    +23.9°C  (low  =  -0.1°C, high = +76.8°C)
                       (crit = +79.8°C)

petersulyok commented 7 months ago

Thanks for the input. Let us try to set up the sensor thresholds for the fans. The script set_ipmi_threshold.sh contains built-in fan names. You may edit them, or you can run the commands in a terminal. For example, these are the proper commands for CPU_FAN1:

ipmitool sensor thresh CPU_FAN1 lower 0 100 200
ipmitool sensor thresh CPU_FAN1 upper 1600 1700 1800

The 6 thresholds should be calculated from your fan specification. See this chapter in the README for more details.

Let me know if the commands work when run manually.
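
As a rough sketch of applying the same two commands to every fan on this board, the helper below only prints the commands instead of running them. The name print_thresh_cmds is hypothetical, and the threshold values are simply the example numbers from above, not values derived from any fan specification.

```bash
#!/usr/bin/env bash

# Print (do not execute) the two ipmitool threshold commands for each fan
# name passed as an argument. print_thresh_cmds is a hypothetical helper;
# the values below are the example thresholds from this thread and must be
# replaced with numbers calculated from the actual fan specification.
print_thresh_cmds() {
    local lower="0 100 200"
    local upper="1600 1700 1800"
    local fan
    for fan in "$@"; do
        echo "ipmitool sensor thresh $fan lower $lower"
        echo "ipmitool sensor thresh $fan upper $upper"
    done
}

# Example with the X13SAE-F fan names reported in this issue:
# print_thresh_cmds CPU_FAN1 CPU_FAN2 SYS_FAN1 SYS_FAN2 SYS_FAN3
```

Printing first makes it easy to review the commands before piping them to a root shell.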

hcgonzalezpr commented 7 months ago

I had already updated the hardcoded names in set_ipmi_threshold.sh, found the error, and started drilling down before I made the post. For now I have the fans hardcoded at 80% so they are not running at 100% all the time.

On this board it looks like assertion events exist for Lower Critical only, as that is the only trigger it is able to generate, regardless of running the fans at different speeds.

To me it looks like a limitation of the X13 BMC controller, as its implementation is a bit rough around the edges and unfinished compared to X10: this IPMI threshold issue, no BIOS update via the BMC, and a half-finished HTML5 iKVM implementation that works in one section of its menu but not the other, even though it has the correct BMC license. But I'm going a bit off topic. I'm using the latest BMC firmware (Ver: 01.01.25 Built: 12/05/2023).

Back to the thresholds: I also tried setting them manually using ipmitool sensor thresh <id> <threshold> <setting>, and only lcr applies without the error shown below. I'm using NF-A15 for the SYS fans and NF-A14 for the CPU fans. I also calculated the thresholds from the Noctua specs and tested them with fan_measurement.sh, modified to use the X13 fan names.

For reference, I'm using ipmitool (1.8.19-2) on Arch Linux. I also tried the other tools listed on the Arch IPMI wiki, but none of them were able to set anything besides the Lower Critical threshold.

Output from ipmitool sensor | grep CPU_FAN1 before trying to apply the thresholds:

CPU_FAN1 | 1400.000 | RPM | ok | na | 140.000 | na | na | na | na

Below are the outputs as requested:

ipmitool sensor thresh CPU_FAN1 lower 0 100 200

Locating sensor record 'CPU_FAN1'...
Setting sensor "CPU_FAN1" Lower Non-Recoverable threshold to 0.000
Error setting threshold: Command illegal for specified sensor or record type
Setting sensor "CPU_FAN1" Lower Critical threshold to 100.000
Setting sensor "CPU_FAN1" Lower Non-Critical threshold to 200.000
Error setting threshold: Command illegal for specified sensor or record type

ipmitool sensor thresh CPU_FAN1 upper 1600 1700 1800

Locating sensor record 'CPU_FAN1'...
Setting sensor "CPU_FAN1" Upper Non-Critical threshold to 1600.000
Error setting threshold: Command illegal for specified sensor or record type
Setting sensor "CPU_FAN1" Upper Critical threshold to 1700.000
Error setting threshold: Command illegal for specified sensor or record type
Setting sensor "CPU_FAN1" Upper Non-Recoverable threshold to 1800.000
Error setting threshold: Command illegal for specified sensor or record type

Output from ipmitool sensor | grep CPU_FAN1 after trying to apply the thresholds:

CPU_FAN1 | 1400.000 | RPM | ok | na | 140.000 | na | na | na | na
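
The accept/reject pattern in these logs can be summarized with a small filter. This is a sketch only: classify_thresh_log is a made-up name, and it assumes that a "Setting sensor" line counts as rejected exactly when the next line is "Error setting threshold", which is how the output above reads.

```bash
#!/usr/bin/env bash

# Read `ipmitool sensor thresh` output on stdin and report, per threshold,
# whether it was accepted. Assumption: a "Setting sensor" line immediately
# followed by an "Error setting threshold" line was rejected by the BMC.
# (classify_thresh_log is a hypothetical helper name.)
classify_thresh_log() {
    awk '
        # Report the pending threshold as accepted if no error followed it.
        function emit_ok() {
            if (pending != "") print pending ": accepted"
            pending = ""
        }
        /^Setting sensor/ {
            emit_ok()
            # Keep only the threshold name, e.g. "Lower Critical".
            line = $0
            sub(/^Setting sensor "[^"]*" /, "", line)
            sub(/ threshold to .*$/, "", line)
            pending = line
            next
        }
        /^Error setting threshold/ {
            if (pending != "") print pending ": rejected"
            pending = ""
            next
        }
        END { emit_ok() }
    '
}

# Typical use (needs root and a BMC):
# ipmitool sensor thresh CPU_FAN1 lower 0 100 200 2>&1 | classify_thresh_log
```

Run against the two logs above, this reports Lower Critical as the only accepted threshold.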

petersulyok commented 7 months ago

I tend to agree with your statement on missing thresholds for AST2600 (or this motherboard), but I think smfc will still work properly because IPMI FULL_MODE and the fan level setting are working properly.

I do not know how you would use this (connected fans and zones you would like to use), but creating a proper configuration seems to be feasible. Let me know your view.

hcgonzalezpr commented 7 months ago

In reality I just want a better fan curve than what Supermicro offers, as the original values keep dropping the Noctua fans to zero RPM, causing a bunch of assertion events.

For now I'll use the zones for their original intended purpose: the SYS fans are in front of the drives and the CPU fans are driving CPU exhaust. I'll try smfc over the weekend and report back on the findings or any other changes needed to account for the slightly different naming format.

hcgonzalezpr commented 7 months ago

I just tested it and it works without any modifications to the smfc service. Only the IPMI bash scripts need to be modified when testing.

petersulyok commented 7 months ago

Hi @hcgonzalezpr, not sure if you managed to run smfc, let me know about your experiences.

An additional idea came to my mind for checking the threshold values on AST2600: please execute the following command to show the attributes of a fan. For example, you can find CPU_FAN1 in your output and paste it here. In my case this is:

$ ipmitool -v sensor
...
Sensor ID              : FAN1 (0x41)
 Entity ID             : 29.1
 Sensor Type (Threshold)  : Fan
 Sensor Reading        : 500 (+/- 0) RPM
 Status                : ok
 Lower Non-Recoverable : 0.000
 Lower Critical        : 100.000
 Lower Non-Critical    : 200.000
 Upper Non-Critical    : 1600.000
 Upper Critical        : 1700.000
 Upper Non-Recoverable : 1800.000
 Positive Hysteresis   : 100.000
 Negative Hysteresis   : 100.000
 Assertion Events      : 
 Assertions Enabled    : lcr- lnr- ucr+ unr+ 
 Deassertions Enabled  : lcr- lnr- ucr+ unr+ 
...
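
Since the verbose listing is long, one way to cut out a single sensor's block is an awk filter like the one below. This is a sketch under assumptions: sensor_block is a hypothetical helper, and it assumes blocks start with a "Sensor ID" line and end at the next blank line, as in the output shown here.

```bash
#!/usr/bin/env bash

# Print only the block of `ipmitool -v sensor` output (read from stdin)
# whose Sensor ID matches the name given as the first argument.
# Assumption: each block starts with "Sensor ID" and ends at a blank line.
# (sensor_block is a hypothetical helper name.)
sensor_block() {
    local name="$1"
    awk -v name="$name" '
        /^Sensor ID/ {
            # "Sensor ID              : CPU_FAN1 (0x41)" -> "CPU_FAN1"
            id = $0
            sub(/^Sensor ID[ \t]*:[ \t]*/, "", id)
            sub(/[ \t]*\(0x[0-9a-fA-F]+\)[ \t]*$/, "", id)
            show = (id == name)
        }
        /^[ \t]*$/ { show = 0 }
        show { print }
    '
}

# Typical use (needs root and a BMC): ipmitool -v sensor | sensor_block CPU_FAN1
```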

If you are interested, you can also check Super Micro's own IPMI tool called IPMICFG. You can download it from here. It has a slightly different syntax but the same functionality.

hcgonzalezpr commented 7 months ago

@petersulyok I did get it to work without modifying any of the smfc code, and it's been working great for a few weeks. Below is the output as requested. I do have Intel NVMe drives, but these are not detected by the IPMI; I can still get their temperatures using lm-sensors. I'm also running Arch Linux, and I had tried Super Micro's IPMICFG; it gives the same results as the other tools.

sudo ipmitool -v sensor

Loading IANA PEN Registry...
Running Get VSO Capabilities my_addr 0x20, transit 0, target 0
Invalid completion code received: Invalid command
Discovered IPMB address 0x0
Sensor ID              : CPU Temp (0x1)
 Entity ID             : 3.1
 Sensor Type (Threshold)  : Temperature
 Sensor Reading        : 29 (+/- 0) degrees C
 Status                : ok
 Lower Non-Recoverable : na
 Lower Critical        : 5.000
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 98.000
 Upper Non-Recoverable : na
 Positive Hysteresis   : 2.000
 Negative Hysteresis   : 2.000
 Assertion Events      :
 Assertions Enabled    : lcr- ucr+
 Deassertions Enabled  : lcr- ucr+

Sensor ID              : PCH Temp (0xa)
 Entity ID             : 7.1
 Sensor Type (Threshold)  : Temperature
 Sensor Reading        : 52 (+/- 0) degrees C
 Status                : ok
 Lower Non-Recoverable : na
 Lower Critical        : 5.000
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 90.000
 Upper Non-Recoverable : na
 Positive Hysteresis   : 2.000
 Negative Hysteresis   : 2.000
 Assertion Events      :
 Assertions Enabled    : lcr- ucr+
 Deassertions Enabled  : lcr- ucr+

Sensor ID              : System Temp (0xb)
 Entity ID             : 7.2
 Sensor Type (Threshold)  : Temperature
 Sensor Reading        : 27 (+/- 0) degrees C
 Status                : ok
 Lower Non-Recoverable : na
 Lower Critical        : 5.000
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 85.000
 Upper Non-Recoverable : na
 Positive Hysteresis   : 2.000
 Negative Hysteresis   : 2.000
 Assertion Events      :
 Assertions Enabled    : lcr- ucr+
 Deassertions Enabled  : lcr- ucr+

Sensor ID              : Peripheral Temp (0xc)
 Entity ID             : 7.3
 Sensor Type (Threshold)  : Temperature
 Sensor Reading        : 25 (+/- 0) degrees C
 Status                : ok
 Lower Non-Recoverable : na
 Lower Critical        : 5.000
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 85.000
 Upper Non-Recoverable : na
 Positive Hysteresis   : 2.000
 Negative Hysteresis   : 2.000
 Assertion Events      :
 Assertions Enabled    : lcr- ucr+
 Deassertions Enabled  : lcr- ucr+

Sensor ID              : VRM_VCORE Temp (0x10)
 Entity ID             : 7.16
 Sensor Type (Threshold)  : Temperature
 Sensor Reading        : 24 (+/- 0) degrees C
 Status                : ok
 Lower Non-Recoverable : na
 Lower Critical        : 5.000
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 100.000
 Upper Non-Recoverable : na
 Positive Hysteresis   : 2.000
 Negative Hysteresis   : 2.000
 Assertion Events      :
 Assertions Enabled    : lcr- ucr+
 Deassertions Enabled  : lcr- ucr+

Sensor ID              : VRMVIN_AUX Temp (0x11)
 Entity ID             : 7.17
 Sensor Type (Threshold)  : Temperature
 Sensor Reading        : 23 (+/- 0) degrees C
 Status                : ok
 Lower Non-Recoverable : na
 Lower Critical        : 5.000
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 100.000
 Upper Non-Recoverable : na
 Positive Hysteresis   : 2.000
 Negative Hysteresis   : 2.000
 Assertion Events      :
 Assertions Enabled    : lcr- ucr+
 Deassertions Enabled  : lcr- ucr+

Sensor ID              : DIMMAB Temp (0xb0)
 Entity ID             : 32.0
 Sensor Type (Threshold)  : Temperature
 Sensor Reading        : 29 (+/- 0) degrees C
 Status                : ok
 Lower Non-Recoverable : na
 Lower Critical        : 5.000
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 85.000
 Upper Non-Recoverable : na
 Positive Hysteresis   : 1.000
 Negative Hysteresis   : 1.000
 Assertion Events      :
 Assertions Enabled    : lcr- ucr+
 Deassertions Enabled  : lcr- ucr+

Sensor ID              : M2_SSD1 Temp (0x8c)
 Entity ID             : 7.48
 Sensor Type (Threshold)  : Temperature
 Sensor Reading        :  Unable to read sensor: Device Not Present

Sensor ID              : M2_SSD2 Temp (0x8d)
 Entity ID             : 7.49
 Sensor Type (Threshold)  : Temperature
 Sensor Reading        :  Unable to read sensor: Device Not Present

Sensor ID              : M2_SSD3 Temp (0x8e)
 Entity ID             : 7.50
 Sensor Type (Threshold)  : Temperature
 Sensor Reading        :  Unable to read sensor: Device Not Present

Sensor ID              : CPU_FAN1 (0x41)
 Entity ID             : 29.1
 Sensor Type (Threshold)  : Fan
 Sensor Reading        : 420 (+/- 0) RPM
 Status                : ok
 Lower Non-Recoverable : na
 Lower Critical        : 140.000
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : na
 Upper Non-Recoverable : na
 Positive Hysteresis   : 140.000
 Negative Hysteresis   : 140.000
 Assertion Events      :
 Assertions Enabled    : lcr-
 Deassertions Enabled  : lcr-

Sensor ID              : CPU_FAN2 (0x42)
 Entity ID             : 29.2
 Sensor Type (Threshold)  : Fan
 Sensor Reading        : 420 (+/- 0) RPM
 Status                : ok
 Lower Non-Recoverable : na
 Lower Critical        : 140.000
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : na
 Upper Non-Recoverable : na
 Positive Hysteresis   : 140.000
 Negative Hysteresis   : 140.000
 Assertion Events      :
 Assertions Enabled    : lcr-
 Deassertions Enabled  : lcr-

Sensor ID              : SYS_FAN1 (0x43)
 Entity ID             : 29.3
 Sensor Type (Threshold)  : Fan
 Sensor Reading        : 1120 (+/- 0) RPM
 Status                : ok
 Lower Non-Recoverable : na
 Lower Critical        : 140.000
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : na
 Upper Non-Recoverable : na
 Positive Hysteresis   : 140.000
 Negative Hysteresis   : 140.000
 Assertion Events      :
 Assertions Enabled    : lcr-
 Deassertions Enabled  : lcr-

Sensor ID              : SYS_FAN2 (0x44)
 Entity ID             : 29.4
 Sensor Type (Threshold)  : Fan
 Sensor Reading        : 1120 (+/- 0) RPM
 Status                : ok
 Lower Non-Recoverable : na
 Lower Critical        : 140.000
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : na
 Upper Non-Recoverable : na
 Positive Hysteresis   : 140.000
 Negative Hysteresis   : 140.000
 Assertion Events      :
 Assertions Enabled    : lcr-
 Deassertions Enabled  : lcr-

Sensor ID              : SYS_FAN3 (0x45)
 Entity ID             : 29.5
 Sensor Type (Threshold)  : Fan
 Sensor Reading        : 1120 (+/- 0) RPM
 Status                : ok
 Lower Non-Recoverable : na
 Lower Critical        : 140.000
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : na
 Upper Non-Recoverable : na
 Positive Hysteresis   : 140.000
 Negative Hysteresis   : 140.000
 Assertion Events      :
 Assertions Enabled    : lcr-
 Deassertions Enabled  : lcr-

Sensor ID              : 12V (0x30)
 Entity ID             : 7.32
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 12.299 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 10.283
 Lower Critical        : 10.283
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 13.307
 Upper Non-Recoverable : 13.391
 Positive Hysteresis   : 0.084
 Negative Hysteresis   : 0.084
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : 5VCC (0x31)
 Entity ID             : 7.33
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 5.041 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 4.239
 Lower Critical        : 4.282
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 5.505
 Upper Non-Recoverable : 5.590
 Positive Hysteresis   : 0.042
 Negative Hysteresis   : 0.042
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : 3.3VCC (0x32)
 Entity ID             : 7.34
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 3.335 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 2.818
 Lower Critical        : 2.841
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 3.664
 Upper Non-Recoverable : 3.711
 Positive Hysteresis   : 0.024
 Negative Hysteresis   : 0.024
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : VBAT (0x33)
 Entity ID             : 40.0
 Sensor Type (Discrete): Battery
 States Asserted       : Battery
                         [Presence Detected]

Sensor ID              : VDD_5_DUAL (0x34)
 Entity ID             : 7.36
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 4.977 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 4.261
 Lower Critical        : 4.261
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 5.524
 Upper Non-Recoverable : 5.567
 Positive Hysteresis   : 0.042
 Negative Hysteresis   : 0.042
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : VDD_33_DUAL (0x35)
 Entity ID             : 7.37
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 3.380 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 2.816
 Lower Critical        : 2.839
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 3.662
 Upper Non-Recoverable : 3.709
 Positive Hysteresis   : 0.024
 Negative Hysteresis   : 0.024
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : 1.8V PCH (0x36)
 Entity ID             : 7.38
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 1.815 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 1.620
 Lower Critical        : 1.646
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 1.971
 Upper Non-Recoverable : 1.997
 Positive Hysteresis   : 0.013
 Negative Hysteresis   : 0.013
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : PVNN PCH (0x37)
 Entity ID             : 7.39
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 0.842 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 0.646
 Lower Critical        : 0.675
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 0.999
 Upper Non-Recoverable : 1.028
 Positive Hysteresis   : 0.010
 Negative Hysteresis   : 0.010
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : 1.05V PCH (0x38)
 Entity ID             : 7.40
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 1.040 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 0.864
 Lower Critical        : 0.894
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 1.227
 Upper Non-Recoverable : 1.246
 Positive Hysteresis   : 0.010
 Negative Hysteresis   : 0.010
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : 2.5V BMC (0x39)
 Entity ID             : 7.41
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 2.536 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 2.302
 Lower Critical        : 2.349
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 2.653
 Upper Non-Recoverable : 2.700
 Positive Hysteresis   : 0.023
 Negative Hysteresis   : 0.023
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : 1.8V BMC (0x3a)
 Entity ID             : 7.42
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 1.813 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 1.618
 Lower Critical        : 1.644
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 1.956
 Upper Non-Recoverable : 1.995
 Positive Hysteresis   : 0.013
 Negative Hysteresis   : 0.013
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : 1.2V BMC (0x3b)
 Entity ID             : 7.43
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 1.201 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 1.014
 Lower Critical        : 1.044
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 1.367
 Upper Non-Recoverable : 1.387
 Positive Hysteresis   : 0.010
 Negative Hysteresis   : 0.010
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : 1.0V BMC (0x3c)
 Entity ID             : 7.44
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 1.017 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 0.821
 Lower Critical        : 0.840
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 1.164
 Upper Non-Recoverable : 1.193
 Positive Hysteresis   : 0.010
 Negative Hysteresis   : 0.010
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : VDimmAB (0x3d)
 Entity ID             : 7.45
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 1.114 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 0.928
 Lower Critical        : 0.948
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 1.281
 Upper Non-Recoverable : 1.300
 Positive Hysteresis   : 0.010
 Negative Hysteresis   : 0.010
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : P_VCCIN_AUX_CPU (0x3e)
 Entity ID             : 7.46
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 1.797 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 1.621
 Lower Critical        : 1.641
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 1.964
 Upper Non-Recoverable : 2.003
 Positive Hysteresis   : 0.010
 Negative Hysteresis   : 0.010
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : 1.05V CPU (0x3f)
 Entity ID             : 7.47
 Sensor Type (Threshold)  : Voltage
 Sensor Reading        : 1.049 (+/- 0) Volts
 Status                : ok
 Lower Non-Recoverable : 0.853
 Lower Critical        : 0.883
 Lower Non-Critical    : na
 Upper Non-Critical    : na
 Upper Critical        : 1.206
 Upper Non-Recoverable : 1.226
 Positive Hysteresis   : 0.010
 Negative Hysteresis   : 0.010
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

Sensor ID              : Chassis Intru (0xaa)
 Entity ID             : 23.0
 Sensor Type (Discrete): Physical Security

petersulyok commented 7 months ago

Thanks for the feedback. I think we can agree that smfc runs on the AST2600 platform with minor differences in fan thresholds.
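
To see those threshold differences at a glance, the verbose output can be reduced to the thresholds that actually hold a value. This is a sketch: defined_thresholds is a hypothetical name, and it assumes the "Sensor ID" and "Lower/Upper ..." line layout shown in the outputs above.

```bash
#!/usr/bin/env bash

# From `ipmitool -v sensor` output on stdin, list every threshold that is
# defined (not "na") as "<sensor>: <threshold> = <value>".
# Assumption: the line layout matches the outputs posted in this issue.
# (defined_thresholds is a hypothetical helper name.)
defined_thresholds() {
    awk '
        /^Sensor ID/ {
            # Remember the current sensor name without the "(0x..)" suffix.
            id = $0
            sub(/^Sensor ID[ \t]*:[ \t]*/, "", id)
            sub(/[ \t]*\(0x[0-9a-fA-F]+\)[ \t]*$/, "", id)
            next
        }
        /^[ \t]*(Lower|Upper) (Non-Recoverable|Critical|Non-Critical)[ \t]*:/ {
            split($0, kv, ":")
            key = kv[1]; val = kv[2]
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (val != "na") print id ": " key " = " val
        }
    '
}

# Typical use (needs root and a BMC): ipmitool -v sensor | defined_thresholds | grep FAN
```

On the X13SAE-F output posted above, each fan sensor yields a single line for Lower Critical.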

JSouthGB commented 1 month ago

I made some tweaks to fan_measurement.sh for my X13SAE-F, because the output wasn't accurate due to fan name differences. This should allow for variations in fan names. I also have an X10 board to test, but TrueNAS doesn't let me use ipmitool from the terminal and I haven't investigated how to work around that yet.

#!/usr/bin/env bash

#   fan_measurement.sh (C) 2021-2024, Peter Sulyok
#   This script measures the rotation speed at each IPMI fan level.
#   Results are stored in 'fan_result.csv'.

# This script must be executed by root.
if [ "$EUID" -ne 0 ]; then
    echo "ERROR: Please run as root"
    exit 1
fi

output_file="fan_result.csv"

# get list of fans
fan_data=$(ipmitool sdr list | grep FAN)

# double check list isn't empty
if [ -z "$fan_data" ]; then
    echo "No fan data found."
    exit 1
fi

# get actual fan names since they can differ (one name per line, no word-splitting surprises)
mapfile -t fan_names < <(awk '{print $1}' <<< "$fan_data")

# write CSV header
{
    echo -n "Level"
    for fan_name in "${fan_names[@]}"; do
        echo -n ",$fan_name"
    done
    echo
} > "$output_file"

echo "IPMI fan level measurement:"

# Start measurement in 100-20 IPMI fan level interval.
for i in 100 95 90 85 80 75 70 65 60 55 50 45 40 35 30 25 20; do
    ./set_ipmi_fan_level.sh cpu $i >/dev/null
    ./set_ipmi_fan_level.sh hd $i >/dev/null
    sleep 6

    ipmitool sdr > sensor_data.txt

    fan_speeds=()
    for fan_name in "${fan_names[@]}"; do
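        # match the fan name exactly so that e.g. FAN1 does not also match FAN10
        speed=$(awk -F'|' -v name="$fan_name" '{ n = $1; gsub(/^ +| +$/, "", n) } n == name { split($2, f, " "); print f[1]; exit }' sensor_data.txt)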
        fan_speeds+=("$speed")
    done

    # write results to CSV file
    {
        echo -n "$i"
        for speed in "${fan_speeds[@]}"; do
            echo -n ",$speed"
        done
        echo
    } >> "$output_file"

    echo "Fan level: $i% done"
done

rm sensor_data.txt

echo "Fan speeds have been written to $output_file"

petersulyok commented 4 weeks ago

Hi @JSouthGB, thanks for the suggestion. I like the automatic fan name detection in the updated script. I'll include this in the smfc release, too.