ocerman / zenpower

Zenpower is a Linux kernel driver for reading temperature, voltage (SVI2), current (SVI2) and power (SVI2) for AMD Zen family CPUs.
GNU General Public License v2.0

SVI2_C_SoC Current is 0 on threadripper 2920x #3

Closed terroreek closed 4 years ago

terroreek commented 5 years ago

zenpower reports that the SoC current is 0. The SoC voltage also seems very low.

output from sensors;

zenpower-pci-00c3
Adapter: PCI adapter
SVI2_Core:    +1.00 V
SVI2_SoC:     +0.01 V
Tdie:        +35.6°C  (high = +70.0°C)
Tctl:        +62.6°C
SVI2_P_Core:  60.27 W
SVI2_P_SoC:    0.00 W
SVI2_C_Core: +60.27 A
SVI2_C_SoC:   +0.00 A

zenpower-pci-00cb
Adapter: PCI adapter
SVI2_Core:    +1.36 V
SVI2_SoC:     +0.01 V
Tdie:        +33.9°C  (high = +70.0°C)
Tctl:        +60.9°C
SVI2_P_Core:  14.10 W
SVI2_P_SoC:    0.00 W
SVI2_C_Core: +11.43 A
SVI2_C_SoC:   +0.00 A

ocerman commented 5 years ago

Hello, thanks for your report. I am not sure if I will be able to fix this, as I do not have access to that hardware.

Could you please check whether HWiNFO64 (https://www.hwinfo.com/download/) is able to report these values under Windows? If you do not have Windows installed, you can use a "live" version such as https://www.hirensbootcd.org/ together with the portable version of HWiNFO64.

It would be helpful if you could also send the debug log and a screenshot from HWiNFO64. First enable Debug Mode in "Settings", then check "Sensors only", click "Run" and let it run for a few minutes. Please take a screenshot after it starts running. After closing HWiNFO64, you should see the debug log "HWiNFO64.DBG" next to "HWiNFO64.exe".

terroreek commented 5 years ago

I am working on getting this to you, but it looks like hirensbootcd won't boot; it gets stuck at the Windows/UEFI boot screen. I'll try to get this to you in the next couple of days. After that I have to travel for work, so I might try using windows2go to get this info to you.

terroreek commented 5 years ago

OK, I apologize, I was traveling for work. Here are the capture and the debug log from HWiNFO64. Sorry, I forgot to check "Sensors only". Do you need me to re-run it?

Capture

HWiNFO64.DBG.zip

terroreek commented 5 years ago

OK, here it is with "Sensors only" checked.

2019-06-30 00_38_37-1 2019-06-30 00_39_15-HWiNFO64 v6 08-3830 Sensor Status

HWiNFO64.DBG.zip

ocerman commented 5 years ago

@terroreek thank you. I will look into it.

ocerman commented 5 years ago

@terroreek Hello, it seems that Threadripper CPUs are not reporting SoC values. If you are still around, could you please update to the new version and post debug data to #12?

terroreek commented 5 years ago

just did.

ocerman commented 4 years ago

Hello @terroreek, if you are still around, could you post the output of `lspci -nn | grep AMD`?

I have found evidence that Threadrippers do report SoC values, but only on one particular node (node #0). So now I need to figure out which PCI device corresponds to which node.
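As a rough illustration of the device-to-node question, the sketch below filters `lspci -nn` output for Data Fabric function 3 devices and guesses a node index from the PCI slot. It rests on two assumptions that the thread does not confirm: that the relevant device on these CPUs is `[1022:1463]`, and that node N sits at slot 0x18 + N. The names `DF_F3_ID`, `FIRST_NODE_SLOT` and `df_nodes` are made up for this example and are not part of zenpower.

```python
import re

# Assumption: Data Fabric function 3 on Family 17h (Models 00h-0fh).
DF_F3_ID = "1022:1463"
# Assumption: node N is exposed at PCI slot 0x18 + N.
FIRST_NODE_SLOT = 0x18

# Matches "bb:ss.f <description> [vvvv:dddd]" as printed by `lspci -nn`.
LINE_RE = re.compile(
    r"^(?P<bus>[0-9a-f]{2}):(?P<slot>[0-9a-f]{2})\.(?P<fn>[0-7])\s"
    r".*\[(?P<id>[0-9a-f]{4}:[0-9a-f]{4})\]"
)


def df_nodes(lspci_text):
    """Return {node_index: 'bb:ss.f'} for Data Fabric function 3 devices."""
    nodes = {}
    for line in lspci_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None or m.group("id") != DF_F3_ID:
            continue
        slot = int(m.group("slot"), 16)
        address = f"{m.group('bus')}:{m.group('slot')}.{m.group('fn')}"
        nodes[slot - FIRST_NODE_SLOT] = address
    return nodes


SAMPLE = """\
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
00:19.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
"""

print(df_nodes(SAMPLE))  # {0: '00:18.3', 1: '00:19.3'}
```

On a real system the text could come from `subprocess.run(["lspci", "-nn"], capture_output=True, text=True).stdout`; whether the slot order really matches the node numbering would still need to be checked against the hardware.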

terroreek commented 4 years ago

I have since upgraded my computer to a 3960X, but I still have the 2920X running on Unraid. Here is the output from that system: same motherboard and CPU, just a different OS. I would have to see how to install zenpower on that system.

00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) I/O Memory Management Unit [1022:1451]
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
00:19.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
00:19.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
00:19.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
00:19.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
00:19.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
00:19.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
00:19.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
00:19.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
01:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] X399 Series Chipset USB 3.1 xHCI Controller [1022:43ba] (rev 02)
01:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] X399 Series Chipset SATA Controller [1022:43b6] (rev 02)
01:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] X399 Series Chipset PCIe Bridge [1022:43b1] (rev 02)
02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
02:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
02:02.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
02:03.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
02:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
02:09.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
0a:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
0a:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
0a:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller [1022:145f]
0b:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
0b:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
0b:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]
40:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Root Complex [1022:1450]
40:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) I/O Memory Management Unit [1022:1451]
40:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
40:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
40:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
40:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
40:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
40:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
40:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
40:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
40:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
40:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
42:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Tobago PRO [Radeon R7 360 / R9 360 OEM] [1002:665f] (rev 81)
42:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Tobago HDMI Audio [Radeon R7 360 / R9 360 OEM] [1002:aac0]
43:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
43:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
43:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller [1022:145f]
44:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
44:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)

ocerman commented 4 years ago

I have pushed a new commit that should fix this issue. Valid SoC values are only reported on node #0, and valid Core values only on node #1. Edit: this should fix Zen/Zen+ Threadrippers and EPYCs.
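To relate this to the `sensors` output in this thread, here is a small sketch that decodes a chip name such as `zenpower-pci-00c3` into bus, slot and function, and then reports which rail should carry valid values. It assumes that lm-sensors encodes PCI chip addresses as `(bus << 8) + (slot << 3) + function`, and that node N sits at slot 0x18 + N; the helper names are hypothetical and this is not the driver's own logic.

```python
FIRST_NODE_SLOT = 0x18  # assumption: node N is at PCI slot 0x18 + N

# Per the comment above: SoC values are valid on node 0, Core on node 1.
VALID_RAIL_BY_NODE = {0: "SoC", 1: "Core"}


def decode_pci_chip(chip_name):
    """Split e.g. 'zenpower-pci-00c3' into (bus, slot, function)."""
    _prefix, bus_type, addr_hex = chip_name.rsplit("-", 2)
    if bus_type != "pci":
        raise ValueError(f"not a PCI chip name: {chip_name}")
    addr = int(addr_hex, 16)
    # Assumed layout: bus in bits 8-15, slot in bits 3-7, function in bits 0-2.
    return (addr >> 8) & 0xFF, (addr >> 3) & 0x1F, addr & 0x7


def valid_rail(chip_name):
    """Return 'SoC', 'Core', or None if the node is not 0 or 1."""
    _bus, slot, _fn = decode_pci_chip(chip_name)
    return VALID_RAIL_BY_NODE.get(slot - FIRST_NODE_SLOT)


for name in ("zenpower-pci-00c3", "zenpower-pci-00cb"):
    bus, slot, fn = decode_pci_chip(name)
    print(f"{name}: {bus:02x}:{slot:02x}.{fn} -> {valid_rail(name)}")
# zenpower-pci-00c3: 00:18.3 -> SoC
# zenpower-pci-00cb: 00:19.3 -> Core
```

Under these assumptions, the two chips in the report map to 00:18.3 and 00:19.3, so the 0 A SoC reading on `zenpower-pci-00cb` would come from a node that does not report SoC values.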

ocerman commented 4 years ago

It would be nice if any 1st/2nd gen Threadripper user ( @terroreek @sitic @SecretChief ) could test this.

terroreek commented 4 years ago

@ocerman I'll try to test that on my 2nd-gen TR on the weekend. On my 3960X, I get the following from zenpower:

zenpower-pci-00c3
Adapter: PCI adapter
SVI2_SoC:      1.08 V  
Tdie:         +49.1°C  (high = +95.0°C)
Tctl:         +49.1°C  
Tccd1:        +36.0°C  
SVI2_P_SoC:   24.20 W  
SVI2_C_SoC:   22.37 A  

sitic commented 4 years ago

TR 2950X, seems to work great, thanks!

zenpower-pci-00c3
Adapter: PCI adapter
SVI2_SoC:    825.00 mV 
Tdie:         +67.6°C  (high = +95.0°C)
Tctl:         +94.6°C  
SVI2_P_SoC:   18.45 W  
SVI2_C_SoC:   22.37 A  

zenpower-pci-00cb
Adapter: PCI adapter
SVI2_Core:     1.16 V  
Tdie:         +67.5°C  (high = +95.0°C)
Tctl:         +94.5°C  
SVI2_P_Core: 163.52 W  
SVI2_C_Core: 143.41 A 

asuswmisensors-isa-0000
Adapter: ISA adapter
CPU Core Voltage:          1.18 V  
CPU SOC Voltage:         850.00 mV 
DRAM AB Voltage:           1.16 V  
DRAM CD Voltage:           1.18 V  
1.8V PLL Voltage:          1.77 V  
+12V Voltage:             11.90 V  
+5V Voltage:               4.93 V  
3VSB Voltage:              3.29 V  
VBAT Voltage:              3.18 V  
AVCC3 Voltage:             3.29 V  
SB 1.05V Voltage:          1.06 V  
CPU Core Voltage:          1.12 V  
CPU SOC Voltage:         820.00 mV 
DRAM AB Voltage:           1.20 V  
DRAM CD Voltage:           1.20 V  
[...]
CPU Temperature:          +67.0°C  
CPU Socket Temperature:   +52.0°C  
[...]
CPU VRM Output Current:  132.00 A  

I'll note that asuswmisensors reports two sets of values via the UEFI's WMI interface, apparently because they come from two different sources; see electrified/asus-wmi-sensors#4.

ocerman commented 4 years ago

@sitic looks good, thanks.

I will leave this open for a while, and if no problems are reported in the next few days, I will close this issue.