Open tiburcillo opened 3 years ago
I was also wondering about this.
Currently doesn't work with the 5000 series, which isn't that surprising, but I was still hoping it would, since k10temp is just flat-out useless with Ryzen 5000 as well.
It worked perfectly with the 3800X on this same motherboard (X570 Taichi), which has the Nuvoton SuperIO chip.
It would be absolutely great if this worked with the 5000 series!
k10temp supports Zen3 from kernel >=5.10.
Yeah. It gives Tdie and Tctl temps. That's literally it.
zenpower gives detailed voltage and power draw readings. None of that is available in k10temp.
This is literally all you get for CPU in k10temp on 5.10 for Zen 3:
k10temp-pci-00c3
Adapter: PCI adapter
Tctl: +32.8°C
Tdie: +32.8°C
Pretty lackluster. There's a reason we're asking for zenpower support. I made my above comment while running 5.10, so I was already well aware of how well it "works" with 5.10.
Oh, no Vcore or Isoc etc. for ZEN3 in 5.10?
Well, I can try to add that support, but I'm not really familiar with that code. I can look at what 5.10 did, add the IDs, and then change the logic in zenpower_probe(). However, I cannot guarantee that it is accurate or will work.
Give me some minutes to figure that out :)
Yep. Tested it, no dice.
From skimming zenpower.c, it seems there's a lot of other areas where support would need to be added, just adding those few lines wouldn't seem to be enough (granted, my knowledge of how zenpower works is limited so this might not be the case).
But yeah, I get the exact same output as before.
zenpower-pci-00c3
Adapter: PCI adapter
Tdie: +73.5°C (high = +95.0°C)
Tctl: +73.5°C
And that much worked without the patch, too (meaning that replacing k10temp w/zenpower gave me the same info just named as zenpower instead of k10temp).
No, there is not much else, it just means the PLANE address is wrong for ZEN3 or the model IDs or both, and that includes the kernel itself. Someone with ZEN3 HW should report to lkml I guess.
There is no support whatsoever for fam 19h in zenpower before the patch, which means it got the defaults, and it seems to get the defaults even now with fam 19h added.
Btw, are you sure you ran rmmod zenpower before loading the patched one?
@gardotd426
I think I missed something. In my patch, change data->zen3 = true; to data->zen2 = true; just to test something. The address and calculation look the same on both zen2 and zen3, so it should not really matter.
I'm sure I loaded the right zenpower because I didn't even have it installed before this patch, I'd uninstalled it because it was useless, and was using k10temp. I rmmod-ed k10temp and loaded zenpower after installing. I'll try editing the patch and running again.
Same result, unfortunately. If I knew exactly what was missing I'd bug the guys @ lkml
@gardotd426
k10temp should have Vcore etc. I'll try to find out myself the right offsets for ZEN3, bc I think there is something missing even in mainline.
Unfortunately, I don't have a ZEN3 box yet, prices for a 5950x are way too insane right now :)
Hahah yeah trust me I get it, I was going for the 5900X but you can't buy one anywhere (and I refuse to encourage scalpers), and the only way I could even get the 5800X @ MSRP was through a Newegg combo deal (they aren't selling them individually hardly at all) w/ a 500GB Samsung 980 Pro even though all three of my NVME slots are already taken up with 1GB NVMEs, so I just sold the 980 Pro on ebay for like 10 bucks less than I paid for it.
I still might get a 5900X later for the cores, but a 5800X is perfectly fine and in gaming it's pretty much the same as the 5900X and it definitely doesn't bottleneck my RTX 3090.
If you need help or testing or anything like that I'm happy to do it
@gardotd426
Out of curiosity, what does the kernel report on the CPU?
Something like this should tell:
dmesg | grep CPU0: | grep smpboot
Output for 5900X: [ 0.111779] smpboot: CPU0: AMD Ryzen 9 5900X 12-Core Processor (family: 0x19, model: 0x21, stepping: 0x0)
[ 0.109997] smpboot: CPU0: AMD Ryzen 7 5800X 8-Core Processor (family: 0x19, model: 0x21, stepping: 0x0)
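For anyone who wants to pull those IDs out of the boot log programmatically, here is a minimal sketch. The helper name parse_smpboot_ids is made up for illustration and is not part of zenpower or the kernel; it only assumes the line has the "(family: ..., model: ..., stepping: ...)" shape shown in the two outputs above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of zenpower or the kernel): extracts the
 * family/model/stepping triple from one smpboot log line.
 * Returns 1 on success, 0 if the line does not contain the triple. */
static int parse_smpboot_ids(const char *line, unsigned *family,
                             unsigned *model, unsigned *stepping)
{
    assert(line && family && model && stepping);

    /* Skip the timestamp and CPU name; the IDs follow "(family:". */
    const char *ids = strstr(line, "(family:");
    if (!ids)
        return 0;

    /* %x accepts an optional 0x prefix, so "0x19" is read as hex 19. */
    return sscanf(ids, "(family: %x, model: %x, stepping: %x)",
                  family, model, stepping) == 3;
}
```

Fed either of the two lines above, this should yield family 0x19, model 0x21 and stepping 0x0, which are the values a driver would compare against boot_cpu_data.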
I think I see the bug :)
?
@gardotd426
give me a moment to create some theoretical patch just to see if it starts working.
Alrighty
?
Someone committed it with the stepping IDs :) But the switch needs the model.
Yeah, just tried out your idea, and it's now working. Copy and paste is broken on Firefox Wayland for some reason right now, but there's a heap of data now.
EDIT:
SVI2_Core: 1.55 V
SVI2_SoC: 1.48 V
Tdie: +44.6°C (high = +95.0°C)
Tctl: +44.6°C
Tccd1: +39.8°C
Tccd2: +38.0°C
SVI2_P_Core: 0.00 W
SVI2_P_SoC: 17.56 W
SVI2_C_Core: 0.00 A
SVI2_C_SoC: 15.87 A
What did you do?
Yes, it is broken in the kernel the same way.
I wondered why it falls through to the default code at all; that is because the switch(...) data is wrong.
@abucodonosor With your new patch, it now does something:
# sensors zenpower-*
zenpower-pci-00c3
Adapter: PCI adapter
SVI2_Core: 1.55 V
SVI2_SoC: 925.00 mV
Tdie: +30.4°C (high = +95.0°C)
Tctl: +30.4°C
Tccd1: +27.5°C
Tccd2: +29.0°C
SVI2_P_Core: 0.00 W
SVI2_P_SoC: 543.90 mW
SVI2_C_Core: 0.00 A
SVI2_C_SoC: 882.00 mA
Should we file a bug w/ the kernel?
@gardotd426
Yes, and the fix is simple for the kernel, this:
diff --git a/drivers/hwmon/k10temp.c b/drivers/hwmon/k10temp.c
index a250481b5a97..0b4e61bf90f7 100644
--- a/drivers/hwmon/k10temp.c
+++ b/drivers/hwmon/k10temp.c
@@ -541,7 +541,7 @@ static int k10temp_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		data->is_zen = true;
 
 		switch (boot_cpu_data.x86_model) {
-		case 0x0 ... 0x1:	/* Zen3 */
+		case 0x21:	/* Zen3 */
 			data->show_current = true;
 			data->svi_addr[0] = F19H_M01_SVI_TEL_PLANE0;
 			data->svi_addr[1] = F19H_M01_SVI_TEL_PLANE1;
Someone may try and confirm k10temp working too :)
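To make the mismatch concrete, here is a small user-space sketch of the two switch variants from the diff. The enum and function names are hypothetical stand-ins for the driver's probe logic; only the case labels come from the diff. Note that `case 0x0 ... 0x1` is a GNU C case range, so this assumes GCC or Clang.

```c
#include <assert.h>

/* Hypothetical stand-ins for the probe logic; only the case labels
 * are taken from the diff in this thread. */
enum svi_path { SVI_DEFAULT, SVI_ZEN3 };

/* Before the patch: the range only covers models 0x0 and 0x1. */
static enum svi_path pick_path_before(unsigned model)
{
    switch (model) {
    case 0x0 ... 0x1:   /* GNU C case range (GCC/Clang extension) */
        return SVI_ZEN3;
    default:
        return SVI_DEFAULT;
    }
}

/* After the patch: model 0x21 is matched explicitly. */
static enum svi_path pick_path_after(unsigned model)
{
    switch (model) {
    case 0x21:
        return SVI_ZEN3;
    default:
        return SVI_DEFAULT;
    }
}
```

With model 0x21 (what the 5800X and 5900X report), pick_path_before() lands in the default branch and pick_path_after() takes the Zen3 branch. The diff as posted replaces the range rather than extending it, so in this sketch models 0x0 and 0x1 fall back to the default afterwards.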
So some offset may still be wrong. It may come from the ZEN2 code; I need to check, but I can't see what it is right now.
SVI2_P_Core: 0.00 W SVI2_C_Core: 0.00 A
Does this do something under load?
k10temp working:
k10temp-pci-00c3
Adapter: PCI adapter
Vcore: 1.55 V
Vsoc: 975.00 mV
Tctl: +53.2°C
Tdie: +53.2°C
Tccd1: +44.8°C
Tccd2: +40.5°C
Icore: 0.00 A
Isoc: 4.96 A
Looks to be a bit less data than Zenpower, though.
@aqxa1
Thx, so Icore, or SVI2_P_Core in zenpower, is wrong. Probably a wrong offset.
@gardotd426 that should be reported to the kernel people too.
I'll try to find the right one, but that is a pain with the current AMD documentation ;)
And yeah, neither of those does anything for me under load (P_Core and C_Core), but P_SoC and C_SoC are both active.
Is there a way we can verify the definition of F19H_M01H_SVI_TEL_PLANE0 and PLANE1?
No, under load Core remains at 0W and 0A but the values for SoC rise. From the reading I get, I'd guess that what is reported as SoC is actually the Core. I (foolishly) changed the definitions to
#define F19H_M01H_SVI_TEL_PLANE0 (F17H_M01H_SVI + 0x10)
#define F19H_M01H_SVI_TEL_PLANE1 (F17H_M01H_SVI + 0xC)
and now the Core and SoC voltages at least give the impression of being somewhat in the right area. The SoC wattage and amperage seem plausible, but the Core still reports 0W and 0A.
zenpower-pci-00c3
Adapter: PCI adapter
SVI2_Core: 932.00 mV
SVI2_SoC: 994.00 mV
Tdie: +30.2°C (high = +95.0°C)
Tctl: +30.2°C
Tccd1: +28.2°C
Tccd2: +28.2°C
SVI2_P_Core: 0.00 W
SVI2_P_SoC: 6.73 W
SVI2_C_Core: 0.00 A
SVI2_C_SoC: 6.77 A
Edit: my bad, it seems to work. Under load (one core) I get:
SVI2_Core: 963.00 mV
SVI2_SoC: 994.00 mV
Tdie: +30.8°C (high = +95.0°C)
Tctl: +30.8°C
Tccd1: +31.0°C
Tccd2: +30.5°C
SVI2_P_Core: 4.44 W
SVI2_P_SoC: 5.56 W
SVI2_C_Core: 4.61 A
SVI2_C_SoC: 5.59 A
Here's my output:
zenpower-pci-00c3
Adapter: PCI adapter
SVI2_Core: 1.55 V
SVI2_SoC: 1.47 V
Tdie: +34.5°C (high = +95.0°C)
Tctl: +34.5°C
Tccd1: +43.0°C
SVI2_P_Core: 0.00 W
SVI2_P_SoC: 10.76 W
SVI2_C_Core: 0.00 A
SVI2_C_SoC: 7.36 A
Yeah, those changes look fairly accurate now:
zenpower-pci-00c3
Adapter: PCI adapter
SVI2_Core: 1.25 V
SVI2_SoC: 988.00 mV
Tdie: +74.5°C (high = +95.0°C)
Tctl: +74.5°C
Tccd1: +71.5°C
Tccd2: +71.5°C
SVI2_P_Core: 132.50 W
SVI2_P_SoC: 6.06 W
SVI2_C_Core: 106.00 A
SVI2_C_SoC: 6.13 A
Values also look correct with k10temp, so definitely an oversight from the kernel devs.
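A quick sanity check on readings like these: assuming the reported wattage is simply voltage times current (P = V * I), the three values on each plane should agree. The sketch below uses hypothetical helper names and a tolerance chosen to absorb the two-decimal rounding in the sensors output.

```c
#include <assert.h>

/* Electrical power in watts from volts and amps: P = V * I. */
static double power_watts(double volts, double amps)
{
    assert(volts >= 0.0 && amps >= 0.0);
    return volts * amps;
}

/* Hypothetical helper: returns 1 if a reported wattage agrees with
 * V * I to within tol watts, 0 otherwise. */
static int power_is_consistent(double volts, double amps,
                               double reported_watts, double tol)
{
    double diff = power_watts(volts, amps) - reported_watts;
    if (diff < 0.0)
        diff = -diff;
    return diff <= tol;
}
```

For the Core plane above, 1.25 V times 106.00 A is 132.5 W, which matches the reported 132.50 W. The SoC plane (988.00 mV, 6.13 A, 6.06 W) also agrees to within rounding. This only shows that the readings are internally consistent, not that they are the true package power.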
Well, I trusted 'AMD' people who committed that to the kernel itself. One may think they should know what they are doing but ...
#define F19H_M01H_SVI_TEL_PLANE0 (F17H_M01H_SVI + 0x10)
#define F19H_M01H_SVI_TEL_PLANE1 (F17H_M01H_SVI + 0xC)
One can play with these, but they are exactly the other way around from generic ZEN, where PLANE0 is 0xc and PLANE1 is 0x10.
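Side by side, the two layouts look like this. This is only a sketch: the base address is an assumption (the value as I recall it from mainline k10temp.c, so check your own tree), the generic ZEN offsets are the ones stated in the comment above, and the F19H values are the experimental guess posted in this thread. planes_are_swapped() is a hypothetical helper for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: base address as recalled from mainline k10temp.c;
 * verify against your kernel tree before relying on it. */
#define F17H_M01H_SVI            0x0005A000u

/* Generic ZEN layout as stated in the thread. */
#define F17H_M01H_SVI_TEL_PLANE0 (F17H_M01H_SVI + 0xC)
#define F17H_M01H_SVI_TEL_PLANE1 (F17H_M01H_SVI + 0x10)

/* Experimental Zen 3 values from this thread (a guess by the poster). */
#define F19H_M01H_SVI_TEL_PLANE0 (F17H_M01H_SVI + 0x10)
#define F19H_M01H_SVI_TEL_PLANE1 (F17H_M01H_SVI + 0xC)

/* Hypothetical helper: 1 if the F19H planes are the F17H planes swapped. */
static int planes_are_swapped(void)
{
    uint32_t f17_p0 = F17H_M01H_SVI_TEL_PLANE0;
    uint32_t f17_p1 = F17H_M01H_SVI_TEL_PLANE1;
    uint32_t f19_p0 = F19H_M01H_SVI_TEL_PLANE0;
    uint32_t f19_p1 = F19H_M01H_SVI_TEL_PLANE1;

    return f19_p0 == f17_p1 && f19_p1 == f17_p0;
}
```

If the swap is right, it would explain why the earlier readings showed activity on SoC while Core stayed at zero: each label was being fed from the other plane's register.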
Wait how did you guys fix the wattage readings?
Cool, then, for now, my patch should be at least a workaround for you guys.
Code needs a bit of refactoring but this is not my call.
Also, we found the bug in k10temp so, fixed 2 things while looking at this :)
Thx everyone for testing :)
Seems to only work under load, or it needs a while to read something out.
But I've got that on my EPYC box also; sometimes this is ZERO until the box is doing something. However, that is ZEN1 :)
No worries, and thanks for looking into it.
The values were a pure guess on my side, so don't trust them in any way. Under full load the core voltage rises, so that looks okay. But P_Core is reported as 73W while the system consumes 203W at the wall plug. So something still seems wrong. But thanks for your work so far.
Mine only goes up to 30W under full load, so it's definitely not reading right :/
zenpower-pci-00c3
Adapter: PCI adapter
SVI2_Core: 1.55 V
SVI2_SoC: 1.43 V
Tdie: +62.9°C (high = +95.0°C)
Tctl: +62.9°C
Tccd1: +54.2°C
SVI2_P_Core: 0.00 W
SVI2_P_SoC: 24.44 W
SVI2_C_Core: 0.00 A
SVI2_C_SoC: 17.07 A
This is during a Geekbench benchmark while all cores were turboing at 4.8GHz (these chips are monsters), so yeah...
You need to set these:
#define F19H_M01H_SVI_TEL_PLANE0 (F17H_M01H_SVI + 0x10)
#define F19H_M01H_SVI_TEL_PLANE1 (F17H_M01H_SVI + 0xC)
My core values do seem correct on my system (up to 150W). I'm not sure Core includes the full package (but I could be wrong), so the values could be reported lower than the full power usage.
@aqxa1 What processor are you using? One or two CCDs?
@hattedsquirrel 5900x, two CCD. Have been testing by re-compiling Mesa.
@aqxa1
You are correct regarding the defines for PLANE{0,1}. I contacted someone who confirmed they are the same as for ZEN2, so both are wrong in mainline too.
Shall I create test3.patch ?
If you don't mind that'd be great
@aqxa1 Hm, ok. Same as mine. I can get it up to 74W reported for the cores while the plug power rises by 170W when loading up the cores. After subtracting conversion losses I'd expect the cores to consume between 120-140W, which would also match the PPT limit (142W for the whole package).
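The reasoning above can be turned into a rough plausibility check: dividing a core power figure by the rise in plug power gives the combined conversion efficiency (PSU plus VRM) that figure would imply. The function name is hypothetical, and this is only back-of-the-envelope arithmetic on the numbers in the comment, not a measurement method.

```c
#include <assert.h>

/* Hypothetical helper: fraction of the extra wall power that would have
 * to reach the cores for core_watts to be the true core consumption. */
static double implied_efficiency(double core_watts, double plug_delta_watts)
{
    assert(plug_delta_watts > 0.0);
    return core_watts / plug_delta_watts;
}
```

With a 170W rise at the plug, the expected 120-140W range corresponds to roughly 71% to 82% efficiency, while the reported 74W would imply only about 44%, which is why the reading looks too low.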
Would it be a lot of work to add support for the Zen3 family? I love zenpower on my 2700x, would be cool if my 5600x would also be supported.
Thanks, t