Closed svmlegacy closed 1 year ago
Thanks
Probably we have to scan for a second Controller. I will add this in a testing branch.
Is DDR4 Speed 2133 MT/s OK for you?
The following is empty. Maybe there's an erratum.
|- MONITOR/MWAIT
|- State index: #0 #1 #2 #3 #4 #5 #6 #7
|- Sub C-State: 0 0 0 0 0 0 0 0
Topology has two CCX IDs of 0 and 2 rather than 1!
There is a Zen1 erratum on the instruction counter. That's why you read erratic values.
In function AMD_DataFabric_Zeppelin(), can you please replace the umc_max from 1 to 2:
https://github.com/cyring/CoreFreq/blob/a0eeedae23df8d4e4326bbc06a6b6f396ed9449e/corefreqk.c#L6592
Edit: I have fixed it since the first answer.
static PCI_CALLBACK AMD_DataFabric_Zeppelin(struct pci_dev *pdev)
{
	/* Whitehaven (first-generation Threadripper): probe two Data Fabric devices. */
	if (strncmp(PUBLIC(RO(Proc))->Architecture,
		Arch[PUBLIC(RO(Proc))->ArchID].Architecture[CN_WHITEHAVEN],
		CODENAME_LEN) == 0)
	{
		return AMD_17h_DataFabric(	pdev,
					(const unsigned int[2][2]) {
						{ 0x0, 0x20},
						{0x10, 0x28}
					},
					0x30, 0x80,
					2, MC_MAX_CHA,	/* umc_max raised from 1 to 2 */
					(const unsigned int[]) {PCI_DEVFN(0x18, 0x0),
								PCI_DEVFN(0x19, 0x0)} );
	}
	else	/* Other Zeppelin parts: a single Data Fabric device. */
	{
		return AMD_17h_DataFabric(	pdev,
					(const unsigned int[2][2]) {
						{ 0x0, 0x20},
						{0x10, 0x28}
					},
					0x30, 0x80,
					1, MC_MAX_CHA,
					(const unsigned int[]) {PCI_DEVFN(0x18, 0x0)} );
	}
}
Rebuild, try the Memory Controller and post its output.
Also watch your kernel log for any message like the one below:
CoreFreq: AMD_17h_DataFabric()
Break UMC(%hu) probing @ PCI(0x%x:0x0:0x%x)
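To illustrate why raising umc_max matters, here is a minimal userspace sketch of a probe loop that walks a list of PCI device/function numbers and stops at the first missing device. It is not the CoreFreq implementation: count_controllers(), probe_fn, two_dies() and one_die() are hypothetical names made up for this example, and the printed message is illustrative only. The DEVFN() macro uses the same 5-bit slot and 3-bit function encoding as the kernel's PCI_DEVFN().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same encoding as the Linux kernel's PCI_DEVFN(): 5-bit slot, 3-bit function. */
#define DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define DEVFN_SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define DEVFN_FUNC(devfn)	((devfn) & 0x07)

/* Hypothetical callback: returns non-zero if a device answers at devfn. */
typedef int (*probe_fn)(unsigned int devfn);

/* Probe at most umc_max devices from the list; stop at the first absent one.
 * Returns the number of controllers found. */
static unsigned short count_controllers(const unsigned int devfn[],
					unsigned short umc_max,
					probe_fn present)
{
	unsigned short mc;

	assert(devfn != NULL && present != NULL);
	for (mc = 0; mc < umc_max; mc++) {
		if (!present(devfn[mc])) {
			/* Illustrative message, not the driver's log format. */
			fprintf(stderr, "stop at controller %hu (slot 0x%x, func 0x%x)\n",
				mc, DEVFN_SLOT(devfn[mc]), DEVFN_FUNC(devfn[mc]));
			break;
		}
	}
	return mc;
}

/* Mock system with devices at slots 0x18 and 0x19 (assumed two-die part). */
static int two_dies(unsigned int devfn)
{
	return devfn == DEVFN(0x18, 0x0) || devfn == DEVFN(0x19, 0x0);
}

/* Mock system with a single device at slot 0x18 (assumed one-die part). */
static int one_die(unsigned int devfn)
{
	return devfn == DEVFN(0x18, 0x0);
}
```

With the list { DEVFN(0x18, 0x0), DEVFN(0x19, 0x0) }, a limit of 1 reports one controller even on the two-die mock, while a limit of 2 reports two; on the one-die mock the loop breaks early and still reports one.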
Using the code change above, can you also show me the Memory Controller output of your Ryzen 7 1700X?
This memory is currently running at 2133 MT/s, so the measurement is valid.
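For readers comparing the MHz and MT/s figures: DDR memory transfers data on both clock edges, so the transfer rate is twice the memory clock. A small sketch of that arithmetic follows; ddr_transfer_rate() is a hypothetical helper, and the 1066.67 MHz input mentioned below is an assumed unrounded clock, since the tool's internal rounding is not shown in this thread.

```c
#include <assert.h>

/* DDR transfers data on both clock edges: rate (MT/s) = 2 x clock (MHz).
 * The result is rounded to the nearest integer. */
static unsigned int ddr_transfer_rate(double clock_mhz)
{
	assert(clock_mhz >= 0.0);
	return (unsigned int)(2.0 * clock_mhz + 0.5);
}
```

Calling ddr_transfer_rate(1066.67) gives 2133, which is consistent with a displayed 1066 MHz clock and a DDR4 speed of 2133 MT/s.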
Modified code is producing expected results:
$ ./corefreq-cli -M
Zen UMC [1460]
Controller #0 Dual Channel
Bus Rate 1066 MHz Bus Speed 1066 MHz DDR4 Speed 2133 MT/s
Cha CL RCDr RCDw RP RAS RC RRDs RRDl FAW WTRs WTRl WR clRR clWW
#0 15 15 15 15 36 51 4 6 23 3 8 16 3 3
#1 15 15 15 15 36 51 4 6 23 3 8 16 3 3
CWL RTP RdWr WrRd scWW sdWW ddWW scRR sdRR ddRR drRR drWW drWR drRRD
#0 11 8 9 0 1 5 5 1 3 3 0 0 0 0
#1 11 8 10 0 1 5 5 1 3 3 0 0 0 0
REFI RFC1 RFC2 RFC4 RCPB RPPB BGS:Alt Ban Page CKE CMD GDM ECC
#0 8316 312 192 132 0 0 OFF ON R0W0 0 6 1T OFF 0
#1 8316 312 192 132 0 0 OFF ON R0W0 0 6 1T OFF 0
MRD:PDA MOD:PDA WRMPR STAG PDM RDDATA WRD WRL RDL XS XP CPDED
#0 8 16 24 24 24 6 0:P:0 10 2 6 20 384 7 4
#1 8 16 24 24 24 6 0:P:0 10 2 6 22 384 7 4
DIMM Geometry for channel #0
Slot Bank Rank Rows Columns Memory Size (MB)
#0
#1 16 1 65536 1024 8192 CMT32GX4M4C3200C16
DIMM Geometry for channel #1
Slot Bank Rank Rows Columns Memory Size (MB)
#0
#1 16 1 65536 1024 8192 CMT32GX4M4C3200C16
Controller #1 Dual Channel
Bus Rate 1066 MHz Bus Speed 1066 MHz DDR4 Speed 2133 MT/s
Cha CL RCDr RCDw RP RAS RC RRDs RRDl FAW WTRs WTRl WR clRR clWW
#0 15 15 15 15 36 51 4 6 23 3 8 16 3 3
#1 15 15 15 15 36 51 4 6 23 3 8 16 3 3
CWL RTP RdWr WrRd scWW sdWW ddWW scRR sdRR ddRR drRR drWW drWR drRRD
#0 11 8 9 0 1 5 5 1 3 3 0 0 0 0
#1 11 8 10 0 1 5 5 1 3 3 0 0 0 0
REFI RFC1 RFC2 RFC4 RCPB RPPB BGS:Alt Ban Page CKE CMD GDM ECC
#0 8316 312 192 132 0 0 OFF ON R0W0 0 6 1T OFF 0
#1 8316 312 192 132 0 0 OFF ON R0W0 0 6 1T OFF 0
MRD:PDA MOD:PDA WRMPR STAG PDM RDDATA WRD WRL RDL XS XP CPDED
#0 8 16 24 24 24 6 0:P:0 10 2 6 20 384 7 4
#1 8 16 24 24 24 6 0:P:0 10 2 6 22 384 7 4
DIMM Geometry for channel #0
Slot Bank Rank Rows Columns Memory Size (MB)
#0
#1 16 1 65536 1024 8192 CMT32GX4M4C3200C16
DIMM Geometry for channel #1
Slot Bank Rank Rows Columns Memory Size (MB)
#0
#1 16 1 65536 1024 8192 CMT32GX4M4C3200C16
I'll get the 1700X's memory controller up in just a few minutes.
Not sure what's going on with the C-states. Motherboard does not have good options for them. (Or is the errata you mention the explanation for it?)
AMD Ryzen 7 1700X, same code:
$ ./corefreq-cli -M
Zen UMC [1460]
Controller #0 Dual Channel
Bus Rate 1066 MHz Bus Speed 1064 MHz DDR4 Speed 2129 MT/s
Cha CL RCDr RCDw RP RAS RC RRDs RRDl FAW WTRs WTRl WR clRR clWW
#0 15 15 15 15 36 51 4 6 23 3 8 16 3 3
#1 15 15 15 15 36 51 4 6 23 3 8 16 3 3
CWL RTP RdWr WrRd scWW sdWW ddWW scRR sdRR ddRR drRR drWW drWR drRRD
#0 11 8 9 0 1 6 6 1 4 4 0 0 0 0
#1 11 8 9 0 1 6 6 1 4 4 0 0 0 0
REFI RFC1 RFC2 RFC4 RCPB RPPB BGS:Alt Ban Page CKE CMD GDM ECC
#0 8316 374 278 171 0 0 OFF ON R0W0 0 6 1T OFF 0
#1 8316 374 278 171 0 0 OFF ON R0W0 0 6 1T OFF 0
MRD:PDA MOD:PDA WRMPR STAG PDM RDDATA WRD WRL RDL XS XP CPDED
#0 8 16 24 24 24 6 0:P:0 10 2 6 20 384 7 4
#1 8 16 24 24 24 6 0:P:0 10 2 6 20 384 7 4
DIMM Geometry for channel #0
Slot Bank Rank Rows Columns Memory Size (MB)
#0 16 1 65536 1024 8192 CMT32GX4M4C3200C16
#1
DIMM Geometry for channel #1
Slot Bank Rank Rows Columns Memory Size (MB)
#0 16 1 65536 1024 8192 CMT32GX4M4C3200C16
#1
Not sure what's going on with the C-states. Motherboard does not have good options for them. (Or is the errata you mention the explanation for it?)
About the missing Sub C-State values, I have no idea whether an erratum is the cause, but I presume one is. The first series of Ryzen had some issues with C-states. I remember reading it was better to sleep with the HALT instruction rather than MWAIT to prevent a freeze.
My guess is that CPUID returns zero Sub C-State values as a hint for the kernel idle function.
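As a sketch of where such a row of eight values can come from: on Intel's documented layout, CPUID leaf 5 (MONITOR/MWAIT) returns in EDX one 4-bit sub C-state count per state index, C0 in the lowest nibble. Whether CoreFreq reads exactly this register on Zen is an assumption here, and decode_mwait_substates() is a hypothetical helper for illustration. An EDX value of zero decodes to eight zeros, matching the empty row reported earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MWAIT_STATE_COUNT 8

/* Split EDX of CPUID leaf 5 into eight 4-bit sub C-state counts:
 * bits 3:0 -> state index #0, bits 7:4 -> #1, ... bits 31:28 -> #7. */
static void decode_mwait_substates(uint32_t edx,
				unsigned int sub[MWAIT_STATE_COUNT])
{
	assert(sub != NULL);
	for (size_t i = 0; i < MWAIT_STATE_COUNT; i++)
		sub[i] = (edx >> (4 * i)) & 0xF;
}
```

For example, an EDX value of 0x00001120 would decode to 0 2 1 1 0 0 0 0 for state indexes #0 to #7. That value is made up for the example and was not read from any processor in this thread.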
You can, however, register CoreFreq as the kernel CPU Idle handler; you can then select an idle method of your choice in the Settings menu. See wiki/CoreFreq as the Clock Source, CPU Freq and CPU Idle driver.
Keep an eye on voltage and power consumption to decide which method is appropriate and stable.
About the original Memory Controller issue, I will provide that code fix soon, including the EPYC and Zen+ TR multi-UMC cases too. I just need volunteers to run the non-regression tests on EPYC and other Threadripper processors.
Memory Controller fix is committed in 706460f852f10159eb1492df62cb8c060c74ecbc
I need a Naples test:
@munorc could you please run the latest commit with your EPYC and post the Memory Controller output here?
corefreq-cli -m
CPU Pkg Apic Core/Thread Caches (w)rite-Back (i)nclusive
# ID ID CCD CCX ID/ID L1-Inst Way L1-Data Way L2 Way L3 Way
000:BSP 0 0 0 0 0 64 4 32 8 512 8 i 16384 32w
001: 0 2 0 0 1 0 64 4 32 8 512 8 i 16384 32w
002: 0 4 0 0 2 0 64 4 32 8 512 8 i 16384 32w
003: 0 6 0 0 3 0 64 4 32 8 512 8 i 16384 32w
004: 1 16 1 2 8 0 64 4 32 8 512 8 i 16384 32w
005: 1 18 1 2 9 0 64 4 32 8 512 8 i 16384 32w
006: 1 20 1 2 10 0 64 4 32 8 512 8 i 16384 32w
007: 1 22 1 2 11 0 64 4 32 8 512 8 i 16384 32w
008: 0 1 0 0 0 1 64 4 32 8 512 8 i 16384 32w
009: 0 3 0 0 1 1 64 4 32 8 512 8 i 16384 32w
010: 0 5 0 0 2 1 64 4 32 8 512 8 i 16384 32w
011: 0 7 0 0 3 1 64 4 32 8 512 8 i 16384 32w
012: 1 17 1 2 8 1 64 4 32 8 512 8 i 16384 32w
013: 1 19 1 2 9 1 64 4 32 8 512 8 i 16384 32w
014: 1 21 1 2 10 1 64 4 32 8 512 8 i 16384 32w
015: 1 23 1 2 11 1 64 4 32 8 512 8 i 16384 32w
About CCX falling in a set of {0, 2}, I'm referring to the "AMD diagonal configuration" mentioned in this TechPowerUp article. That would mean there is no CCX number 1 or 3.
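A quick way to check which CCX IDs are actually in use is to fold the CCX column of the topology listing into a bitmask. This is only a sketch: ccx_id_mask() is a hypothetical helper, and the array is the CCX column copied by hand from the corefreq-cli -m output above.

```c
#include <assert.h>
#include <stddef.h>

/* Return a bitmask with bit N set when CCX ID N appears in the list. */
static unsigned int ccx_id_mask(const unsigned int ccx[], size_t n)
{
	unsigned int mask = 0;

	assert(ccx != NULL);
	for (size_t i = 0; i < n; i++) {
		assert(ccx[i] < 32);	/* keep the shift within range */
		mask |= 1u << ccx[i];
	}
	return mask;
}

/* CCX column of the topology listing, CPU 000 to 015. */
static const unsigned int ccx_by_cpu[16] = {
	0, 0, 0, 0, 2, 2, 2, 2,
	0, 0, 0, 0, 2, 2, 2, 2
};
```

For this listing the mask is 0x5: bits 0 and 2 are set, bits 1 and 3 are clear, which matches the set {0, 2}.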
I have received results from EPYC: no regression encountered.
https://github.com/cyring/CoreFreq/issues/388#issuecomment-1579679040
https://github.com/cyring/CoreFreq/issues/388#issuecomment-1579694814
Genoa EPYC is still unknown to me; I have only just got Raphael results.
Feel free to close the issue.
Regards Cyril
2 of 4 memory channels shown (all 4 populated in this case)