cyring / CoreFreq

CoreFreq : CPU monitoring and tuning software designed for 64-bit processors.
https://www.cyring.fr
GNU General Public License v2.0

[SOLVED] Incorrect memory XMP frequency #303

mxw39 closed this issue 2 years ago

mxw39 commented 2 years ago

Thanks for making this awesome tool. I'm truly impressed by the number of functionalities offered and the clean UI which presents them all (with resizable and repositionable floating windows even)!

The Memory Controller window doesn't appear to report the correct DRAM speed. My memory runs on an XMP profile and the BIOS shows 3600 MHz (MT/s), but CoreFreq shows 2667 MHz, the default DRAM speed supported by the CPU.

Can you help me understand why CoreFreq gives a different speed?

P.S. CAS Latency seems correct. I didn't closely check other timing readings.

cyring commented 2 years ago

Thank you for using CoreFreq. Can you provide some data, for example the output of:

corefreq-cli -s -m -M

mxw39 commented 2 years ago

Certainly.

% corefreq-cli -s -m -M
Processor                             [Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz]
|- Architecture                                                  [Coffee Lake/R]
|- Vendor ID                                                      [GenuineIntel]
|- Microcode                                                        [0x000000ea]
|- Signature                                                           [  06_9E]
|- Stepping                                                            [     13]
|- Online CPU                                                          [ 16/ 16]
|- Base Clock                                                          [100.000]
|- Frequency            (MHz)                      Ratio                        
                 Min    799.99                    <   8 >                       
                 Max   3599.95                    <  36 >                       
|- Factory                                                             [100.000]
                       3600                       [  36 ]                       
|- Performance                                                                  
   |- P-State                                                                   
                 TGT   4999.94                    <  50 >                       
   |- HWP                                                                       
                 Min   4999.94                    <  50 >                       
                 Max   4999.94                    <  50 >                       
                 TGT      AUTO                    <   0 >                       
|- Turbo Boost                                                         [ UNLOCK]
                  1C   4999.94                    <  50 >                       
                  2C   4999.94                    <  50 >                       
                  3C   4999.94                    <  50 >                       
                  4C   4999.94                    <  50 >                       
                  5C   4999.94                    <  50 >                       
                  6C   4999.94                    <  50 >                       
                  7C   4999.94                    <  50 >                       
                  8C   4999.94                    <  50 >                       
|- Uncore                                                              [ UNLOCK]
                 Min    799.99                    <   8 >                       
                 Max   4699.94                    <  47 >                       
|- TDP                                                           Level [  0:3  ]
   |- Programmable                                                     [ UNLOCK]
   |- Configuration                                                    [   LOCK]
   |- Turbo Activation                                                 [ UNLOCK]
             Nominal   3599.95                    [  36 ]                       
               Turbo      AUTO                    <   0 >                       

Instruction Set Extensions                                                      
|- 3DNow!/Ext [N/N]          ADX [Y]          AES [Y]  AVX/AVX2 [Y/Y] 
|- AVX512-F     [N]    AVX512-DQ [N]  AVX512-IFMA [N]   AVX512-PF [N] 
|- AVX512-ER    [N]    AVX512-CD [N]    AVX512-BW [N]   AVX512-VL [N] 
|- AVX512-VBMI  [N] AVX512-VBMI2 [N]  AVX512-VNMI [N]  AVX512-ALG [N] 
|- AVX512-VPOP  [N] AVX512-VNNIW [N] AVX512-FMAPS [N] AVX512-VP2I [N] 
|- AVX512-BF16  [N]  BMI1/BMI2 [Y/Y]         CLWB [N] CLFLUSH/O [Y/Y] 
|- CLAC-STAC    [Y]         CMOV [Y]    CMPXCHG8B [Y]  CMPXCHG16B [Y] 
|- F16C         [Y]          FPU [Y]         FXSR [Y]   LAHF-SAHF [Y] 
|- MMX/Ext    [Y/N] MON/MWAITX [Y/N]        MOVBE [Y]   PCLMULQDQ [Y] 
|- POPCNT       [Y]       RDRAND [Y]       RDSEED [Y]      RDTSCP [Y] 
|- SEP          [Y]          SHA [N]          SSE [Y]        SSE2 [Y] 
|- SSE3         [Y]        SSSE3 [Y]  SSE4.1/4A [Y/N]      SSE4.2 [Y] 
|- SERIALIZE    [N]      SYSCALL [Y]          SGX [Y]       RDPID [N] 

Features                                                                        
|- 1 GB Pages Support                                      1GB-PAGES   [Capable]
|- Advanced Configuration & Power Interface                     ACPI   [Capable]
|- Advanced Programmable Interrupt Controller                   APIC   [Capable]
|- Core Multi-Processing                                  CMP Legacy   [Missing]
|- L1 Data Cache Context ID                                  CNXT-ID   [Missing]
|- Direct Cache Access                                           DCA   [Missing]
|- Debugging Extension                                            DE   [Capable]
|- Debug Store & Precise Event Based Sampling               DS, PEBS   [Capable]
|- CPL Qualified Debug Store                                  DS-CPL   [Capable]
|- 64-Bit Debug Store                                         DTES64   [Capable]
|- Fast-String Operation                                Fast-Strings   [Capable]
|- Fused Multiply Add                                     FMA | FMA4   [Capable]
|- Hardware Lock Elision                                         HLE   [Missing]
|- Instruction Based Sampling                                    IBS   [Missing]
|- Long Mode 64 bits                                       IA64 | LM   [Capable]
|- LightWeight Profiling                                         LWP   [Missing]
|- Machine-Check Architecture                                    MCA   [Capable]
|- Memory Protection Extensions                                  MPX   [Capable]
|- Model Specific Registers                                      MSR   [Capable]
|- Memory Type Range Registers                                  MTRR   [Capable]
|- OS-Enabled Ext. State Management                          OSXSAVE   [Capable]
|- Physical Address Extension                                    PAE   [Capable]
|- Page Attribute Table                                          PAT   [Capable]
|- Pending Break Enable                                          PBE   [Capable]
|- Process Context Identifiers                                  PCID   [Capable]
|- Perfmon and Debug Capability                                 PDCM   [Capable]
|- Page Global Enable                                            PGE   [Capable]
|- Page Size Extension                                           PSE   [Capable]
|- 36-bit Page Size Extension                                  PSE36   [Capable]
|- Processor Serial Number                                       PSN   [Missing]
|- Resource Director Technology/PQE                            RDT-A   [Missing]
|- Resource Director Technology/PQM                            RDT-M   [Missing]
|- Restricted Transactional Memory                               RTM   [Missing]
|- Safer Mode Extensions                                         SMX   [Capable]
|- Self-Snoop                                                     SS   [Capable]
|- Supervisor-Mode Access Prevention                            SMAP   [Capable]
|- Supervisor-Mode Execution Prevention                         SMEP   [Capable]
|- Time Stamp Counter                                            TSC [Invariant]
|- Time Stamp Counter Deadline                          TSC-DEADLINE   [Capable]
|- TSX Force Abort MSR Register                            TSX-ABORT   [Missing]
|- TSX Suspend Load Address Tracking                       TSX-LDTRK   [Missing]
|- User-Mode Instruction Prevention                             UMIP   [Missing]
|- Virtual Mode Extension                                        VME   [Capable]
|- Virtual Machine Extensions                                    VMX   [Capable]
|- Extended xAPIC Support                                     x2APIC   [  xAPIC]
|- Execution Disable Bit Support                              XD-Bit   [Capable]
|- XSAVE/XSTOR States                                          XSAVE   [Capable]
|- xTPR Update Control                                          xTPR   [Capable]
Mitigation mechanisms                                                           
|- Indirect Branch Restricted Speculation                       IBRS   [ Enable]
|- Indirect Branch Prediction Barrier                           IBPB   [Capable]
|- Single Thread Indirect Branch Predictor                     STIBP   [Capable]
|- Speculative Store Bypass Disable                             SSBD   [Capable]
|- Writeback & invalidate the L1 data cache                L1D-FLUSH   [Capable]
|- Hypervisor - No flush L1D on VM entry            L1DFL_VMENTRY_NO   [ Enable]
|- Architectural - Buffer Overwriting                       MD-CLEAR   [Capable]
|- Architectural - Rogue Data Cache Load                     RDCL_NO   [ Enable]
|- Architectural - Enhanced IBRS                            IBRS_ALL   [ Enable]
|- Architectural - Return Stack Buffer Alternate                RSBA   [Capable]
|- Architectural - Speculative Store Bypass                   SSB_NO   [Capable]
|- Architectural - Microarchitectural Data Sampling           MDS_NO   [ Enable]
|- Architectural - TSX Asynchronous Abort                     TAA_NO   [Capable]
|- Architectural - Page Size Change MCE               PSCHANGE_MC_NO   [Capable]
|- Architectural - STLB QoS                                     STLB   [Missing]
|- Architectural - Functional Safety Island                     FuSa   [Missing]
|- Architectural - RSM in CPL0 only                              RSM   [Missing]
|- Architectural - Split Locked Access Exception                SPLA   [Missing]
|- Architectural - Snoop Filter QoS Mask                SNOOP_FILTER   [Missing]

Technologies                                                                    
|- Data Cache Unit                                                              
   |- L1 Prefetcher                                                L1 HW   < ON>
   |- L1 IP Prefetcher                                          L1 HW IP   < ON>
   |- L2 Prefetcher                                                L2 HW   < ON>
   |- L2 Line Prefetcher                                        L2 HW CL   < ON>
|- System Management Mode                                       SMM-Dual   [ ON]
|- Hyper-Threading                                                   HTT   [ ON]
|- SpeedStep                                                        EIST   < ON>
|- Dynamic Acceleration                                              IDA   [ ON]
|- Turbo Boost                                                     TURBO   < ON>
|- Energy Efficiency Optimization                                    EEO   <OFF>
|- Race To Halt Optimization                                         R2H   <OFF>
|- Watchdog Timer                                                    TCO   < ON>
|- Virtualization                                                    VMX   [ ON]
   |- I/O MMU                                                       VT-d   [ ON]
   |- Version                                                     [         1.0]
   |- Hypervisor                                                           [OFF]
   |- Vendor ID                                                   [         N/A]

Performance Monitoring                                                          
|- Version                                                        PM       [  4]
|- Counters:          General                   Fixed                           
|                     4 x 48 bits             3 x 48 bits                       
|- Enhanced Halt State                                           C1E       <OFF>
|- C1 Auto Demotion                                              C1A       <OFF>
|- C3 Auto Demotion                                              C3A       <OFF>
|- C1 UnDemotion                                                 C1U       <OFF>
|- C3 UnDemotion                                                 C3U       <OFF>
|- C6 Core Demotion                                              CC6       <OFF>
|- C6 Module Demotion                                            MC6       <OFF>
|- Legacy Frequency ID control                                   FID       [OFF]
|- Legacy Voltage ID control                                     VID       [OFF]
|- P-State Hardware Coordination Feedback                MPERF/APERF       [ ON]
|- Hardware-Controlled Performance States                        HWP       < ON>
   |- Capabilities      (MHz)                      Ratio                        
              Lowest    100.00                    [   1 ]                       
           Efficient   1200.00                    [  12 ]                       
          Guaranteed   3599.99                    [  36 ]                       
             Highest   4999.99                    [  50 ]                       
|- Hardware Duty Cycling                                         HDC       <OFF>
|- Package C-States                                                             
   |- Configuration Control                                   CONFIG   [   LOCK]
   |- Lowest C-State                                           LIMIT   <     C0>
   |- I/O MWAIT Redirection                                  IOMWAIT   <Disable>
   |- Max C-State Inclusion                                    RANGE   <     C0>
|- Core C-States                                                                
   |- C-States Base Address                                      BAR   [ 0x0   ]
|- MONITOR/MWAIT                                                                
   |- State index:    #0    #1    #2    #3    #4    #5    #6    #7              
   |- Sub C-State:     0     2     1     2     4     1     1     1              
|- Core Cycles                                                         [Capable]
|- Instructions Retired                                                [Capable]
|- Reference Cycles                                                    [Capable]
|- Last Level Cache References                                         [Capable]
|- Last Level Cache Misses                                             [Capable]
|- Branch Instructions Retired                                         [Capable]
|- Branch Mispredicts Retired                                          [Capable]
|- Top-down slots Counter                                              [Capable]

Power, Current & Thermal                                                        
|- Clock Modulation                                             ODCM   <Disable>
   |- DutyCycle                                                        [  0.00%]
|- Power Management                                         PWR MGMT   [   LOCK]
   |- Energy Policy                                        Bias Hint   <      6>
   |- Energy Policy                                          HWP EPP   <      0>
|- Junction Temperature                                        TjMax   [ 0:100C]
|- Digital Thermal Sensor                                        DTS   [Capable]
|- Power Limit Notification                                      PLN   [Capable]
|- Package Thermal Management                                    PTM   [Capable]
|- Thermal Monitor 1                                             TM1   [ Enable]
|- Thermal Monitor 2                                             TM2   [Capable]
|- Thermal Design Power                                          TDP   [   95 W]
   |- Minimum Power                                              Min   [Missing]
   |- Maximum Power                                              Max   [Missing]
|- Thermal Design Power                                      Package   < Enable>
   |- Power Limit (28 sec)                                       PL1   < 4095 W>
   |- Power Limit (1 sec)                                        PL2   < 4095 W>
|- Thermal Design Power                                         Core   <Disable>
   |- Power Limit                                                PL1   [Missing]
|- Thermal Design Power                                       Uncore   <Disable>
   |- Power Limit                                                PL1   [Missing]
|- Thermal Design Power                                         DRAM   <Disable>
   |- Power Limit                                                PL1   [Missing]
   |- Power Limit (1 sec)                                        PL2   [   27 W]
|- Thermal Design Power                                     Platform   <Disable>
   |- Power Limit                                                PL1   [Missing]
   |- Power Limit                                                PL2   [Missing]
|- Electrical Design Current                                     EDC   [Missing]
|- Thermal Design Current                                        TDC   [Missing]
|- Units                                                                        
   |- Power                                               watt   [  0.125000000]
   |- Energy                                             joule   [  0.000061035]
   |- Window                                            second   [  0.000976562]
CPU Pkg  Apic  Core/Thread  Caches      (w)rite-Back (i)nclusive              
 #   ID   ID    ID     ID  L1-Inst Way  L1-Data Way      L2  Way      L3  Way 
000:BSP    0     0      0    32768  8     32768  8    262144  4  16777216 16 i
001:  0    2     1      0    32768  8     32768  8    262144  4  16777216 16 i
002:  0    4     2      0    32768  8     32768  8    262144  4  16777216 16 i
003:  0    6     3      0    32768  8     32768  8    262144  4  16777216 16 i
004:  0    8     4      0    32768  8     32768  8    262144  4  16777216 16 i
005:  0   10     5      0    32768  8     32768  8    262144  4  16777216 16 i
006:  0   12     6      0    32768  8     32768  8    262144  4  16777216 16 i
007:  0   14     7      0    32768  8     32768  8    262144  4  16777216 16 i
008:  0    1     0      1    32768  8     32768  8    262144  4  16777216 16 i
009:  0    3     1      1    32768  8     32768  8    262144  4  16777216 16 i
010:  0    5     2      1    32768  8     32768  8    262144  4  16777216 16 i
011:  0    7     3      1    32768  8     32768  8    262144  4  16777216 16 i
012:  0    9     4      1    32768  8     32768  8    262144  4  16777216 16 i
013:  0   11     5      1    32768  8     32768  8    262144  4  16777216 16 i
014:  0   13     6      1    32768  8     32768  8    262144  4  16777216 16 i
015:  0   15     7      1    32768  8     32768  8    262144  4  16777216 16 i
                            Cannon Point  [3E30]                           
Controller #0                                                Dual Channel  
 Bus Rate  8000 MT/s      Bus Speed 7999 MT/s          DRAM Speed 2667 MHz 

 Cha    CL  RCD   RP  RAS  RRD  RFC   WR RTPr WTPr  FAW  B2B  CWL CMD  REFI
  #0    16   18   18   38    9  631   24   12   44   38    0   16  2T 14040
  #1    16   18   18   38    9  631   24   12   44   38    0   16  2T 14040
      sgRR dgRR drRR ddRR      sgRW dgRW drRW ddRW      sgWR dgWR drWR ddWR
  #0     7    4    6    6        12   12   12   12        36   27   10   10
  #1     7    4    6    6        12   12   12   12        36   27   10   10
      sgWW dgWW drWW ddWW                                         CKE   ECC
  #0     7    4   10   10                                           4    0 
  #1     7    4   10   10                                           4    0 

 DIMM Geometry for channel #0                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0    16    2     65536      1024          32768                    
       #1                                                                  
 DIMM Geometry for channel #1                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0    16    2     65536      1024          32768                    
       #1

cyring commented 2 years ago

Can you print or debug (gdb) the DMFC value at the following switch line:

https://github.com/cyring/CoreFreq/blob/478eee81930e1c339f13787a17d7d0ffe2231e2d/corefreqd.c#L3937

For instance, if DMFC is equal to zero then Controller Speed is set to 2667. You can replace it with your BIOS frequency.

mxw39 commented 2 years ago

Breaking at the switch line shows DMFC is 0b1. It leads to the code that assigns 2667 to the control speed.

But I doubt replacing the frequency assigned here is getting to the bottom of it. The memory is overclocked in BIOS, which is where the 3600 MHz originates. Is it possible to read the running clock speed of the memory from BIOS or some CPU registers at all?

cyring commented 2 years ago

> Breaking at the switch line shows DMFC is 0b1. It leads to the code that assigns 2667 to the control speed.
>
> But I doubt replacing the frequency assigned here is getting to the bottom of it. The memory is overclocked in BIOS, which is where the 3600 MHz originates. Is it possible to read the running clock speed of the memory from BIOS or some CPU registers at all?

The i9-9900K is part of the 9th generation. In my Wiki/Documentation, the link "Intel® Core™ Processors Technical Resources" leads to the Intel datasheets, where we can download or preview volume two of the "9th/8th Generation Intel® Core™ Processor Family" datasheet.

[Screenshot: 2021-12-07-080645_670x217_scrot]

cyring commented 2 years ago

EDIT: Some registers published later for TGL, such as the DATA_RATE_GEAR1 field, may help:

https://github.com/cyring/CoreFreq/blob/478eee81930e1c339f13787a17d7d0ffe2231e2d/intelmsr.h#L3498

At this location ... https://github.com/cyring/CoreFreq/blob/478eee81930e1c339f13787a17d7d0ffe2231e2d/corefreqd.c#L3932 ... can you print the full hexadecimal value of the Cap_C register?

printf("%x\n", RO(Proc)->Uncore.Bus.SKL_Cap_C.value);

mxw39 commented 2 years ago

The value gets read twice on corefreqd startup. I get 0x2c000 for the SKL_Cap_C.value both times.

Assuming the lowest 4 bits (?) on SKL correspond to DATA_RATE_GEAR1 as on RKL, the speed would be 0 * 266 MHz, which doesn't seem to make sense.

cyring commented 2 years ago

I found this one in volume 2 of the datasheet:

7.91 PCU_CR_MC_BIOS_REQ_0_0_0_MCHBAR_PCU— Offset 5E00h

This register allows BIOS to request Memory Controller clock frequency.

The encoding of this field is the 133/266 MHz multiplier for DCLK/QCLK

cyring commented 2 years ago

To peek at the above register, add the lines below at the beginning of the function Query_SKL_IMC in the corefreqk.c driver code:

https://github.com/cyring/CoreFreq/blob/478eee81930e1c339f13787a17d7d0ffe2231e2d/corefreqk.c#L4104

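void Query_SKL_IMC(void __iomem *mchmap, unsigned short mc)
{   /*Source: 6th & 7th Generation Intel® Processor for S-Platforms Vol 2*/
    unsigned int MC_BIOS_REQ = 0;   /* debug only: raw MC_BIOS_REQ value */
    unsigned short cha;

    /* Read PCU_CR_MC_BIOS_REQ at MCHBAR offset 5E00h and log it to dmesg. */
    MC_BIOS_REQ = readl(mchmap + 0x5e00);
    printk(KERN_INFO "MC_BIOS_REQ[%x]\n", MC_BIOS_REQ);
    /* the existing body of Query_SKL_IMC continues here */

mxw39 commented 2 years ago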

Thanks for the pointer to the datasheet.

I get 0x10d from mchmap+0x5e00. This calculation would put the frequency at 13 * 266.67 MHz == 3466 MHz, which differs from the BIOS setting. DRAM kits marketed at 3466 MHz exist, but my sticks are rated at 3600 MHz.
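
The calculation above can be written out as a small C sketch. It assumes, as this comment does, that the low 4 bits of MC_BIOS_REQ are a multiplier of a 266.67 MHz step; the function name mc_bios_req_to_mhz is hypothetical and not part of CoreFreq.

```c
#include <stdint.h>

/* Assumption from this thread: the low 4 bits of MC_BIOS_REQ are a
 * multiplier of a 266.67 MHz step. Hypothetical helper, not CoreFreq code. */
static inline unsigned int mc_bios_req_to_mhz(uint32_t mc_bios_req)
{
    const uint64_t step_hz = 266666667ULL;       /* 266.67 MHz, in Hz */
    uint64_t multiplier = mc_bios_req & 0xFu;    /* keep bits 3:0 only */

    /* Integer division truncates the fractional MHz. */
    return (unsigned int)((multiplier * step_hz) / 1000000ULL);
}
```

With the value read here, mc_bios_req_to_mhz(0x10d) masks down to 13 and yields 3466, which is the mismatch with the 3600 MHz BIOS setting noted above.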

But I did find section 7.81 -- System Agent performance status. The lowest 8 bits are the DDR QCLK reference base and the ratio. I get 0x2c000a1b from mchmap+0x5918. This puts reference base == 0 (133 MHz) and the ratio 27, which gets the final frequency to be 3591 MHz, within error margin of 3600.

Do you think it makes sense to calculate DRAM frequency in this manner?
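
As a sketch of the decoding described above, assuming bit 7 of the value at offset 0x5918 is the QCLK reference and bits 6:0 are the ratio. The struct and function names are made up for illustration, and reading reference == 1 as 100 MHz is an assumption taken from the measurements further down this thread, not a quote from the datasheet.

```c
#include <stdint.h>

/* Hypothetical view of the low 8 bits of SA_PERF_STAT (MCHBAR + 0x5918). */
struct qclk_status {
    unsigned int ratio;       /* bits 6:0, QCLK ratio */
    unsigned int ref_is_100;  /* bit 7: 0 = 133.33 MHz, 1 = 100 MHz (assumed) */
};

static inline struct qclk_status decode_sa_perf_stat(uint32_t reg)
{
    struct qclk_status s;

    s.ratio = reg & 0x7Fu;             /* 7-bit ratio field */
    s.ref_is_100 = (reg >> 7) & 0x1u;  /* single reference-select bit */
    return s;
}
```

For 0x2c000a1b the low byte is 0x1b, so this gives ratio 27 with the reference bit clear, matching the reading above.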

cyring commented 2 years ago

This is excellent 👍

With QCLK_REFERENCE == 0, the reference is 133.33 MHz, so the calculation should read:

DRAM = (133333333 x 27) / 1000000

DRAM = 3600 MHz
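
Note that the expression above, evaluated with integer arithmetic, truncates to 3599 rather than 3600, because 133333333 x 27 = 3599999991. The sketch below shows both variants; the helper names are hypothetical and this does not claim to be how CoreFreq computes the value. The 64-bit intermediate is there so that larger ratios cannot overflow a 32-bit product.

```c
#include <stdint.h>

/* Truncating conversion: drops the fractional MHz. */
static inline unsigned int qclk_mhz_truncated(uint32_t ref_hz, uint32_t ratio)
{
    return (unsigned int)(((uint64_t)ref_hz * ratio) / 1000000ULL);
}

/* Rounding conversion: adds half a MHz before dividing. */
static inline unsigned int qclk_mhz_rounded(uint32_t ref_hz, uint32_t ratio)
{
    return (unsigned int)(((uint64_t)ref_hz * ratio + 500000ULL) / 1000000ULL);
}
```

With ref_hz = 133333333 and ratio = 27, the truncated form gives 3599 and the rounded form gives 3600.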

cyring commented 2 years ago

EDIT: In addition to the above requests, can you also try the standard (non-overclocked) DRAM frequency in the BIOS, and check the resulting ratios at offset 0x5918?

This will help us define a fallback strategy, that is, whether the current switch-cases should still be used when offset 0x5918 is not meaningful.

mxw39 commented 2 years ago

That sounds like a plan. Thanks Cyril!

I read 7th gen datasheet and 0x5918 is still valid.

I have a put-together 7700K system and a spare 6700K CPU (hopefully still alive). Let me run several DRAM frequencies on my current 9900K to determine whether the reference * ratio formula is always valid. Then I'll test the same on the 7700K system.

cyring commented 2 years ago

@mxw39 : You can now pull the develop branch to test the DRAM frequency.

Remarks:

  1. Make sure to clean any previous builds.
    make clean all
  2. This new code applies to SKYLAKE, KABYLAKE, COFFEELAKE, WHISKEYLAKE, COMETLAKE, ICELAKE (but not COMETLAKE/U and COMETLAKE/H)

    Any non-regression tests are warmly welcomed from anyone.

mxw39 commented 2 years ago

Here are the results on the 9900K system. You can see a pattern that the base and ratio gets lower by 1 decrement respectively in SA_PERF_STAT. MC_BIOS_REQ does not quite make sense in this result set. Perhaps my BIOS doesn't implement this interface correctly. At one point it was 0x0 for the lower meaningful 4 bits.

I tried all settings from 3200 - 3600, and the default, non-OC settings.

BIOS freq   MC_BIOS_REQ     SA_PERF_STAT
3600        0x10d           0x2c000a1b
3500        0x131           0x2c000aa3
3467        0xd             0x2c000a1a
3400        0x31            0x2c000aa2
3333        0x10c           0x2c000a19
3000        0x130           0x2c000aa1
3200        0xc             0x2c000a18

2667        0xa             0x2c000a14
2600        0x1d            0x2c000a9a
2600(auto)  0xa             0x2c000a14

2600(auto) is the BIOS auto frequency when all OC settings are removed. But clearly it is running 2667 because the register values are the same as when I set 2667 manually. This is perhaps a BIOS bug. CoreFreq develop branch calculates the correct value.

The 7700K system test will come later. I'll get my hands on it next week.

cyring commented 2 years ago

The ratio is not 4 but 7 bits wide. See this header file, based on the specs: https://github.com/cyring/CoreFreq/blob/4e02ecbf709bd25c3e64bf28641212f600a1f277/intelmsr.h#L3359

Can you please show the Memory Controller output of CoreFreq?

cyring commented 2 years ago

Sorry, my last answer was for EPYC. However, can you try the latest code from the develop branch?

mxw39 commented 2 years ago

Sorry this reply took a bit long.

I wasn't clear in my wording when I said decrement. I meant that as I go down the frequency choices offered by the BIOS, the base alternates between 100 and 133. When grouping all the 100-based ratios together, the ratio goes down by 1 at each step. The same holds for the 133-based ratios.

BIOS freq   MC_BIOS_REQ     SA_PERF_STAT    BASE    RATIO
3600        0x10d           0x2c000a1b      133     27
3500        0x131           0x2c000aa3      100     35
3467        0xd             0x2c000a1a      133     26
3400        0x31            0x2c000aa2      100     34
3333        0x10c           0x2c000a19      133     25
3000        0x130           0x2c000aa1      100     33
3200        0xc             0x2c000a18      133     24

2667        0xa             0x2c000a14      133     20
2600        0x1d            0x2c000a9a      100     26
2600(auto)  0xa             0x2c000a14      133     20

With the current BIOS 3600 MHz settings, I get

% ./corefreq-cli -M
                            Cannon Point  [3E30]                           
Controller #0                                                Dual Channel  
 Bus Rate  8000 MT/s      Bus Speed 7999 MT/s          DDR4 Speed 3599 MHz 

 Cha    CL  RCD   RP  RAS  RRD  RFC   WR RTPr WTPr  FAW  B2B  CWL CMD  REFI
  #0    16   18   18   38    9  631   24   12   44   38    0   16  2T 14040
  #1    16   18   18   38    9  631   24   12   44   38    0   16  2T 14040
      sgRR dgRR drRR ddRR      sgRW dgRW drRW ddRW      sgWR dgWR drWR ddWR
  #0     7    4    6    6        12   12   12   12        36   27   10   10
  #1     7    4    6    6        12   12   12   12        36   27   10   10
      sgWW dgWW drWW ddWW                                         CKE   ECC
  #0     7    4   10   10                                           4    0 
  #1     7    4   10   10                                           4    0 

 DIMM Geometry for channel #0                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0    16    2     65536      1024          32768                    
       #1                                                                  
 DIMM Geometry for channel #1                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0    16    2     65536      1024          32768                    
       #1

cyring commented 2 years ago

[Screenshot: 2021-12-17-073850_664x273_scrot]

BIOS freq    MC_BIOS_REQ   MC_BIOS_REQ AND 0b1111   SA_PERF_STAT   BASE   RATIO
3600         0x10d         0xd                      0x2c000a1b     133    27
3500         0x131         0x1                      0x2c000aa3     100    35
3467         0xd           0xd                      0x2c000a1a     133    26
3400         0x31          0x1                      0x2c000aa2     100    34
3333         0x10c         0xc                      0x2c000a19     133    25
3000         0x130         0x0                      0x2c000aa1     100    33
3200         0xc           0xc                      0x2c000a18     133    24

2667         0xa           0xa                      0x2c000a14     133    20
2600         0x1d          0xd                      0x2c000a9a     100    26
2600 (auto)  0xa           0xa                      0x2c000a14     133    20

cyring commented 2 years ago

Those are the problematic cases:

BIOS freq    MC_BIOS_REQ   MC_BIOS_REQ AND 0b1111   SA_PERF_STAT   BASE   RATIO
3000         0x130         0x0                      0x2c000aa1     100    33
2600 (auto)  0xa           0xa                      0x2c000a14     133    20

mxw39 commented 2 years ago

Sorry, the BIOS freq == 3000 line is my typo. It should have been 3300.

I set my memory to 3300 again to confirm the value in MC_BIOS_REQ is correct. It is indeed 0x130 which means MC_BIOS_REQ & 0b1111 == 0x0 and makes no sense.

SA_PERF_STAT is however correct and develop version of corefreq at 1cccd64351bf34043f2fb8efa4e5abfdf4c9c4e3 gives this output

% ./corefreq-cli -M
                            Cannon Point  [3E30]                           
Controller #0                                                Dual Channel  
 Bus Rate  8000 MT/s      Bus Speed 8000 MT/s          DDR4 Speed 3300 MHz 

 Cha    CL  RCD   RP  RAS  RRD  RFC   WR RTPr WTPr  FAW  B2B  CWL CMD  REFI
  #0    16   18   18   38    9  631   24   12   42   38    0   14  2T 12998
  #1    16   18   18   38    9  631   24   12   42   38    0   14  2T 12998
      sgRR dgRR drRR ddRR      sgRW dgRW drRW ddRW      sgWR dgWR drWR ddWR
  #0     7    4    6    6        12   12   12   12        32   25   10   10
  #1     7    4    6    6        12   12   12   12        32   25   10   10
      sgWW dgWW drWW ddWW                                         CKE   ECC
  #0     7    4   10   10                                           4    0 
  #1     7    4   10   10                                           4    0 

 DIMM Geometry for channel #0                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0    16    2     65536      1024          32768                    
       #1                                                                  
 DIMM Geometry for channel #1                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0    16    2     65536      1024          32768                    
       #1                                                                  

With MC_BIOS_REQ & 0b1111 == 0x0 I find it hard to believe the value is at all meaningful. Should SA_PERF_STAT(0x5918) be available I think it is advisable to use it. It gives the same result as the BIOS at all times.

The only exception is the auto setting with all overclocking disabled. However, I have demonstrated that the registers we are interested in hold the same values as at 2667. So I am convinced that the memory is really running at 2667 despite the BIOS showing 2600 when set to auto. It's not very hard to believe the BIOS isn't interfacing with MC_BIOS_REQ properly. A buggy BIOS may also explain why MC_BIOS_REQ is not making sense and why 2600 (auto) is really 2667.

About Skylake and its derivatives: I have gotten the 7700K system and should have some results for you today or tomorrow. This should raise our confidence that SA_PERF_STAT (0x5918) is available on 7th Gen desktop parts at least.

cyring commented 2 years ago

@mxw39 well done. I think the issue is almost fixed. Let's see how the SKL non-regression tests go.

mxw39 commented 2 years ago

Here are the results from 7700K. SA_PERF_STAT (0x5918) still exists and the formula is correct.

BIOS freq   MC_BIOS_REQ     SA_PERF_STAT    BASE    RATIO   corefreq-cli -M
3200(XMP)   0xc             0x29000a18      133     24      3199
3100        0x11f           0x29000a9f      100     31      3100
3000        0x1f            0x29000a9e      100     30      3000
2933        0xb             0x29000a16      133     22      2933
2900        0x11e           0x29000a9d      100     29      2900
2800        0x1e            0x29000a9c      100     28      2800
2133(auto)  0x8             0x29000a10      133     16      2133

cyring commented 2 years ago

Looks like this is the right register to decode XMP frequencies.
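
The 7700K readings above can be cross-checked mechanically. The sketch below assumes the layout discussed in this thread (bit 7 selects a 100 MHz or 133.33 MHz reference, bits 6:0 hold the ratio); all names are hypothetical, and the table is copied from the previous comment.

```c
#include <stddef.h>
#include <stdint.h>

struct sample {
    unsigned int bios_mhz;   /* frequency selected in the BIOS */
    uint32_t sa_perf_stat;   /* value read at MCHBAR + 0x5918 */
};

/* 7700K measurements reported in this thread. */
static const struct sample samples_7700k[] = {
    {3200, 0x29000a18}, {3100, 0x29000a9f}, {3000, 0x29000a9e},
    {2933, 0x29000a16}, {2900, 0x29000a9d}, {2800, 0x29000a9c},
    {2133, 0x29000a10},
};

/* Decode reference and ratio, then convert to MHz with rounding. */
static unsigned int sa_perf_stat_to_mhz(uint32_t reg)
{
    uint64_t ref_hz = ((reg >> 7) & 1u) ? 100000000ULL : 133333333ULL;
    uint64_t ratio = reg & 0x7Fu;

    return (unsigned int)((ref_hz * ratio + 500000ULL) / 1000000ULL);
}

/* Count rows whose decoded speed is more than 1 MHz away from the BIOS value. */
static size_t count_mismatches(void)
{
    size_t bad = 0;

    for (size_t i = 0; i < sizeof samples_7700k / sizeof samples_7700k[0]; i++) {
        unsigned int got = sa_perf_stat_to_mhz(samples_7700k[i].sa_perf_stat);
        unsigned int want = samples_7700k[i].bios_mhz;
        unsigned int diff = got > want ? got - want : want - got;

        if (diff > 1)
            bad++;
    }
    return bad;
}
```

Under these assumptions every row decodes to within 1 MHz of the BIOS setting, so count_mismatches() returns 0.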

The Wiki lists all the Skylake processors tested so far, starting with the 6600K: https://github.com/cyring/CoreFreq/wiki/CPU-support

The remaining work is to avoid a crash where access to SA_PERF_STAT is denied: what kind of capability bit can we query?

mxw39 commented 2 years ago

That's a very good question. I looked at the 8th Gen datasheet but didn't find anything useful other than the default value will be 0x0 if not valid. I don't know yet what will happen if 0x5918 is read on an unsupported arch.

The current fallback logic based on QCLK == 0 should be good. When the register defaults to 0x0, QCLK will be as well.
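
The fallback described above could look roughly like the following. This is a sketch only: dram_speed_mhz and fallback_mhz are hypothetical names, fallback_mhz stands in for whatever the existing switch-case logic would produce, and the bit layout is the one assumed throughout this thread.

```c
#include <stdint.h>

/* Prefer SA_PERF_STAT when its ratio field is non-zero; otherwise return
 * the caller-supplied fallback (for example the capability-based default). */
static unsigned int dram_speed_mhz(uint32_t sa_perf_stat, unsigned int fallback_mhz)
{
    uint64_t ratio = sa_perf_stat & 0x7Fu;   /* bits 6:0 */
    uint64_t ref_hz;

    if (ratio == 0)                          /* register reads as 0x0: not valid */
        return fallback_mhz;

    ref_hz = ((sa_perf_stat >> 7) & 1u) ? 100000000ULL : 133333333ULL;
    return (unsigned int)((ref_hz * ratio) / 1000000ULL);
}
```

A register value of 0x0 returns the fallback unchanged, while a populated register such as 0x2c000a14 is decoded directly.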

cyring commented 2 years ago

SKL_IMC() is the entry function. Every PCI device ID associated with this function in the following lists will reach the read at 0x5918:

https://github.com/cyring/CoreFreq/blob/d253207ee4fc5ce0ebe107b47a306e6b4468d95b/corefreqk.h#L2213

https://github.com/cyring/CoreFreq/blob/d253207ee4fc5ce0ebe107b47a306e6b4468d95b/corefreqk.h#L2250

cyring commented 2 years ago

Hello,

I'll be thankful if you can make a gist page with a collection of CoreFreq screenshots and outputs of your i9-9900K, using the fresh master version.

This page will be added to the Wiki CPU support list.

Thank you

mxw39 commented 2 years ago

Certainly. I'll let you know when it is ready!