cyring / CoreFreq

CoreFreq : CPU monitoring and tuning software designed for 64-bit processors.
https://www.cyring.fr
GNU General Public License v2.0

[Alder Lake] RAM geometry detected incorrectly #414

Closed: Technologicat closed this issue 1 year ago

Technologicat commented 1 year ago

Hi,

I have a CLEVO PD70PNN1 laptop with 32GB of DDR5 RAM (2 x 16GB), but CoreFreq reports only a single 16GB DIMM in the DIMM geometry. However, the Total RAM is reported correctly (32GB).
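As a rough cross-check of the mismatch, here is a minimal Python sketch. It assumes that a DIMM's capacity follows from its geometry as banks x ranks x rows x columns x 8 bytes (a 64-bit data width); that formula is an assumption for illustration and not necessarily how CoreFreq computes the size. The function names are hypothetical, and the numbers are copied from the report in this post.

```python
# Sketch: compare the decoded DIMM geometry with the kernel's total RAM.
# Assumption (not taken from CoreFreq's source): each column access moves
# 8 bytes, so size = banks * ranks * rows * columns * 8 bytes.

BYTES_PER_COLUMN = 8  # assumed 64-bit data width


def dimm_size_mb(banks, ranks, rows, columns):
    """Return the capacity in MB implied by one row of the geometry table."""
    size_bytes = banks * ranks * rows * columns * BYTES_PER_COLUMN
    return size_bytes // (1024 * 1024)


def missing_mb(total_ram_kb, dimms_mb):
    """Return the MB of kernel-reported RAM not covered by decoded DIMMs."""
    return total_ram_kb // 1024 - sum(dimms_mb)


# Values from the report: slot #0 of channel #0, and "Total RAM" in KB.
slot0 = dimm_size_mb(banks=16, ranks=1, rows=131072, columns=1024)
gap = missing_mb(32571228, [slot0])
print(slot0, gap)  # prints: 16384 15423
```

Under that assumption the single decoded slot accounts for 16384 MB, leaving roughly one more 16 GB module's worth of memory unaccounted for, which is consistent with the second controller being shown as disabled.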

Output of corefreq-cli -s -n -m -n -c 1 -n -k -n -B -n -M below.

If it matters, I'm running Linux Mint 21.

Is there any other information I can provide to help?

Processor                                 [12th Gen Intel(R) Core(TM) i7-12700H]
|- Architecture                                                   [Alder Lake/H]
|- Vendor ID                                                      [GenuineIntel]
|- Microcode                                                        [0x00000421]
|- Signature                                                           [  06_9A]
|- Stepping                                                            [      3]
|- Online CPU                                                          [ 20/ 20]
|- Base Clock                                                          [ 99.565]
|- Frequency            (MHz)                      Ratio                        
                 Min    398.33                    <   4 >                       
                 Max   2688.70                    <  27 >                       
|- Factory                                                             [100.000]
                       2700                       [  27 ]                       
|- Performance                                                                  
   |- P-State                                                                   
                 TGT   2190.79                    <  22 >                       
   |- HWP                                                                       
                 Min    597.49                    <   6 >                       
                 Max   3485.35                    <  35 >                       
                 TGT      AUTO                    <   0 >                       
|- Turbo Boost                                                         [ UNLOCK]
                  1C   4680.33                    <  47 >                       
                  2C   4680.33                    <  47 >                       
                  3C   4381.58                    <  44 >                       
                  4C   4381.58                    <  44 >                       
                  5C   4082.84                    <  41 >                       
                  6C   4082.84                    <  41 >                       
                  7C   4082.84                    <  41 >                       
                  8C   4082.84                    <  41 >                       
|- Hybrid                                                              [ UNLOCK]
                  1C   3484.83                    <  35 >                       
                  2C   3484.83                    <  35 >                       
                  3C   3484.83                    <  35 >                       
                  4C   3484.83                    <  35 >                       
                  5C   3285.70                    <  33 >                       
                  6C   3285.70                    <  33 >                       
                  7C   3285.70                    <  33 >                       
                  8C   3285.70                    <  33 >                       
|- Uncore                                                              [ UNLOCK]
                 Min    398.27                    <   4 >                       
                 Max   3982.67                    <  40 >                       
|- TDP                                                           Level <  0:3  >
   |- Programmable                                                     [ UNLOCK]
   |- Configuration                                                    [ UNLOCK]
   |- Turbo Activation                                                 [ UNLOCK]
             Nominal   2290.37                    [  23 ]                       
              Level1   1493.72                    [  15 ]                       
              Level2   2688.70                    [  27 ]                       
               Turbo   2190.79                    <  22 >                       

Instruction Set Extensions                                                      
|- 3DNow!/Ext [N/N]          ADX [Y]          AES [Y]  AVX/AVX2 [Y/Y] 
|- AVX512-F     [N]    AVX512-DQ [N]  AVX512-IFMA [N]   AVX512-PF [N] 
|- AVX512-ER    [N]    AVX512-CD [N]    AVX512-BW [N]   AVX512-VL [N] 
|- AVX512-VBMI  [N] AVX512-VBMI2 [N]  AVX512-VNNI [N]  AVX512-ALG [N] 
|- AVX512-VPOP  [N] AVX512-VNNIW [N] AVX512-FMAPS [N] AVX512-VP2I [N] 
|- AVX512-BF16  [N] AVX-VNNI-VEX [Y]      MOVDIRI [Y]   MOVDIR64B [Y] 
|- BMI1/BMI2  [Y/Y]         CLWB [Y]      CLFLUSH [Y] CLFLUSH-OPT [Y] 
|- CLAC-STAC    [Y]         CMOV [Y]    CMPXCHG8B [Y]  CMPXCHG16B [Y] 
|- F16C         [Y]          FPU [Y]         FXSR [Y]   LAHF-SAHF [Y] 
|- ENQCMD       [N]         GFNI [Y]        OSPKE [Y]     WAITPKG [Y] 
|- MMX/Ext    [Y/N] MON/MWAITX [Y/N]        MOVBE [Y]   PCLMULQDQ [Y] 
|- POPCNT       [Y]       RDRAND [Y]       RDSEED [Y]      RDTSCP [Y] 
|- SEP          [Y]          SHA [Y]          SSE [Y]        SSE2 [Y] 
|- SSE3         [Y]        SSSE3 [Y]  SSE4.1/4A [Y/N]      SSE4.2 [Y] 
|- SERIALIZE    [Y]      SYSCALL [Y]        RDPID [Y]         SGX [N] 
|- VAES         [Y]   VPCLMULQDQ [Y]   PREFETCH/W [Y]       LZCNT [Y] 

Features                                                                        
|- 1 GB Pages Support                                      1GB-PAGES   [Capable]
|- Advanced Configuration & Power Interface                     ACPI   [Capable]
|- Advanced Programmable Interrupt Controller                   APIC   [Capable]
|- APIC Timer Invariance                                        ARAT   [Capable]
|- Core Multi-Processing                                  CMP Legacy   [Missing]
|- L1 Data Cache Context ID                                  CNXT-ID   [Missing]
|- Direct Cache Access                                           DCA   [Missing]
|- Debugging Extension                                            DE   [Capable]
|- Debug Store & Precise Event Based Sampling               DS, PEBS   [Capable]
|- CPL Qualified Debug Store                                  DS-CPL   [Capable]
|- 64-Bit Debug Store                                         DTES64   [Capable]
|- Fast Short REP CMPSB                                         FSRC   [Missing]
|- Fast Short REP MOVSB                                         FSRM   [Capable]
|- Fast Short REP STOSB                                         FSRS   [Capable]
|- Fast Zero-length REP MOVSB                                   FZRM   [Missing]
|- Fast-String Operation                                        ERMS   [Capable]
|- Fused Multiply Add                                     FMA | FMA4   [Capable]
|- Hardware Feedback Interface                                   HFI   [Capable]
|- Hardware Lock Elision                                         HLE   [Missing]
|- History Reset                                              HRESET   [Capable]
|- Hybrid part processor                                      HYBRID   [Capable]
|- Instruction Based Sampling                                    IBS   [Missing]
|- Instruction INVPCID                                       INVPCID   [Capable]
|- Long Mode 64 bits                                       IA64 | LM   [Capable]
|- Linear Address Masking                                        LAM   [Missing]
|- LightWeight Profiling                                         LWP   [Missing]
|- Machine-Check Architecture                                    MCA   [Capable]
|- Memory Protection Extensions                                  MPX   [Missing]
|- Model Specific Registers                                      MSR   [Capable]
|- Memory Type Range Registers                                  MTRR   [Capable]
|- OS-Enabled Ext. State Management                          OSXSAVE   [Capable]
|- Physical Address Extension                                    PAE   [Capable]
|- Page Attribute Table                                          PAT   [Capable]
|- Pending Break Enable                                          PBE   [Capable]
|- Platform Configuration                                    PCONFIG   [Missing]
|- Process Context Identifiers                                  PCID   [Capable]
|- Perfmon and Debug Capability                                 PDCM   [Capable]
|- Page Global Enable                                            PGE   [Capable]
|- Page Size Extension                                           PSE   [Capable]
|- 36-bit Page Size Extension                                  PSE36   [Capable]
|- Processor Serial Number                                       PSN   [Missing]
|- Write Data to a Processor Trace Packet                    PTWRITE   [Capable]
|- Resource Director Technology/PQE                            RDT-A   [Missing]
|- Resource Director Technology/PQM                            RDT-M   [Missing]
|- Restricted Transactional Memory                               RTM   [Missing]
|- Safer Mode Extensions                                         SMX   [Capable]
|- Self-Snoop                                                     SS   [Capable]
|- Supervisor-Mode Access Prevention                            SMAP   [Capable]
|- Supervisor-Mode Execution Prevention                         SMEP   [Capable]
|- Thread Director                                                TD   [Capable]
|- Time Stamp Counter                                            TSC [Invariant]
|- Time Stamp Counter Deadline                          TSC-DEADLINE   [Capable]
|- TSX Force Abort MSR Register                            TSX-ABORT   [Missing]
|- TSX Suspend Load Address Tracking                       TSX-LDTRK   [Missing]
|- User-Mode Instruction Prevention                             UMIP   [Capable]
|- Virtual Mode Extension                                        VME   [Capable]
|- Virtual Machine Extensions                                    VMX   [Capable]
|- Write Back & Do Not Invalidate Cache                     WBNOINVD   [Missing]
|- Extended xAPIC Support                                     x2APIC   [ x2APIC]
|- Execution Disable Bit Support                              XD-Bit   [Capable]
|- XSAVE/XSTOR States                                          XSAVE   [Capable]
|- xTPR Update Control                                          xTPR   [Capable]
Mitigation mechanisms                                                           
|- Indirect Branch Restricted Speculation                       IBRS   [ Enable]
|- Indirect Branch Prediction Barrier                           IBPB   [Capable]
|- Single Thread Indirect Branch Predictor                     STIBP   [Capable]
|- Speculative Store Bypass Disable                             SSBD   [Capable]
|- Writeback & invalidate the L1 data cache                L1D-FLUSH   [Capable]
|- Hypervisor - No flush L1D on VM entry            L1DFL_VMENTRY_NO   [ Enable]
|- Arch - Buffer Overwriting                                MD-CLEAR   [Capable]
|- Arch - No Rogue Data Cache Load                           RDCL_NO   [ Enable]
|- Arch - Enhanced IBRS                                     IBRS_ALL   [ Enable]
|- Arch - Return Stack Buffer Alternate                         RSBA   [Capable]
|- Arch - No Speculative Store Bypass                         SSB_NO   [Capable]
|- Arch - No Microarchitectural Data Sampling                 MDS_NO   [ Enable]
|- Arch - No TSX Asynchronous Abort                           TAA_NO   [ Enable]
|- Arch - No Page Size Change MCE                     PSCHANGE_MC_NO   [ Enable]
|- Arch - STLB QoS                                              STLB   [ Enable]
|- Arch - Functional Safety Island                              FuSa   [ Enable]
|- Arch - RSM in CPL0 only                                       RSM   [Capable]
|- Arch - Split Locked Access Exception                         SPLA   [ Enable]
|- Arch - Snoop Filter QoS Mask                         SNOOP_FILTER   [ Enable]
|- Arch - No Fast Predictive Store Forwarding                   PSFD   [Capable]
|- Arch - Data Operand Independent Timing Mode                 DOITM   [Capable]
|- Arch - Not affected by SBDR or SSDP                  SBDR_SSDP_NO   [ Enable]
|- Arch - No Fill Buffer Stale Data Propagator              FBSDP_NO   [ Enable]
|- Arch - No Primary Stale Data Propagator                   PSDP_NO   [ Enable]
|- Arch - Overwrite Fill Buffer values                      FB_CLEAR   [Capable]
|- Arch - Special Register Buffer Data Sampling                SRBDS   [ Unable]
   |- RDRAND and RDSEED mitigation                             RNGDS   [ Unable]
   |- Restricted Transactional Memory                            RTM   [ Unable]
   |- Verify Segment for Writing instruction                    VERW   [ Unable]
|- Arch - Restricted RSB Alternate                             RRSBA   [ Enable]
|- Arch - No Branch Target Injection                          BHI_NO   [Capable]
|- Arch - Legacy xAPIC Disable                             XAPIC_DIS   [ Unable]
|- Arch - No Post-Barrier Return Stack Buffer               PBRSB_NO   [Capable]
|- Arch - IPRED disabled for CPL3                        IPRED_DIS_U   [Capable]
|- Arch - IPRED disabled for CPL0/1/2                    IPRED_DIS_S   [Capable]
|- Arch - RRSBA disabled for CPL3                        RRSBA_DIS_U   [Capable]
|- Arch - RRSBA disabled for CPL0/1/2                    RRSBA_DIS_S   [Capable]
|- Arch - BHI disabled for CPL0/1/2                        BHI_DIS_S   [Capable]
|- No MXCSR Configuration Dependent Timing                   MCDT_NO   [ Unable]
Security Features                                                               
|- CPUID Key Locker                                               KL   [Capable]
|- AES Key Locker instructions                                AESKLE   [Missing]
|- AES Wide Key Locker instructions                          WIDE_KL   [Capable]
|- Software Guard SGX1 Extensions                               SGX1   [Missing]
|- Software Guard SGX2 Extensions                               SGX2   [Missing]

Technologies                                                                    
|- Data Cache Unit                                                              
   |- L1 Prefetcher                                                L1 HW   < ON>
   |- L1 IP Prefetcher                                          L1 HW IP   < ON>
   |- L2 Prefetcher                                                L2 HW   < ON>
   |- L2 Line Prefetcher                                        L2 HW CL   < ON>
|- System Management Mode                                       SMM-Dual   [ ON]
|- Hyper-Threading                                                   HTT   [ ON]
|- SpeedStep                                                        EIST   < ON>
|- Dynamic Acceleration                                              IDA   [ ON]
|- Turbo Boost Max 3.0                                             TURBO   < ON>
|- Energy Efficiency Optimization                                    EEO   < ON>
|- Race To Halt Optimization                                         R2H   <OFF>
|- Watchdog Timer                                                    TCO   <OFF>
|- Virtualization                                                    VMX   [ ON]
   |- I/O MMU                                                       VT-d   [ ON]
   |- Version                                                     [         4.0]
   |- Hypervisor                                                           [OFF]
   |- Vendor ID                                                   [         N/A]

Performance Monitoring                                                          
|- Version                                                        PM       [  5]
|- Counters:          General                   Fixed                           
|           {  6,  0,  0 } x 48 bits            3 x 48 bits                     
|- Enhanced Halt State                                           C1E       < ON>
|- C1 Auto Demotion                                              C1A       < ON>
|- C3 Auto Demotion                                              C3A       <OFF>
|- C1 UnDemotion                                                 C1U       < ON>
|- C3 UnDemotion                                                 C3U       <OFF>
|- C6 Core Demotion                                              CC6       <OFF>
|- C6 Module Demotion                                            MC6       <OFF>
|- Legacy Frequency ID control                                   FID       [OFF]
|- Legacy Voltage ID control                                     VID       [OFF]
|- P-State Hardware Coordination Feedback                MPERF/APERF       [ ON]
|- Hardware Duty Cycling                                         HDC       [OFF]
|- Package C-States                                                             
   |- Configuration Control                                   CONFIG   [   LOCK]
   |- Lowest C-State                                           LIMIT   <     C0>
   |- I/O MWAIT Redirection                                  IOMWAIT   <Disable>
   |- Max C-State Inclusion                                    RANGE   <     C8>
|- Core C-States                                                                
   |- C-States Base Address                                      BAR   [ 0x1814]
|- ACPI Processor C-States                                      _CST   [      3]
|- MONITOR/MWAIT                                                                
   |- State index:    #0    #1    #2    #3    #4    #5    #6    #7              
   |- Sub C-State:     0     2     0     2     0     1     0     1              
|- Core Cycles                                                         [Capable]
|- Instructions Retired                                                [Capable]
|- Reference Cycles                                                    [Capable]
|- Last Level Cache References                                         [Capable]
|- Last Level Cache Misses                                             [Capable]
|- Branch Instructions Retired                                         [Capable]
|- Branch Mispredicts Retired                                          [Capable]
|- Top-down slots Counter                                              [Capable]
|- Processor Performance Control                                _PCT   [ Enable]
|- Performance Supported States                                 _PSS   [      0]
|- Performance Present Capabilities                             _PPC   [      0]

Power, Current & Thermal                                                        
|- Temperature Offset:Junction                                 TjMax <  2:100 C>
|- Clock Modulation                                             ODCM   <Disable>
   |- DutyCycle                                                        [  0.00%]
|- Power Management                                         PWR MGMT   [ UNLOCK]
   |- Energy Policy                                        Bias Hint   <      6>
   |- Energy Policy                                          HWP EPP   <    128>
|- Digital Thermal Sensor                                        DTS   [Capable]
|- Power Limit Notification                                      PLN   [Capable]
|- Package Thermal Management                                    PTM   [Capable]
|- Thermal Monitor 1                                             TM1   [ Enable]
|- Thermal Monitor 2                                             TM2   [Capable]
|- Thermal Design Power                                          TDP   [   45 W]
   |- Minimum Power                                              Min   [Missing]
   |- Maximum Power                                              Max   [Missing]
|- Thermal Design Power                                      Package   < Enable>
   |- Power Limit                                                PL1   <  115 W>
   |- Time Window                                                TW1   <  1m20s>
   |- Power Limit                                                PL2   <  125 W>
   |- Time Window                                                TW2   <   2 ms>
|- Thermal Design Power                                         Core   <Disable>
   |- Power Limit                                                PL1   <    0 W>
   |- Time Window                                                TW1   < 976 us>
|- Thermal Design Power                                       Uncore   <Disable>
   |- Power Limit                                                PL1   <    0 W>
   |- Time Window                                                TW1   < 976 us>
|- Thermal Design Power                                         DRAM   <Disable>
   |- Power Limit                                                PL1   <    0 W>
   |- Time Window                                                TW1   < 976 us>
|- Thermal Design Power                                     Platform   < Enable>
   |- Power Limit                                                PL1   <  230 W>
   |- Time Window                                                TW1   <   28 s>
   |- Power Limit                                                PL2   <  245 W>
   |- Time Window                                                TW2   < 976 us>
|- Electrical Design Current                                     EDC   [Missing]
|- Thermal Design Current                                        TDC   [Missing]
|- Core Thermal Point                                                           
   |- DTS Threshold #1                                     Threshold   [Missing]
   |- DTS Threshold #2                                     Threshold   [Missing]
|- Package Thermal Point                                                        
   |- DTS Threshold #1                                     Threshold   [Missing]
   |- DTS Threshold #2                                     Threshold   [Missing]
|- Units                                                                        
   |- Power                                               watt   [  0.125000000]
   |- Energy                                             joule   [  0.000061035]
   |- Window                                            second   [  0.000976562]

CPU Pkg  Apic  Core/Thread  Caches      (w)rite-Back (i)nclusive              
 #   ID   ID  Hybrid ID/ID L1-Inst Way  L1-Data Way      L2  Way      L3  Way 
000:BSP    0  P   1   0  0   32768  8     49152 12   1310720 10  25165824 12  
001:  0    1  P   1   0  1   32768  8     49152 12   1310720 10  25165824 12  
002:  0    8  P   1   4  0   32768  8     49152 12   1310720 10  25165824 12  
003:  0    9  P   1   4  1   32768  8     49152 12   1310720 10  25165824 12  
004:  0   16  P   1   8  0   32768  8     49152 12   1310720 10  25165824 12  
005:  0   17  P   1   8  1   32768  8     49152 12   1310720 10  25165824 12  
006:  0   24  P   1  12  0   32768  8     49152 12   1310720 10  25165824 12  
007:  0   25  P   1  12  1   32768  8     49152 12   1310720 10  25165824 12  
008:  0   32  P   1  16  0   32768  8     49152 12   1310720 10  25165824 12  
009:  0   33  P   1  16  1   32768  8     49152 12   1310720 10  25165824 12  
010:  0   40  P   1  20  0   32768  8     49152 12   1310720 10  25165824 12  
011:  0   41  P   1  20  1   32768  8     49152 12   1310720 10  25165824 12  
012:  0   48  E   1  24  0   65536  8     32768  8   2097152 16  25165824 12  
013:  0   50  E   1  25  0   65536  8     32768  8   2097152 16  25165824 12  
014:  0   52  E   1  26  0   65536  8     32768  8   2097152 16  25165824 12  
015:  0   54  E   1  27  0   65536  8     32768  8   2097152 16  25165824 12  
016:  0   56  E   1  28  0   65536  8     32768  8   2097152 16  25165824 12  
017:  0   58  E   1  29  0   65536  8     32768  8   2097152 16  25165824 12  
018:  0   60  E   1  30  0   65536  8     32768  8   2097152 16  25165824 12  
019:  0   62  E   1  31  0   65536  8     32768  8   2097152 16  25165824 12  

CPU Freq(MHz) Ratio  Turbo  C0(%)  C1(%)  C3(%)  C6(%)  C7(%)  Min TMP:TS  Max
000  158.82 ( 1.59)   5.91  10.74   5.69   0.00  12.30  70.35  37 / 47:51 / 65
001    9.00 ( 0.09)   0.33   0.84   5.69   0.00  12.30  70.35  37 / 47:51 / 65
002   27.50 ( 0.28)   1.02   2.54   2.97   0.00   9.97  83.77  37 / 49:49 / 71
003    4.96 ( 0.05)   0.18   0.58   2.97   0.00   9.97  83.77  37 / 49:49 / 71
004   46.84 ( 0.47)   1.74   3.46   2.03   0.00   2.70  91.24  38 / 50:48 / 70
005    2.30 ( 0.02)   0.09   0.28   2.03   0.00   2.70  91.24  38 / 50:48 / 70
006   66.10 ( 0.66)   2.46   6.44   2.57   0.00  23.92  65.39  37 / 48:50 / 72
007   10.73 ( 0.11)   0.40   1.34   2.57   0.00  23.92  65.39  37 / 51:47 / 72
008   57.17 ( 0.57)   2.13   5.07   2.21   0.00  20.09  69.90  35 / 49:49 / 66
009   17.02 ( 0.17)   0.63   2.03   2.21   0.00  20.09  69.90  35 / 49:49 / 66
010   66.12 ( 0.66)   2.46   5.78   0.71   0.00  28.27  64.37  37 / 49:49 / 71
011    2.54 ( 0.03)   0.09   0.29   0.71   0.00  28.27  64.37  37 / 49:49 / 71
012   54.84 ( 0.55)   2.04   4.17   0.83   0.00  94.52   0.00  36 / 48:50 / 61
013   20.17 ( 0.20)   0.75   1.46   0.22   0.00  98.02   0.00  36 / 48:50 / 61
014   13.75 ( 0.14)   0.51   1.33   0.03   0.00  98.12   0.00  36 / 48:50 / 61
015   10.81 ( 0.11)   0.40   0.96   0.02   0.00  98.78   0.00  36 / 48:50 / 61
016    8.15 ( 0.08)   0.30   0.76   0.00   0.00  99.09   0.00  37 / 49:49 / 63
017    9.68 ( 0.10)   0.36   0.77   0.00   0.00  99.03   0.00  37 / 49:49 / 63
018    3.37 ( 0.03)   0.13   0.38   0.00   0.00  99.50   0.00  37 / 49:49 / 63
019    5.48 ( 0.05)   0.20   0.55   0.00   0.00  99.29   0.00  37 / 49:49 / 63

    Averages:        Turbo  C0(%)  C1(%)  C3(%)  C6(%)  C7(%)    TjMax:    Pkg:
                      1.11   2.49   1.67   0.00  49.04  44.50     100 C    53 C

Linux:                                                                          
|- Release                                                   [5.15.0-60-generic]
|- Version                         [#66-Ubuntu SMP Fri Jan 20 14:29:49 UTC 2023]
|- Machine                                                              [x86_64]
Memory:                                                                         
|- Total RAM                                                         32571228 KB
|- Shared RAM                                                          716724 KB
|- Free RAM                                                          17632956 KB
|- Buffer RAM                                                          430008 KB
|- Total High                                                               0 KB
|- Free High                                                                0 KB
Clock Source                                                  <             tsc>
CPU-Freq driver                                               [    intel_pstate]
Governor                                                      [         Missing]
CPU-Idle driver                                               [      intel_idle]
|- Idle Limit                                                 [         C3_ACPI]
   |- State        POLL C1_ACPI C2_ACPI C3_ACPI                                 
   |-           CPUIDLE ACPI FF ACPI FF ACPI FF                                 
   |- Power          -1       0       0       0                                 
   |- Latency         0       1     127    1048                                 
   |- Residency       0       1     381    3144                                 

[ 0] INSYDE Corp.                                                               
[ 1] 1.07.02                                                                    
[ 2] 07/14/2022                                                                 
[ 3] Notebook                                                                   
[ 4] PD5x_7xPNP1_PNR1_PNN1_PNT1                                                 
[ 5] Not Applicable                                                             
[ 6] N---A---i---l-                                                             
[ 7] Not Applicable                                                             
[ 8] Not Applicable                                                             
[ 9] Notebook                                                                   
[10] PD5x_7xPNP1_PNR1_PNN1_PNT1                                                 
[11] Not Applicable                                                             
[12] N---A---i---l-                                                             
[13] Number Of Devices:2\Maximum Capacity:134217728 kilobytes                   
[14] Controller0-ChannelA-DIMM0\BANK 0                                          
[15] Controller1-ChannelA-DIMM0\BANK 0                                          
[16]                                                                            
[17]                                                                            
[18] Kingston                                                                   
[19] Kingston                                                                   
[20]                                                                            
[21]                                                                            
[22] KF548S38-16                                                                
[23] KF548S38-16                                                                
[24]                                                                            
[25]                                                                            

                            GenuineIntel  [   0]                           
Controller #0                                               Single Channel 
 Bus Rate  2300 MHz       Bus Speed 2290 MHz           DDR5 Speed 2389 MHz 

 Cha    CL RCDr RCDw   RP  RAS RRDs RRDl  FAW   WR RTPr WTPr  CWL  CKE  CMD
  #0    38   38   38   38   69    8   12   32   75   17  115   36   18   2T
      sgRR dgRR drRR ddRR      sgRW dgRW drRW ddRW      sgWR dgWR drWR ddWR
  #0    12    8   14   14        18   18   20   22        70   52   12   12
      sgWW dgWW drWW ddWW                REFI  RFC  XS   XP CPDED GEAR  ECC
  #0    26    8   14   14                4680  383  706   18   12    2    0

 DIMM Geometry for channel #0                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0    16    1    131072      1024          16384         KF548S38-16
       #1                                                                  

Controller #1                                                    Disabled

cyring commented 1 year ago

Hello,

This is the first CoreFreq report I have received for Alder Lake/H; you were lucky to get even part of the IMC decoded. I will need the device ID of your PCI Host Bridge (or whichever device acts as the MCH) to trigger the correct IMC decoder.

Please post the full output of lspci -nn here.
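For readers who only want the relevant ID, the following Python sketch shows one way to pull the vendor:device pair of the host bridge out of lspci -nn text. It relies on the PCI class code 0600 identifying a host bridge and on the bracketed layout lspci -nn prints. The helper name host_bridge_ids is hypothetical, and the sample line is the host bridge entry from the reply further down this thread.

```python
import re

# Matches e.g. "00:00.0 Host bridge [0600]: <name> [8086:4641] (rev 02)"
LSPCI_LINE = re.compile(
    r"^(?P<slot>\S+)\s+(?P<cls>.+?)\s+\[(?P<class_id>[0-9a-f]{4})\]:\s+"
    r"(?P<name>.+)\s+\[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
)


def host_bridge_ids(lspci_text):
    """Return (vendor, device) hex strings for every class-0600 device."""
    ids = []
    for line in lspci_text.splitlines():
        m = LSPCI_LINE.match(line.strip())
        # PCI class 06, subclass 00 is "Host bridge".
        if m and m.group("class_id") == "0600":
            ids.append((m.group("vendor"), m.group("device")))
    return ids


sample = (
    "00:00.0 Host bridge [0600]: Intel Corporation 12th Gen Core Processor "
    "Host Bridge/DRAM Registers [8086:4641] (rev 02)\n"
    "00:02.0 VGA compatible controller [0300]: Intel Corporation "
    "Alder Lake-P Integrated Graphics Controller [8086:46a6] (rev 0c)\n"
)
print(host_bridge_ids(sample))  # prints: [('8086', '4641')]
```

On a live system the same function could be fed the captured output of lspci -nn instead of the sample string; the full listing is still the more useful thing to post, since other devices may matter too.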

Technologicat commented 1 year ago

Hi!

Here is lspci -nn:

00:00.0 Host bridge [0600]: Intel Corporation 12th Gen Core Processor Host Bridge/DRAM Registers [8086:4641] (rev 02)
00:01.0 PCI bridge [0604]: Intel Corporation 12th Gen Core Processor PCI Express x16 Controller #1 [8086:460d] (rev 02)
00:02.0 VGA compatible controller [0300]: Intel Corporation Alder Lake-P Integrated Graphics Controller [8086:46a6] (rev 0c)
00:04.0 Signal processing controller [1180]: Intel Corporation Alder Lake Innovation Platform Framework Processor Participant [8086:461d] (rev 02)
00:06.0 PCI bridge [0604]: Intel Corporation 12th Gen Core Processor PCI Express x4 Controller #0 [8086:464d] (rev 02)
00:06.2 PCI bridge [0604]: Intel Corporation 12th Gen Core Processor PCI Express x4 Controller #2 [8086:463d] (rev 02)
00:07.0 PCI bridge [0604]: Intel Corporation Alder Lake-P Thunderbolt 4 PCI Express Root Port #0 [8086:466e] (rev 02)
00:08.0 System peripheral [0880]: Intel Corporation 12th Gen Core Processor Gaussian & Neural Accelerator [8086:464f] (rev 02)
00:0a.0 Signal processing controller [1180]: Intel Corporation Platform Monitoring Technology [8086:467d] (rev 01)
00:0d.0 USB controller [0c03]: Intel Corporation Alder Lake-P Thunderbolt 4 USB Controller [8086:461e] (rev 02)
00:0d.2 USB controller [0c03]: Intel Corporation Alder Lake-P Thunderbolt 4 NHI #0 [8086:463e] (rev 02)
00:14.0 USB controller [0c03]: Intel Corporation Alder Lake PCH USB 3.2 xHCI Host Controller [8086:51ed] (rev 01)
00:14.2 RAM memory [0500]: Intel Corporation Alder Lake PCH Shared SRAM [8086:51ef] (rev 01)
00:14.3 Network controller [0280]: Intel Corporation Alder Lake-P PCH CNVi WiFi [8086:51f0] (rev 01)
00:15.0 Serial bus controller [0c80]: Intel Corporation Alder Lake PCH Serial IO I2C Controller #0 [8086:51e8] (rev 01)
00:15.1 Serial bus controller [0c80]: Intel Corporation Alder Lake PCH Serial IO I2C Controller #1 [8086:51e9] (rev 01)
00:15.3 Serial bus controller [0c80]: Intel Corporation Alder Lake PCH Serial IO I2C Controller #3 [8086:51eb] (rev 01)
00:16.0 Communication controller [0780]: Intel Corporation Alder Lake PCH HECI Controller [8086:51e0] (rev 01)
00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:51bd] (rev 01)
00:1c.7 PCI bridge [0604]: Intel Corporation Alder Lake PCH-P PCI Express Root Port #9 [8086:51bf] (rev 01)
00:1f.0 ISA bridge [0601]: Intel Corporation Alder Lake PCH eSPI Controller [8086:5182] (rev 01)
00:1f.3 Audio device [0403]: Intel Corporation Alder Lake PCH-P High Definition Audio Controller [8086:51c8] (rev 01)
00:1f.4 SMBus [0c05]: Intel Corporation Alder Lake PCH-P SMBus Host Controller [8086:51a3] (rev 01)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Alder Lake-P PCH SPI Controller [8086:51a4] (rev 01)
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [Geforce RTX 3070 Ti Laptop GPU] [10de:24a0] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1)
03:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd Device [144d:a80c]
2d:00.0 SD Host controller [0805]: O2 Micro, Inc. SD/MMC Card Reader Controller [1217:8621] (rev 01)
2e:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
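
The entry of interest in this listing is the host bridge at 00:00.0, whose vendor:device pair is 8086:4641. A minimal sketch of pulling that pair out of lspci -nn text follows. The function name host_bridge_ids and the regular expression are illustrative choices for this example and are not part of CoreFreq.

```python
import re

# Matches lines such as:
# "00:00.0 Host bridge [0600]: Intel ... [8086:4641] (rev 02)"
# The first bracketed group is the PCI class code, the last one is vendor:device.
LINE_RE = re.compile(
    r"^\S+ .*?\[(?P<class_id>[0-9a-f]{4})\]: "
    r".*\[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
)

HOST_BRIDGE_CLASS = "0600"  # PCI class 06 (bridge), subclass 00 (host bridge)


def host_bridge_ids(lspci_text):
    """Return (vendor, device) hex pairs for every host bridge in the text."""
    ids = []
    for line in lspci_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("class_id") == HOST_BRIDGE_CLASS:
            ids.append((match.group("vendor"), match.group("device")))
    return ids


sample = """\
00:00.0 Host bridge [0600]: Intel Corporation 12th Gen Core Processor Host Bridge/DRAM Registers [8086:4641] (rev 02)
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [Geforce RTX 3070 Ti Laptop GPU] [10de:24a0] (rev a1)
03:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd Device [144d:a80c]
"""
print(host_bridge_ids(sample))  # [('8086', '4641')]
```
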
cyring commented 1 year ago

Hello,

Can you pull and try the new changes in the develop branch?

I will need fresh output from corefreq-cli -s -n -M

Thank you

cyring commented 1 year ago

Btw, 11th generation up to Raptor Lake can now try out SA voltage monitoring. You have to build this way:

make ARCH_PMC=PCU clean all

In the UI, switch to the Voltage view and check SA

[Screenshot: 2023-02-14-190553_642x410_scrot]

Technologicat commented 1 year ago

Here's corefreq-cli -s -n -M on the same machine, using 1cd8f358f2a2e46fef11247b96f2dbdd9941ee0e from the develop branch.

Seems to detect all 32GB correctly now. Thanks!

Processor                                 [12th Gen Intel(R) Core(TM) i7-12700H]
|- Architecture                                                   [Alder Lake/H]
|- Vendor ID                                                      [GenuineIntel]
|- Microcode                                                        [0x00000421]
|- Signature                                                           [  06_9A]
|- Stepping                                                            [      3]
|- Online CPU                                                          [ 20/ 20]
|- Base Clock                                                          [ 99.557]
|- Frequency            (MHz)                      Ratio                        
                 Min    398.24                    <   4 >                       
                 Max   2688.12                    <  27 >                       
|- Factory                                                             [100.000]
                       2700                       [  27 ]                       
|- Performance                                                                  
   |- P-State                                                                   
                 TGT    597.36                    <   6 >                       
   |- HWP                                                                       
                 Min    597.36                    <   6 >                       
                 Max   2986.80                    <  30 >                       
                 TGT      AUTO                    <   0 >                       
|- Turbo Boost                                                         [ UNLOCK]
                  1C   4679.32                    <  47 >                       
                  2C   4679.32                    <  47 >                       
                  3C   4380.64                    <  44 >                       
                  4C   4380.64                    <  44 >                       
                  5C   4081.96                    <  41 >                       
                  6C   4081.96                    <  41 >                       
                  7C   4081.96                    <  41 >                       
                  8C   4081.96                    <  41 >                       
|- Hybrid                                                              [ UNLOCK]
                  1C   3484.77                    <  35 >                       
                  2C   3484.77                    <  35 >                       
                  3C   3484.77                    <  35 >                       
                  4C   3484.77                    <  35 >                       
                  5C   3285.64                    <  33 >                       
                  6C   3285.64                    <  33 >                       
                  7C   3285.64                    <  33 >                       
                  8C   3285.64                    <  33 >                       
|- Uncore                                                              [ UNLOCK]
                 Min    398.26                    <   4 >                       
                 Max   3982.60                    <  40 >                       
|- TDP                                                           Level <  0:3  >
   |- Programmable                                                     [ UNLOCK]
   |- Configuration                                                    [ UNLOCK]
   |- Turbo Activation                                                 [ UNLOCK]
             Nominal   2289.88                    [  23 ]                       
              Level1   1493.40                    [  15 ]                       
              Level2   2688.12                    [  27 ]                       
               Turbo   2190.32                    <  22 >                       

Instruction Set Extensions                                                      
|- 3DNow!/Ext [N/N]          ADX [Y]          AES [Y]  AVX/AVX2 [Y/Y] 
|- AVX512-F     [N]    AVX512-DQ [N]  AVX512-IFMA [N]   AVX512-PF [N] 
|- AVX512-ER    [N]    AVX512-CD [N]    AVX512-BW [N]   AVX512-VL [N] 
|- AVX512-VBMI  [N] AVX512-VBMI2 [N]  AVX512-VNNI [N]  AVX512-ALG [N] 
|- AVX512-VPOP  [N] AVX512-VNNIW [N] AVX512-FMAPS [N] AVX512-VP2I [N] 
|- AVX512-BF16  [N] AVX-VNNI-VEX [Y]      MOVDIRI [Y]   MOVDIR64B [Y] 
|- BMI1/BMI2  [Y/Y]         CLWB [Y]      CLFLUSH [Y] CLFLUSH-OPT [Y] 
|- CLAC-STAC    [Y]         CMOV [Y]    CMPXCHG8B [Y]  CMPXCHG16B [Y] 
|- F16C         [Y]          FPU [Y]         FXSR [Y]   LAHF-SAHF [Y] 
|- ENQCMD       [N]         GFNI [Y]        OSPKE [Y]     WAITPKG [Y] 
|- MMX/Ext    [Y/N] MON/MWAITX [Y/N]        MOVBE [Y]   PCLMULQDQ [Y] 
|- POPCNT       [Y]       RDRAND [Y]       RDSEED [Y]      RDTSCP [Y] 
|- SEP          [Y]          SHA [Y]          SSE [Y]        SSE2 [Y] 
|- SSE3         [Y]        SSSE3 [Y]  SSE4.1/4A [Y/N]      SSE4.2 [Y] 
|- SERIALIZE    [Y]      SYSCALL [Y]        RDPID [Y]         SGX [N] 
|- VAES         [Y]   VPCLMULQDQ [Y]   PREFETCH/W [Y]       LZCNT [Y] 

Features                                                                        
|- 1 GB Pages Support                                      1GB-PAGES   [Capable]
|- Advanced Configuration & Power Interface                     ACPI   [Capable]
|- Advanced Programmable Interrupt Controller                   APIC   [Capable]
|- APIC Timer Invariance                                        ARAT   [Capable]
|- Core Multi-Processing                                  CMP Legacy   [Missing]
|- L1 Data Cache Context ID                                  CNXT-ID   [Missing]
|- Direct Cache Access                                           DCA   [Missing]
|- Debugging Extension                                            DE   [Capable]
|- Debug Store & Precise Event Based Sampling               DS, PEBS   [Capable]
|- CPL Qualified Debug Store                                  DS-CPL   [Capable]
|- 64-Bit Debug Store                                         DTES64   [Capable]
|- Fast Short REP CMPSB                                         FSRC   [Missing]
|- Fast Short REP MOVSB                                         FSRM   [Capable]
|- Fast Short REP STOSB                                         FSRS   [Capable]
|- Fast Zero-length REP MOVSB                                   FZRM   [Missing]
|- Fast-String Operation                                        ERMS   [Capable]
|- Fused Multiply Add                                     FMA | FMA4   [Capable]
|- Hardware Feedback Interface                                   HFI   [Capable]
|- Hardware Lock Elision                                         HLE   [Missing]
|- History Reset                                              HRESET   [Capable]
|- Hybrid part processor                                      HYBRID   [Capable]
|- Instruction Based Sampling                                    IBS   [Missing]
|- Instruction INVPCID                                       INVPCID   [Capable]
|- Long Mode 64 bits                                       IA64 | LM   [Capable]
|- Linear Address Masking                                        LAM   [Missing]
|- LightWeight Profiling                                         LWP   [Missing]
|- Machine-Check Architecture                                    MCA   [Capable]
|- Memory Protection Extensions                                  MPX   [Missing]
|- Model Specific Registers                                      MSR   [Capable]
|- Memory Type Range Registers                                  MTRR   [Capable]
|- OS-Enabled Ext. State Management                          OSXSAVE   [Capable]
|- Physical Address Extension                                    PAE   [Capable]
|- Page Attribute Table                                          PAT   [Capable]
|- Pending Break Enable                                          PBE   [Capable]
|- Platform Configuration                                    PCONFIG   [Missing]
|- Process Context Identifiers                                  PCID   [Capable]
|- Perfmon and Debug Capability                                 PDCM   [Capable]
|- Page Global Enable                                            PGE   [Capable]
|- Page Size Extension                                           PSE   [Capable]
|- 36-bit Page Size Extension                                  PSE36   [Capable]
|- Processor Serial Number                                       PSN   [Missing]
|- Write Data to a Processor Trace Packet                    PTWRITE   [Capable]
|- Resource Director Technology/PQE                            RDT-A   [Missing]
|- Resource Director Technology/PQM                            RDT-M   [Missing]
|- Restricted Transactional Memory                               RTM   [Missing]
|- Safer Mode Extensions                                         SMX   [Capable]
|- Self-Snoop                                                     SS   [Capable]
|- Supervisor-Mode Access Prevention                            SMAP   [Capable]
|- Supervisor-Mode Execution Prevention                         SMEP   [Capable]
|- Thread Director                                                TD   [Capable]
|- Time Stamp Counter                                            TSC [Invariant]
|- Time Stamp Counter Deadline                          TSC-DEADLINE   [Capable]
|- TSX Force Abort MSR Register                            TSX-ABORT   [Missing]
|- TSX Suspend Load Address Tracking                       TSX-LDTRK   [Missing]
|- User-Mode Instruction Prevention                             UMIP   [Capable]
|- Virtual Mode Extension                                        VME   [Capable]
|- Virtual Machine Extensions                                    VMX   [Capable]
|- Write Back & Do Not Invalidate Cache                     WBNOINVD   [Missing]
|- Extended xAPIC Support                                     x2APIC   [ x2APIC]
|- Execution Disable Bit Support                              XD-Bit   [Capable]
|- XSAVE/XSTOR States                                          XSAVE   [Capable]
|- xTPR Update Control                                          xTPR   [Capable]
Mitigation mechanisms                                                           
|- Indirect Branch Restricted Speculation                       IBRS   [Capable]
|- Indirect Branch Prediction Barrier                           IBPB   [Capable]
|- Single Thread Indirect Branch Predictor                     STIBP   [Capable]
|- Speculative Store Bypass Disable                             SSBD   [Capable]
|- Writeback & invalidate the L1 data cache                L1D-FLUSH   [Capable]
|- Hypervisor - No flush L1D on VM entry            L1DFL_VMENTRY_NO   [ Enable]
|- Arch - Buffer Overwriting                                MD-CLEAR   [Capable]
|- Arch - No Rogue Data Cache Load                           RDCL_NO   [ Enable]
|- Arch - Enhanced IBRS                                     IBRS_ALL   [ Enable]
|- Arch - Return Stack Buffer Alternate                         RSBA   [Capable]
|- Arch - No Speculative Store Bypass                         SSB_NO   [Capable]
|- Arch - No Microarchitectural Data Sampling                 MDS_NO   [ Enable]
|- Arch - No TSX Asynchronous Abort                           TAA_NO   [ Enable]
|- Arch - No Page Size Change MCE                     PSCHANGE_MC_NO   [ Enable]
|- Arch - STLB QoS                                              STLB   [ Enable]
|- Arch - Functional Safety Island                              FuSa   [ Enable]
|- Arch - RSM in CPL0 only                                       RSM   [ Enable]
|- Arch - Split Locked Access Exception                         SPLA   [ Enable]
|- Arch - Snoop Filter QoS Mask                         SNOOP_FILTER   [ Enable]
|- Arch - No Fast Predictive Store Forwarding                   PSFD   [Capable]
|- Arch - Data Operand Independent Timing Mode                 DOITM   [Capable]
|- Arch - Not affected by SBDR or SSDP                  SBDR_SSDP_NO   [ Enable]
|- Arch - No Fill Buffer Stale Data Propagator              FBSDP_NO   [ Enable]
|- Arch - No Primary Stale Data Propagator                   PSDP_NO   [ Enable]
|- Arch - Overwrite Fill Buffer values                      FB_CLEAR   [Capable]
|- Arch - Special Register Buffer Data Sampling                SRBDS   [ Unable]
   |- RDRAND and RDSEED mitigation                             RNGDS   [ Unable]
   |- Restricted Transactional Memory                            RTM   [ Unable]
   |- Verify Segment for Writing instruction                    VERW   [ Unable]
|- Arch - Restricted RSB Alternate                             RRSBA   [ Enable]
|- Arch - No Branch Target Injection                          BHI_NO   [Capable]
|- Arch - Legacy xAPIC Disable                             XAPIC_DIS   [ Unable]
|- Arch - No Post-Barrier Return Stack Buffer               PBRSB_NO   [Capable]
|- Arch - IPRED disabled for CPL3                        IPRED_DIS_U   [Capable]
|- Arch - IPRED disabled for CPL0/1/2                    IPRED_DIS_S   [Capable]
|- Arch - RRSBA disabled for CPL3                        RRSBA_DIS_U   [Capable]
|- Arch - RRSBA disabled for CPL0/1/2                    RRSBA_DIS_S   [Capable]
|- Arch - BHI disabled for CPL0/1/2                        BHI_DIS_S   [Capable]
|- No MXCSR Configuration Dependent Timing                   MCDT_NO   [ Unable]
Security Features                                                               
|- CPUID Key Locker                                               KL   [Capable]
|- AES Key Locker instructions                                AESKLE   [Missing]
|- AES Wide Key Locker instructions                          WIDE_KL   [Capable]
|- Software Guard SGX1 Extensions                               SGX1   [Missing]
|- Software Guard SGX2 Extensions                               SGX2   [Missing]

Technologies                                                                    
|- Data Cache Unit                                                              
   |- L1 Prefetcher                                                L1 HW   < ON>
   |- L1 IP Prefetcher                                          L1 HW IP   < ON>
   |- L2 Prefetcher                                                L2 HW   < ON>
   |- L2 Line Prefetcher                                        L2 HW CL   < ON>
|- System Management Mode                                       SMM-Dual   [ ON]
|- Hyper-Threading                                                   HTT   [ ON]
|- SpeedStep                                                        EIST   < ON>
|- Dynamic Acceleration                                              IDA   [ ON]
|- Turbo Boost Max 3.0                                             TURBO   < ON>
|- Energy Efficiency Optimization                                    EEO   < ON>
|- Race To Halt Optimization                                         R2H   <OFF>
|- Watchdog Timer                                                    TCO   <OFF>
|- Virtualization                                                    VMX   [ ON]
   |- I/O MMU                                                       VT-d   [ ON]
   |- Version                                                     [         4.0]
   |- Hypervisor                                                           [OFF]
   |- Vendor ID                                                   [         N/A]

Performance Monitoring                                                          
|- Version                                                        PM       [  5]
|- Counters:          General                   Fixed                           
|           {  6,  0,  0 } x 48 bits            3 x 48 bits                     
|- Enhanced Halt State                                           C1E       < ON>
|- C1 Auto Demotion                                              C1A       < ON>
|- C3 Auto Demotion                                              C3A       <OFF>
|- C1 UnDemotion                                                 C1U       < ON>
|- C3 UnDemotion                                                 C3U       <OFF>
|- C6 Core Demotion                                              CC6       <OFF>
|- C6 Module Demotion                                            MC6       <OFF>
|- Legacy Frequency ID control                                   FID       [OFF]
|- Legacy Voltage ID control                                     VID       [OFF]
|- P-State Hardware Coordination Feedback                MPERF/APERF       [ ON]
|- Hardware Duty Cycling                                         HDC       [OFF]
|- Package C-States                                                             
   |- Configuration Control                                   CONFIG   [   LOCK]
   |- Lowest C-State                                           LIMIT   <     C0>
   |- I/O MWAIT Redirection                                  IOMWAIT   <Disable>
   |- Max C-State Inclusion                                    RANGE   <     C8>
|- Core C-States                                                                
   |- C-States Base Address                                      BAR   [ 0x1814]
|- ACPI Processor C-States                                      _CST   [      3]
|- MONITOR/MWAIT                                                                
   |- State index:    #0    #1    #2    #3    #4    #5    #6    #7              
   |- Sub C-State:     0     2     0     2     0     1     0     1              
|- Core Cycles                                                         [Capable]
|- Instructions Retired                                                [Capable]
|- Reference Cycles                                                    [Capable]
|- Last Level Cache References                                         [Capable]
|- Last Level Cache Misses                                             [Capable]
|- Branch Instructions Retired                                         [Capable]
|- Branch Mispredicts Retired                                          [Capable]
|- Top-down slots Counter                                              [Capable]
|- Processor Performance Control                                _PCT   [ Enable]
|- Performance Supported States                                 _PSS   [      0]
|- Performance Present Capabilities                             _PPC   [      0]

Power, Current & Thermal                                                        
|- Temperature Offset:Junction                                 TjMax <  2:100 C>
|- Clock Modulation                                             ODCM   <Disable>
   |- DutyCycle                                                        [  0.00%]
|- Power Management                                         PWR MGMT   [ UNLOCK]
   |- Energy Policy                                        Bias Hint   <      6>
   |- Energy Policy                                          HWP EPP   <    128>
|- Digital Thermal Sensor                                        DTS   [Capable]
|- Power Limit Notification                                      PLN   [Capable]
|- Package Thermal Management                                    PTM   [Capable]
|- Thermal Monitor 1                                             TM1   [ Enable]
|- Thermal Monitor 2                                             TM2   [Capable]
|- Thermal Design Power                                          TDP   [   45 W]
   |- Minimum Power                                              Min   [Missing]
   |- Maximum Power                                              Max   [Missing]
|- Thermal Design Power                                      Package   < Enable>
   |- Power Limit                                                PL1   <  200 W>
   |- Time Window                                                TW1   <  1m36s>
   |- Power Limit                                                PL2   <  125 W>
   |- Time Window                                                TW2   <   2 ms>
|- Thermal Design Power                                         Core   <Disable>
   |- Power Limit                                                PL1   <    0 W>
   |- Time Window                                                TW1   < 976 us>
|- Thermal Design Power                                       Uncore   <Disable>
   |- Power Limit                                                PL1   <    0 W>
   |- Time Window                                                TW1   < 976 us>
|- Thermal Design Power                                         DRAM   <Disable>
   |- Power Limit                                                PL1   <    0 W>
   |- Time Window                                                TW1   < 976 us>
|- Thermal Design Power                                     Platform   < Enable>
   |- Power Limit                                                PL1   <  230 W>
   |- Time Window                                                TW1   <   28 s>
   |- Power Limit                                                PL2   <  245 W>
   |- Time Window                                                TW2   < 976 us>
|- Electrical Design Current                                     EDC   [Missing]
|- Thermal Design Current                                        TDC   [Missing]
|- Core Thermal Point                                                           
   |- DTS Threshold #1                                     Threshold   [Missing]
   |- DTS Threshold #2                                     Threshold   [Missing]
|- Package Thermal Point                                                        
   |- DTS Threshold #1                                     Threshold   [Missing]
   |- DTS Threshold #2                                     Threshold   [Missing]
|- Units                                                                        
   |- Power                                               watt   [  0.125000000]
   |- Energy                                             joule   [  0.000061035]
   |- Window                                            second   [  0.000976562]

                          Intel ADL PCH-P  [5182]                          
Controller #0                                                Dual Channel  
 Bus Rate  1900 MHz       Bus Speed 1891 MHz           DDR5 Speed 2389 MHz 

 Cha    CL RCDr RCDw   RP  RAS RRDs RRDl  FAW   WR RTPr WTPr  CWL  CKE  CMD
  #0    38   38   38   38   69    8   12   32   75   17  115   36   18   2T
  #1    38   38   38   38   69    8   12   32   75   17  115   36   18   2T
      sgRR dgRR drRR ddRR      sgRW dgRW drRW ddRW      sgWR dgWR drWR ddWR
  #0    12    8   14   14        18   18   20   22        70   52   12   12
  #1    12    8   14   14        18   18   20   22        70   52   12   12
      sgWW dgWW drWW ddWW                REFI  RFC  XS   XP CPDED GEAR  ECC
  #0    26    8   14   14                4680  383  706   18   12    2    0
  #1    26    8   14   14                4680  383  706   18   12    2    0

 DIMM Geometry for channel #0                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0    16    1    131072      1024          16384         KF548S38-16
       #1                                                                  
 DIMM Geometry for channel #1                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0    16    1    131072      1024          16384         KF548S38-16
       #1                                                                  
cyring commented 1 year ago

> Seems to detect all 32GB correctly now. Thanks!

It works, but not the way I was expecting: I expected two controllers with one DIMM on each.

The bus rate has dropped to 1900 MHz, with DDR5 at about 2400 MHz. Does that sound right to you?

The other addition for your platform is TCO, which still stays OFF. Perhaps it is disabled in your BIOS? You can, however, enable TCO from the CoreFreq UI; please let me know whether that works.

Technologicat commented 1 year ago

EDIT: Sorry, I meant to say IBT (indirect branch tracking), not BTI. The kernel option is ibt=off, and having to do this is a known problem with the interaction of the Linux kernel with the nvidia proprietary driver on machines with Alder Lake CPUs.

Original post below.


Yes, the geometry seems weird.

As for the bus rate, I'll need to check what the BIOS reports, and post back later today.

I have to admit I'm not that familiar with processor internals. What is TCO in this context? I didn't find anything useful on the internet due to the overloaded acronym - even Intel themselves only talk about total cost of ownership. I'll look in the BIOS for a TCO setting.

Speaking of processor features, one detail I forgot to mention - I have ~bti=off~ ibt=off in the boot options for the kernel, because with ~BTI~ IBT enabled ~(branch target identification, I suppose?)~ (indirect branch tracking), the system crashes randomly with the current kernel (5.15.0.60-generic), at most within minutes of booting. Probably unrelated, but thought I should mention it just in case.

Also, for what it's worth, this is a new system, and there are still some other random issues I need to debug:

These seem unlikely to be CPU-related, so just for information :)

I'll get back to you later today after I've looked at the BIOS settings.

cyring commented 1 year ago

This is the definition of TCO and the datasheet I'm using to program the registers. Some Intel modules, such as iTCO_wdt, should already be loaded on your system. Without those drivers, CoreFreq can manage the registers itself, starting with reading the enablement state.

Technologicat commented 1 year ago

BIOS reports DRAM frequency as 4800 MHz. (I suppose this is including the "double" in the DDR.)
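
The arithmetic behind that guess can be sketched as follows. DDR memory transfers data on both clock edges, so a 4800 MT/s rating corresponds to a 2400 MHz clock. The 24x multiplier and the 99.557 MHz base clock (taken from the earlier report in this thread) are inferences from the numbers posted here, not something read from CoreFreq's source.

```python
def ddr_clock_mhz(transfer_rate_mts):
    """DDR moves data on both clock edges, so the clock is half the transfer rate."""
    return transfer_rate_mts / 2


bios_rate = 4800                          # value shown by the BIOS
nominal_clock = ddr_clock_mhz(bios_rate)  # 2400.0 MHz

# Assumption of this sketch: the memory clock is a whole multiple of the
# base clock, nominally 100 MHz but measured at 99.557 MHz on this machine.
ratio = round(nominal_clock / 100)        # 24
measured_clock = ratio * 99.557           # about 2389.37 MHz

print(nominal_clock)        # 2400.0
print(ratio)                # 24
print(int(measured_clock))  # 2389
```

Under those assumptions, the 2389 MHz that CoreFreq shows as DDR5 Speed is consistent with the 4800 MHz the BIOS reports.
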

And after booting Linux again, with nothing changed, the memory bus is now at almost 4 GHz?! corefreq-cli -M:

                          Intel ADL PCH-P  [5182]                          
Controller #0                                                Dual Channel  
 Bus Rate  4000 MHz       Bus Speed 3982 MHz           DDR5 Speed 2389 MHz 

 Cha    CL RCDr RCDw   RP  RAS RRDs RRDl  FAW   WR RTPr WTPr  CWL  CKE  CMD
  #0    38   38   38   38   69    8   12   32   75   17  115   36   18   2T
  #1    38   38   38   38   69    8   12   32   75   17  115   36   18   2T
      sgRR dgRR drRR ddRR      sgRW dgRW drRW ddRW      sgWR dgWR drWR ddWR
  #0    12    8   14   14        18   18   20   22        70   52   12   12
  #1    12    8   14   14        18   18   20   22        70   52   12   12
      sgWW dgWW drWW ddWW                REFI  RFC  XS   XP CPDED GEAR  ECC
  #0    26    8   14   14                4680  383  706   18   12    2    0
  #1    26    8   14   14                4680  383  706   18   12    2    0

 DIMM Geometry for channel #0                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0    16    1    131072      1024          16384         KF548S38-16
       #1                                                                  
 DIMM Geometry for channel #1                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0    16    1    131072      1024          16384         KF548S38-16
       #1

Definition and datasheet - thank you, very interesting! Summarizing, in this context TCO is a low-level system crash watchdog, and the acronym indeed means total cost of ownership. No wonder it was hard to find. :)

The BIOS on this machine has no settings for TCO - or indeed that many settings at all. It is Insyde H2OBIOS version 1.07.02, with KBC/EC version 1.07.05, and ME FW version 16.0.15.1810.

Here's a list of available BIOS settings, other than TPM setup:

I'm not seeing any kernel modules with "TCO" in the name. Here is the full output of lsmod | sort:

ac97_bus               16384  1 snd_soc_core
acpi_pad              184320  0
acpi_thermal_rel       16384  1 int3400_thermal
aesni_intel           376832  6
af_alg                 32768  6 algif_hash,algif_skcipher
algif_hash             16384  1
algif_skcipher         16384  1
autofs4                49152  2
blake2b_generic        20480  0
bluetooth             704512  43 btrtl,btintel,btbcm,bnep,btusb,rfcomm
bnep                   28672  2
btbcm                  24576  1 btusb
btintel                40960  1 btusb
btrfs                1560576  0
btrtl                  24576  1 btusb
btusb                  61440  0
ccm                    20480  3
cec                    61440  2 drm_kms_helper,i915
cfg80211              974848  3 iwlmvm,iwlwifi,mac80211
clevo_acpi             20480  0
clevo_wmi              20480  0
cmac                   16384  3
coretemp               24576  0
cqhci                  36864  1 sdhci_pci
crc32_pclmul           16384  0
crct10dif_pclmul       16384  1
cryptd                 24576  3 crypto_simd,ghash_clmulni_intel
crypto_simd            16384  1 aesni_intel
dm_log                 20480  2 dm_region_hash,dm_mirror
dm_mirror              24576  0
dm_region_hash         24576  1 dm_mirror
drm                   622592  17 drm_kms_helper,nvidia,nvidia_drm,i915,ttm
drm_kms_helper        311296  2 nvidia_drm,i915
ecc                    36864  1 ecdh_generic
ecdh_generic           16384  2 bluetooth
efi_pstore             16384  0
fb_sys_fops            16384  1 drm_kms_helper
ghash_clmulni_intel    16384  0
hid                   151552  4 i2c_hid,usbhid,hid_multitouch,hid_generic
hid_generic            16384  0
hid_multitouch         32768  0
i2c_algo_bit           16384  1 i915
i2c_hid                36864  1 i2c_hid_acpi
i2c_hid_acpi           16384  0
i2c_i801               36864  0
i2c_smbus              20480  1 i2c_i801
i915                 3104768  28
icp                   323584  1 zfs
idma64                 20480  0
igen6_edac             24576  0
input_leds             16384  0
int3400_thermal        20480  0
int3403_thermal        20480  0
int340x_thermal_zone    20480  2 int3403_thermal,processor_thermal_device
intel_cstate           20480  0
intel_hid              24576  0
intel_lpss             16384  1 intel_lpss_pci
intel_lpss_pci         24576  0
intel_pmt              16384  0
intel_powerclamp       20480  0
intel_rapl_common      40960  2 intel_rapl_msr,processor_thermal_rapl
intel_rapl_msr         20480  0
intel_tcc_cooling      16384  0
ip_tables              32768  0
iwlmvm                569344  0
iwlwifi               450560  1 iwlmvm
joydev                 32768  0
kvm                  1028096  1 kvm_intel
kvm_intel             368640  0
ledtrig_audio          16384  2 snd_hda_codec_generic,snd_sof
libarc4                16384  1 mac80211
libcrc32c              16384  1 btrfs
lp                     28672  0
mac80211             1249280  1 iwlmvm
mac_hid                16384  0
mc                     65536  4 videodev,videobuf2_v4l2,uvcvideo,videobuf2_common
mei                   135168  3 mei_hdcp,mei_me
mei_hdcp               24576  0
mei_me                 40960  1
Module                  Size  Used by
msr                    16384  0
mxm_wmi                16384  0
nls_iso8859_1          16384  1
nvidia              56344576  114 nvidia_uvm,nvidia_modeset
nvidia_drm             69632  4
nvidia_modeset       1212416  3 nvidia_drm
nvidia_uvm           1363968  0
nvidia_wmi_ec_backlight    16384  0
nvme                   49152  2
nvme_core             135168  3 nvme
parport                69632  3 parport_pc,lp,ppdev
parport_pc             49152  0
pinctrl_tigerlake      32768  0
pmt_class              16384  1 pmt_telemetry
pmt_telemetry          16384  0
ppdev                  24576  0
processor_thermal_device    20480  1 processor_thermal_device_pci
processor_thermal_device_pci    16384  0
processor_thermal_mbox    16384  2 processor_thermal_rfim,processor_thermal_device
processor_thermal_rapl    20480  1 processor_thermal_device
processor_thermal_rfim    24576  1 processor_thermal_device
psmouse               176128  0
pstore_blk             16384  0
pstore_zone            32768  1 pstore_blk
r8169                 102400  0
raid6_pq              122880  1 btrfs
ramoops                32768  0
rapl                   20480  0
rc_core                65536  1 cec
realtek                32768  1
reed_solomon           28672  1 ramoops
rfcomm                 81920  16
sch_fq_codel           20480  2
sdhci                  81920  1 sdhci_pci
sdhci_pci              69632  0
serio_raw              20480  0
snd                   106496  23 snd_hda_codec_generic,snd_seq,snd_seq_device,snd_hda_codec_hdmi,snd_hwdep,snd_hda_intel,snd_hda_codec,snd_hda_codec_realtek,snd_timer,snd_compress,snd_soc_core,snd_pcm,snd_rawmidi
snd_compress           24576  1 snd_soc_core
snd_hda_codec         163840  5 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec_realtek,snd_soc_hdac_hda
snd_hda_codec_generic   102400  1 snd_hda_codec_realtek
snd_hda_codec_hdmi     77824  2
snd_hda_codec_realtek   159744  1
snd_hda_core          110592  9 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_ext_core,snd_hda_codec,snd_hda_codec_realtek,snd_sof_intel_hda_common,snd_soc_hdac_hda,snd_sof_intel_hda
snd_hda_ext_core       32768  3 snd_sof_intel_hda_common,snd_soc_hdac_hda,snd_sof_intel_hda
snd_hda_intel          53248  5
snd_hwdep              16384  1 snd_hda_codec
snd_intel_dspcfg       28672  2 snd_hda_intel,snd_sof_intel_hda_common
snd_intel_sdw_acpi     20480  2 snd_sof_intel_hda_common,snd_intel_dspcfg
snd_pcm               143360  10 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,soundwire_intel,snd_sof,snd_sof_intel_hda_common,snd_compress,snd_soc_core,snd_hda_core,snd_pcm_dmaengine
snd_pcm_dmaengine      16384  1 snd_soc_core
snd_rawmidi            49152  1 snd_seq_midi
snd_seq                77824  2 snd_seq_midi,snd_seq_midi_event
snd_seq_device         16384  3 snd_seq,snd_seq_midi,snd_rawmidi
snd_seq_midi           20480  0
snd_seq_midi_event     16384  1 snd_seq_midi
snd_soc_acpi           16384  2 snd_soc_acpi_intel_match,snd_sof_intel_hda_common
snd_soc_acpi_intel_match    61440  2 snd_sof_intel_hda_common,snd_sof_pci_intel_tgl
snd_soc_core          339968  4 soundwire_intel,snd_sof,snd_sof_intel_hda_common,snd_soc_hdac_hda
snd_soc_hdac_hda       24576  1 snd_sof_intel_hda_common
snd_sof               147456  2 snd_sof_pci,snd_sof_intel_hda_common
snd_sof_intel_hda      20480  1 snd_sof_intel_hda_common
snd_sof_intel_hda_common   102400  1 snd_sof_pci_intel_tgl
snd_sof_pci            20480  2 snd_sof_intel_hda_common,snd_sof_pci_intel_tgl
snd_sof_pci_intel_tgl    16384  0
snd_sof_xtensa_dsp     16384  1 snd_sof_intel_hda_common
snd_timer              40960  2 snd_seq,snd_pcm
soundcore              16384  1 snd
soundwire_bus          94208  3 soundwire_intel,soundwire_generic_allocation,soundwire_cadence
soundwire_cadence      36864  1 soundwire_intel
soundwire_generic_allocation    16384  1 soundwire_intel
soundwire_intel        40960  1 snd_sof_intel_hda_common
sparse_keymap          16384  2 intel_hid,tuxedo_keyboard
spl                   118784  6 zfs,icp,zzstd,znvpair,zcommon,zavl
syscopyarea            16384  1 drm_kms_helper
sysfillrect            20480  1 drm_kms_helper
sysimgblt              16384  1 drm_kms_helper
thunderbolt           319488  0
ttm                    86016  1 i915
tuxedo_io              24576  0
tuxedo_keyboard        49152  3 clevo_acpi,tuxedo_io,clevo_wmi
typec                  69632  1 typec_ucsi
typec_ucsi             45056  1 ucsi_acpi
ucsi_acpi              16384  0
usbhid                 65536  0
uvcvideo              106496  0
video                  65536  1 i915
videobuf2_common       77824  4 videobuf2_vmalloc,videobuf2_v4l2,uvcvideo,videobuf2_memops
videobuf2_memops       20480  1 videobuf2_vmalloc
videobuf2_v4l2         32768  1 uvcvideo
videobuf2_vmalloc      20480  1 uvcvideo
videodev              258048  3 videobuf2_v4l2,uvcvideo,videobuf2_common
wmi                    32768  3 nvidia_wmi_ec_backlight,clevo_wmi,mxm_wmi
x86_pkg_temp_thermal    20480  0
xhci_pci               24576  0
xhci_pci_renesas       20480  1 xhci_pci
xor                    24576  1 btrfs
x_tables               53248  1 ip_tables
zavl                   20480  1 zfs
zcommon               106496  2 zfs,icp
zfs                  3821568  6
zlua                  163840  1 zfs
znvpair                98304  2 zfs,zcommon
zstd_compress         229376  1 btrfs
zunicode              348160  1 zfs
zzstd                 491520  1 zfs
Technologicat commented 1 year ago

Ah, and I tested enabling TCO in CoreFreq, in Window > Technologies.

Here's corefreq-cli -s after doing that. It now says that Watchdog Timer (TCO) is ON.

Processor                                 [12th Gen Intel(R) Core(TM) i7-12700H]
|- Architecture                                                   [Alder Lake/H]
|- Vendor ID                                                      [GenuineIntel]
|- Microcode                                                        [0x00000421]
|- Signature                                                           [  06_9A]
|- Stepping                                                            [      3]
|- Online CPU                                                          [ 20/ 20]
|- Base Clock                                                          [ 99.557]
|- Frequency            (MHz)                      Ratio                        
                 Min    398.23                    <   4 >                       
                 Max   2688.03                    <  27 >                       
|- Factory                                                             [100.000]
                       2700                       [  27 ]                       
|- Performance                                                                  
   |- P-State                                                                   
                 TGT    597.34                    <   6 >                       
   |- HWP                                                                       
                 Min    597.34                    <   6 >                       
                 Max   2986.70                    <  30 >                       
                 TGT      AUTO                    <   0 >                       
|- Turbo Boost                                                         [ UNLOCK]
                  1C   4679.16                    <  47 >                       
                  2C   4679.16                    <  47 >                       
                  3C   4380.49                    <  44 >                       
                  4C   4380.49                    <  44 >                       
                  5C   4081.82                    <  41 >                       
                  6C   4081.82                    <  41 >                       
                  7C   4081.82                    <  41 >                       
                  8C   4081.82                    <  41 >                       
|- Hybrid                                                              [ UNLOCK]
                  1C   3484.22                    <  35 >                       
                  2C   3484.22                    <  35 >                       
                  3C   3484.22                    <  35 >                       
                  4C   3484.22                    <  35 >                       
                  5C   3285.12                    <  33 >                       
                  6C   3285.12                    <  33 >                       
                  7C   3285.12                    <  33 >                       
                  8C   3285.12                    <  33 >                       
|- Uncore                                                              [ UNLOCK]
                 Min    398.20                    <   4 >                       
                 Max   3981.96                    <  40 >                       
|- TDP                                                           Level <  0:3  >
   |- Programmable                                                     [ UNLOCK]
   |- Configuration                                                    [ UNLOCK]
   |- Turbo Activation                                                 [ UNLOCK]
             Nominal   2289.80                    [  23 ]                       
              Level1   1493.35                    [  15 ]                       
              Level2   2688.03                    [  27 ]                       
               Turbo   2190.24                    <  22 >                       

Instruction Set Extensions                                                      
|- 3DNow!/Ext [N/N]          ADX [Y]          AES [Y]  AVX/AVX2 [Y/Y] 
|- AVX512-F     [N]    AVX512-DQ [N]  AVX512-IFMA [N]   AVX512-PF [N] 
|- AVX512-ER    [N]    AVX512-CD [N]    AVX512-BW [N]   AVX512-VL [N] 
|- AVX512-VBMI  [N] AVX512-VBMI2 [N]  AVX512-VNNI [N]  AVX512-ALG [N] 
|- AVX512-VPOP  [N] AVX512-VNNIW [N] AVX512-FMAPS [N] AVX512-VP2I [N] 
|- AVX512-BF16  [N] AVX-VNNI-VEX [Y]      MOVDIRI [Y]   MOVDIR64B [Y] 
|- BMI1/BMI2  [Y/Y]         CLWB [Y]      CLFLUSH [Y] CLFLUSH-OPT [Y] 
|- CLAC-STAC    [Y]         CMOV [Y]    CMPXCHG8B [Y]  CMPXCHG16B [Y] 
|- F16C         [Y]          FPU [Y]         FXSR [Y]   LAHF-SAHF [Y] 
|- ENQCMD       [N]         GFNI [Y]        OSPKE [Y]     WAITPKG [Y] 
|- MMX/Ext    [Y/N] MON/MWAITX [Y/N]        MOVBE [Y]   PCLMULQDQ [Y] 
|- POPCNT       [Y]       RDRAND [Y]       RDSEED [Y]      RDTSCP [Y] 
|- SEP          [Y]          SHA [Y]          SSE [Y]        SSE2 [Y] 
|- SSE3         [Y]        SSSE3 [Y]  SSE4.1/4A [Y/N]      SSE4.2 [Y] 
|- SERIALIZE    [Y]      SYSCALL [Y]        RDPID [Y]         SGX [N] 
|- VAES         [Y]   VPCLMULQDQ [Y]   PREFETCH/W [Y]       LZCNT [Y] 

Features                                                                        
|- 1 GB Pages Support                                      1GB-PAGES   [Capable]
|- Advanced Configuration & Power Interface                     ACPI   [Capable]
|- Advanced Programmable Interrupt Controller                   APIC   [Capable]
|- APIC Timer Invariance                                        ARAT   [Capable]
|- Core Multi-Processing                                  CMP Legacy   [Missing]
|- L1 Data Cache Context ID                                  CNXT-ID   [Missing]
|- Direct Cache Access                                           DCA   [Missing]
|- Debugging Extension                                            DE   [Capable]
|- Debug Store & Precise Event Based Sampling               DS, PEBS   [Capable]
|- CPL Qualified Debug Store                                  DS-CPL   [Capable]
|- 64-Bit Debug Store                                         DTES64   [Capable]
|- Fast Short REP CMPSB                                         FSRC   [Missing]
|- Fast Short REP MOVSB                                         FSRM   [Capable]
|- Fast Short REP STOSB                                         FSRS   [Capable]
|- Fast Zero-length REP MOVSB                                   FZRM   [Missing]
|- Fast-String Operation                                        ERMS   [Capable]
|- Fused Multiply Add                                     FMA | FMA4   [Capable]
|- Hardware Feedback Interface                                   HFI   [Capable]
|- Hardware Lock Elision                                         HLE   [Missing]
|- History Reset                                              HRESET   [Capable]
|- Hybrid part processor                                      HYBRID   [Capable]
|- Instruction Based Sampling                                    IBS   [Missing]
|- Instruction INVPCID                                       INVPCID   [Capable]
|- Long Mode 64 bits                                       IA64 | LM   [Capable]
|- Linear Address Masking                                        LAM   [Missing]
|- LightWeight Profiling                                         LWP   [Missing]
|- Machine-Check Architecture                                    MCA   [Capable]
|- Memory Protection Extensions                                  MPX   [Missing]
|- Model Specific Registers                                      MSR   [Capable]
|- Memory Type Range Registers                                  MTRR   [Capable]
|- OS-Enabled Ext. State Management                          OSXSAVE   [Capable]
|- Physical Address Extension                                    PAE   [Capable]
|- Page Attribute Table                                          PAT   [Capable]
|- Pending Break Enable                                          PBE   [Capable]
|- Platform Configuration                                    PCONFIG   [Missing]
|- Process Context Identifiers                                  PCID   [Capable]
|- Perfmon and Debug Capability                                 PDCM   [Capable]
|- Page Global Enable                                            PGE   [Capable]
|- Page Size Extension                                           PSE   [Capable]
|- 36-bit Page Size Extension                                  PSE36   [Capable]
|- Processor Serial Number                                       PSN   [Missing]
|- Write Data to a Processor Trace Packet                    PTWRITE   [Capable]
|- Resource Director Technology/PQE                            RDT-A   [Missing]
|- Resource Director Technology/PQM                            RDT-M   [Missing]
|- Restricted Transactional Memory                               RTM   [Missing]
|- Safer Mode Extensions                                         SMX   [Capable]
|- Self-Snoop                                                     SS   [Capable]
|- Supervisor-Mode Access Prevention                            SMAP   [Capable]
|- Supervisor-Mode Execution Prevention                         SMEP   [Capable]
|- Thread Director                                                TD   [Capable]
|- Time Stamp Counter                                            TSC [Invariant]
|- Time Stamp Counter Deadline                          TSC-DEADLINE   [Capable]
|- TSX Force Abort MSR Register                            TSX-ABORT   [Missing]
|- TSX Suspend Load Address Tracking                       TSX-LDTRK   [Missing]
|- User-Mode Instruction Prevention                             UMIP   [Capable]
|- Virtual Mode Extension                                        VME   [Capable]
|- Virtual Machine Extensions                                    VMX   [Capable]
|- Write Back & Do Not Invalidate Cache                     WBNOINVD   [Missing]
|- Extended xAPIC Support                                     x2APIC   [ x2APIC]
|- Execution Disable Bit Support                              XD-Bit   [Capable]
|- XSAVE/XSTOR States                                          XSAVE   [Capable]
|- xTPR Update Control                                          xTPR   [Capable]
Mitigation mechanisms                                                           
|- Indirect Branch Restricted Speculation                       IBRS   [ Enable]
|- Indirect Branch Prediction Barrier                           IBPB   [Capable]
|- Single Thread Indirect Branch Predictor                     STIBP   [Capable]
|- Speculative Store Bypass Disable                             SSBD   [Capable]
|- Writeback & invalidate the L1 data cache                L1D-FLUSH   [Capable]
|- Hypervisor - No flush L1D on VM entry            L1DFL_VMENTRY_NO   [ Enable]
|- Arch - Buffer Overwriting                                MD-CLEAR   [Capable]
|- Arch - No Rogue Data Cache Load                           RDCL_NO   [ Enable]
|- Arch - Enhanced IBRS                                     IBRS_ALL   [ Enable]
|- Arch - Return Stack Buffer Alternate                         RSBA   [Capable]
|- Arch - No Speculative Store Bypass                         SSB_NO   [Capable]
|- Arch - No Microarchitectural Data Sampling                 MDS_NO   [ Enable]
|- Arch - No TSX Asynchronous Abort                           TAA_NO   [ Enable]
|- Arch - No Page Size Change MCE                     PSCHANGE_MC_NO   [ Enable]
|- Arch - STLB QoS                                              STLB   [ Enable]
|- Arch - Functional Safety Island                              FuSa   [ Enable]
|- Arch - RSM in CPL0 only                                       RSM   [ Enable]
|- Arch - Split Locked Access Exception                         SPLA   [ Enable]
|- Arch - Snoop Filter QoS Mask                         SNOOP_FILTER   [ Enable]
|- Arch - No Fast Predictive Store Forwarding                   PSFD   [Capable]
|- Arch - Data Operand Independent Timing Mode                 DOITM   [Capable]
|- Arch - Not affected by SBDR or SSDP                  SBDR_SSDP_NO   [ Enable]
|- Arch - No Fill Buffer Stale Data Propagator              FBSDP_NO   [ Enable]
|- Arch - No Primary Stale Data Propagator                   PSDP_NO   [ Enable]
|- Arch - Overwrite Fill Buffer values                      FB_CLEAR   [Capable]
|- Arch - Special Register Buffer Data Sampling                SRBDS   [ Unable]
   |- RDRAND and RDSEED mitigation                             RNGDS   [ Unable]
   |- Restricted Transactional Memory                            RTM   [ Unable]
   |- Verify Segment for Writing instruction                    VERW   [ Unable]
|- Arch - Restricted RSB Alternate                             RRSBA   [ Enable]
|- Arch - No Branch Target Injection                          BHI_NO   [Capable]
|- Arch - Legacy xAPIC Disable                             XAPIC_DIS   [ Unable]
|- Arch - No Post-Barrier Return Stack Buffer               PBRSB_NO   [Capable]
|- Arch - IPRED disabled for CPL3                        IPRED_DIS_U   [Capable]
|- Arch - IPRED disabled for CPL0/1/2                    IPRED_DIS_S   [Capable]
|- Arch - RRSBA disabled for CPL3                        RRSBA_DIS_U   [Capable]
|- Arch - RRSBA disabled for CPL0/1/2                    RRSBA_DIS_S   [Capable]
|- Arch - BHI disabled for CPL0/1/2                        BHI_DIS_S   [Capable]
|- No MXCSR Configuration Dependent Timing                   MCDT_NO   [ Unable]
Security Features                                                               
|- CPUID Key Locker                                               KL   [Capable]
|- AES Key Locker instructions                                AESKLE   [Missing]
|- AES Wide Key Locker instructions                          WIDE_KL   [Capable]
|- Software Guard SGX1 Extensions                               SGX1   [Missing]
|- Software Guard SGX2 Extensions                               SGX2   [Missing]

Technologies                                                                    
|- Data Cache Unit                                                              
   |- L1 Prefetcher                                                L1 HW   < ON>
   |- L1 IP Prefetcher                                          L1 HW IP   < ON>
   |- L2 Prefetcher                                                L2 HW   < ON>
   |- L2 Line Prefetcher                                        L2 HW CL   < ON>
|- System Management Mode                                       SMM-Dual   [ ON]
|- Hyper-Threading                                                   HTT   [ ON]
|- SpeedStep                                                        EIST   < ON>
|- Dynamic Acceleration                                              IDA   [ ON]
|- Turbo Boost Max 3.0                                             TURBO   < ON>
|- Energy Efficiency Optimization                                    EEO   < ON>
|- Race To Halt Optimization                                         R2H   <OFF>
|- Watchdog Timer                                                    TCO   < ON>
|- Virtualization                                                    VMX   [ ON]
   |- I/O MMU                                                       VT-d   [ ON]
   |- Version                                                     [         4.0]
   |- Hypervisor                                                           [OFF]
   |- Vendor ID                                                   [         N/A]

Performance Monitoring                                                          
|- Version                                                        PM       [  5]
|- Counters:          General                   Fixed                           
|           {  6,  0,  0 } x 48 bits            3 x 48 bits                     
|- Enhanced Halt State                                           C1E       < ON>
|- C1 Auto Demotion                                              C1A       < ON>
|- C3 Auto Demotion                                              C3A       <OFF>
|- C1 UnDemotion                                                 C1U       < ON>
|- C3 UnDemotion                                                 C3U       <OFF>
|- C6 Core Demotion                                              CC6       <OFF>
|- C6 Module Demotion                                            MC6       <OFF>
|- Legacy Frequency ID control                                   FID       [OFF]
|- Legacy Voltage ID control                                     VID       [OFF]
|- P-State Hardware Coordination Feedback                MPERF/APERF       [ ON]
|- Hardware Duty Cycling                                         HDC       [OFF]
|- Package C-States                                                             
   |- Configuration Control                                   CONFIG   [   LOCK]
   |- Lowest C-State                                           LIMIT   <     C0>
   |- I/O MWAIT Redirection                                  IOMWAIT   <Disable>
   |- Max C-State Inclusion                                    RANGE   <     C8>
|- Core C-States                                                                
   |- C-States Base Address                                      BAR   [ 0x1814]
|- ACPI Processor C-States                                      _CST   [      3]
|- MONITOR/MWAIT                                                                
   |- State index:    #0    #1    #2    #3    #4    #5    #6    #7              
   |- Sub C-State:     0     2     0     2     0     1     0     1              
|- Core Cycles                                                         [Capable]
|- Instructions Retired                                                [Capable]
|- Reference Cycles                                                    [Capable]
|- Last Level Cache References                                         [Capable]
|- Last Level Cache Misses                                             [Capable]
|- Branch Instructions Retired                                         [Capable]
|- Branch Mispredicts Retired                                          [Capable]
|- Top-down slots Counter                                              [Capable]
|- Processor Performance Control                                _PCT   [ Enable]
|- Performance Supported States                                 _PSS   [      0]
|- Performance Present Capabilities                             _PPC   [      0]

Power, Current & Thermal                                                        
|- Temperature Offset:Junction                                 TjMax <  2:100 C>
|- Clock Modulation                                             ODCM   <Disable>
   |- DutyCycle                                                        [  0.00%]
|- Power Management                                         PWR MGMT   [ UNLOCK]
   |- Energy Policy                                        Bias Hint   <      6>
   |- Energy Policy                                          HWP EPP   <    128>
|- Digital Thermal Sensor                                        DTS   [Capable]
|- Power Limit Notification                                      PLN   [Capable]
|- Package Thermal Management                                    PTM   [Capable]
|- Thermal Monitor 1                                             TM1   [ Enable]
|- Thermal Monitor 2                                             TM2   [Capable]
|- Thermal Design Power                                          TDP   [   45 W]
   |- Minimum Power                                              Min   [Missing]
   |- Maximum Power                                              Max   [Missing]
|- Thermal Design Power                                      Package   < Enable>
   |- Power Limit                                                PL1   <  115 W>
   |- Time Window                                                TW1   <  1m20s>
   |- Power Limit                                                PL2   <  125 W>
   |- Time Window                                                TW2   <   2 ms>
|- Thermal Design Power                                         Core   <Disable>
   |- Power Limit                                                PL1   <    0 W>
   |- Time Window                                                TW1   < 976 us>
|- Thermal Design Power                                       Uncore   <Disable>
   |- Power Limit                                                PL1   <    0 W>
   |- Time Window                                                TW1   < 976 us>
|- Thermal Design Power                                         DRAM   <Disable>
   |- Power Limit                                                PL1   <    0 W>
   |- Time Window                                                TW1   < 976 us>
|- Thermal Design Power                                     Platform   < Enable>
   |- Power Limit                                                PL1   <  230 W>
   |- Time Window                                                TW1   <   28 s>
   |- Power Limit                                                PL2   <  245 W>
   |- Time Window                                                TW2   < 976 us>
|- Electrical Design Current                                     EDC   [Missing]
|- Thermal Design Current                                        TDC   [Missing]
|- Core Thermal Point                                                           
   |- DTS Threshold #1                                     Threshold   [Missing]
   |- DTS Threshold #2                                     Threshold   [Missing]
|- Package Thermal Point                                                        
   |- DTS Threshold #1                                     Threshold   [Missing]
   |- DTS Threshold #2                                     Threshold   [Missing]
|- Units                                                                        
   |- Power                                               watt   [  0.125000000]
   |- Energy                                             joule   [  0.000061035]
   |- Window                                            second   [  0.000976562]
Technologicat commented 1 year ago

Cause of DRAM frequency discrepancy possibly found.

After booting the system, the memory bus is at 4000 MHz. But if I suspend and resume, it drops to 1900 MHz.

Next, to figure out why this happens...

By the way, thanks for developing CoreFreq, which makes this kind of detailed analysis possible! I originally installed CoreFreq to be able to monitor individual core clock rates and temperatures, for performance profile tuning. :)

cyring commented 1 year ago

After booting the system, the memory bus is at 4000 MHz. But if I suspend and resume, it drops to 1900 MHz.

Thanks for your feedback. I use the Uncore clock ratio in the Bus rate computation for Intel 11th generation and later. The ratio may fluctuate under certain circumstances.

https://github.com/cyring/CoreFreq/blob/1cd8f358f2a2e46fef11247b96f2dbdd9941ee0e/intel_reg.h#L4544

Technologicat commented 1 year ago

Ok, good to know, thanks.

The weird thing is, once I suspend/resume, the reported bus rate stays around 1900 MHz all the way until the next boot - it doesn't fluctuate. Also, the 4000 MHz after a fresh boot remains stable all the way until I suspend/resume the machine, it doesn't fluctuate either.

It also doesn't matter whether the machine is stressed or not, so if the reading is accurate, whatever is happening doesn't seem to be a dynamic performance scaling issue.

I read about Alder Lake and XMP 3.0, but this BIOS has no settings for that, either, so all I have to go on regarding the memory bus rate is what CoreFreq tells me.

Or what other tools tell me - for comparison, I tried CPU-X, which says "Kingston KF548S38-16, 16384 MB @ 4800 MHz (SODIMM DDR5)" for each of the two RAM slots (this is after a suspend and resume).

I've run some test loads (AI training on GPU using a custom code built on TensorFlow; and an MPI-distributed FEM code on CPU, specifically a custom Navier-Stokes solver built on FEniCS). Performance for both seems normal after suspend/resume, but to be sure, I'll have to re-check with a fresh boot.

cyring commented 1 year ago

The weird thing is, once I suspend/resume, the reported bus rate stays around 1900 MHz all the way until the next boot - it doesn't fluctuate.

What is not trivial to read from source code is that CSR Registers like ADL_SA_Pll are processed once during CoreFreq start up. It is possible that several loads/unloads of the driver corefreqk.ko end up with different UCLK_RATIO value readings.

cyring commented 1 year ago

I read about Alder Lake and XMP 3.0, but this BIOS has no settings for that, either, so all I have to go on regarding the memory bus rate is what CoreFreq tells me.

To read IMC data and up to the third group of timings, there are HWiNFO and OCCT in the Windows world.

Memtest86+, running on bare metal, may provide the primary group of timings on Alder Lake as well as the IMC frequency.

Technologicat commented 1 year ago

What is not trivial to read from source code is that CSR Registers like ADL_SA_Pll are processed once during CoreFreq start up

Ah, thanks!

Restarting CoreFreq (service corefreqd stop; rmmod corefreqk; modprobe corefreqk; service corefreqd start) didn't affect the reading, though - still 1900 MHz.

Windows solutions are unfortunately not applicable, as this machine is Linux-only.

I can download and try a recent Memtest86+, though, to see what it reports on the bare metal. Must be a decade since I've last run that utility :)

cyring commented 1 year ago

Other settings, but same story in the past: I remember starting to observe discrepancies after any S3 resume on the Nehalem architecture, where unlocked Turbo ratios were restored to the original manufacturer frequencies and not to the last BIOS-chosen values.

Technologicat commented 1 year ago

Ok. This is getting weirder. After leaving the machine idle for half an hour:

                          Intel ADL PCH-P  [5182]
Controller #0                                                Dual Channel
 Bus Rate   400 MHz       Bus Speed  398 MHz           DDR5 Speed 2389 MHz

 Cha    CL RCDr RCDw   RP  RAS RRDs RRDl  FAW   WR RTPr WTPr  CWL  CKE  CMD
  #0    38   38   38   38   69    8   12   32   75   17  115   36   18   2T
  #1    38   38   38   38   69    8   12   32   75   17  115   36   18   2T
      sgRR dgRR drRR ddRR      sgRW dgRW drRW ddRW      sgWR dgWR drWR ddWR
  #0    12    8   14   14        18   18   20   22        70   52   12   12
  #1    12    8   14   14        18   18   20   22        70   52   12   12
      sgWW dgWW drWW ddWW                REFI  RFC  XS   XP CPDED GEAR  ECC
  #0    26    8   14   14                4680  383  706   18   12    2    0
  #1    26    8   14   14                4680  383  706   18   12    2    0

 DIMM Geometry for channel #0
      Slot Bank Rank     Rows   Columns    Memory Size (MB)
       #0    16    1    131072      1024          16384         KF548S38-16
       #1
 DIMM Geometry for channel #1
      Slot Bank Rank     Rows   Columns    Memory Size (MB)
       #0    16    1    131072      1024          16384         KF548S38-16
       #1

That in itself is fair enough for power saving, as 400 MHz is the lowest supported frequency, at least on the CPU - but when did the clock rate switch happen?

I haven't seen the 4 GHz appear again. Other than that 400 MHz after a long idle break, the memory bus clock rate has been at 1900 MHz regardless of load.

Performance of my test loads after a fresh boot was pretty much identical to my earlier results after a suspend/resume cycle, which might hint that the memory bus ran at the same speed. Either that, or the loads are not memory intensive enough for the performance to differ. :)

In other news, upgraded kernel to 6.1.0-1003-tuxedo (from TUXEDO Computers, a CLEVO-based Linux laptop vendor), as I figured a newer kernel might have better Alder Lake support (although the scheduler recognizes the P and E cores just fine also in 5.15.0-60-generic from Linux Mint). The kernel change didn't affect the behavior of the memory bus as reported by CoreFreq (after a make clean, then make against the new kernel, sudo checkinstall, and sudo depmod).

Running the new kernel, I was able to run a session of Cyberpunk without crashing, but oddly enough, only from a fresh boot. If the machine has been suspended and resumed, the game still exhibits the same random crashing as before. But the crashes have been random enough that I can't be sure if the kernel change fixed them or not - need more testing.

To narrow down the cause of the crashes, I also tried the older 515 version of the NVIDIA drivers. No change other than no software TDP limiter support in nvidia-smi. Ended up reverting to the newer 525. The TDP limiter is a cool feature to have, pun intended.

Haven't looked at the bare metal readings with MemTest86+ yet - I'll keep you posted.

cyring commented 1 year ago

EDIT: I will come back with another monitoring of those frequencies in about two weeks. So far, I'm trying to make the IMC features stable as requested below.

cyring commented 1 year ago

Hello,

You can pull the master branch, where the IMC Bus is now computed from the BIOS PLL ratio. The resulting frequency should be static.

Reading this Intel post, I believe that the Gear mode should be taken into account, but I'm not sure which registers are involved.

Can you please show the refreshed output of corefreq-cli -M?

cyring commented 1 year ago

EDIT: DRAM Speed is now converted to MT/s unit.

This is what it looks like with this Tiger Lake:

[Screenshot: 2023-02-18-232511_644x354_scrot]

Technologicat commented 1 year ago

Hi,

Nice.

Sure, here's corefreq-cli -M using the new version from master (2e293e5102cd6c85b3454902fdaf2c9659dfe017). If it matters, this is after a suspend/resume, and the Linux kernel is the 6.1.0-1003-tuxedo that I installed last week.

It's now reporting ~2.4 GHz with ~4.8 GT/s, as one would expect from this setup.

                          Intel ADL PCH-P  [5182]                          
Controller #0                                                Dual Channel  
 Bus Rate  2400 MHz       Bus Speed 2389 MHz           DDR5 Speed 4779 MT/s

 Cha    CL RCDr RCDw   RP  RAS RRDs RRDl  FAW   WR RTPr WTPr  CWL  CKE  CMD
  #0    38   38   38   38   69    8   12   32   75   17  115   36   18   2T
  #1    38   38   38   38   69    8   12   32   75   17  115   36   18   2T
      sgRR dgRR drRR ddRR      sgRW dgRW drRW ddRW      sgWR dgWR drWR ddWR
  #0    12    8   14   14        18   18   20   22        70   52   12   12
  #1    12    8   14   14        18   18   20   22        70   52   12   12
      sgWW dgWW drWW ddWW                REFI  RFC  XS   XP CPDED GEAR  ECC
  #0    26    8   14   14                4680  383  706   18   12    2    0
  #1    26    8   14   14                4680  383  706   18   12    2    0

 DIMM Geometry for channel #0                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0    16    1    131072      1024          16384         KF548S38-16
       #1                                                                  
 DIMM Geometry for channel #1                                              
      Slot Bank Rank     Rows   Columns    Memory Size (MB)                
       #0    16    1    131072      1024          16384         KF548S38-16
       #1                                            
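
As a sanity check on these figures, here is a minimal Python sketch of the arithmetic. It assumes (this is not confirmed from the CoreFreq source) that the bus speed is the measured base clock times a memory PLL ratio of 24, and that the DDR5 transfer rate is twice that clock. The 99.565 MHz base clock is the value from the first printout in this issue; the function name is hypothetical.

```python
def ddr_speed(base_clock_mhz: float, pll_ratio: int) -> tuple[int, int]:
    """Return (bus speed in MHz, DDR transfer rate in MT/s), truncated to integers."""
    bus_mhz = base_clock_mhz * pll_ratio  # memory clock from the measured base clock
    mts = 2 * bus_mhz                     # double data rate: two transfers per clock
    return int(bus_mhz), int(mts)


# Base clock from the first printout in this issue, ratio 24 assumed.
print(ddr_speed(99.565, 24))
```

With these inputs the sketch gives 2389 MHz and 4779 MT/s, which matches the Bus Speed and DDR5 Speed fields above.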

Also, in other news - for debugging the random crashes (of this machine, not CoreFreq :) ), I'm now running on the discrete NVIDIA GPU only, to take Optimus out of the equation.

Disabling the integrated GPU seems to have solved a couple of the issues I was having with the machine:

Optimus has worked fine on other Intel/NVIDIA laptops I've used it on (4th and 10th gen i7), but new gen, new bugs, I suppose.

I'd have preferred to use Optimus to squeeze out the last bit of discrete GPU memory for AI and games, as well as to save a few watts when idle, but if disabling Optimus means the machine runs without issues, I'll do so for now, and check again with new driver and kernel versions in a year or two.

I still need to test with MemTest86+, to get us some numbers for the IMC from the bare metal. I'll get back to you on that as soon as I have the time.

Technologicat commented 1 year ago

Regarding gear mode, yeah, should be taken into account. Also, doesn't seem to be documented well. Out of curiosity, I took a look at the guide, and gear is not mentioned anywhere on any of the 5060 pages of the document, despite being up to date up to 13th gen. :)

I noticed that Tom's hardware mentioned that a tool called HWInfo64 can read the gear mode. I've never used that - probably a Windows thing - and I don't know whether the author wants to share how he did it or not, but maybe ask him?

EDIT: Ah, right, you mentioned HWInfo already :)

cyring commented 1 year ago

You should rather read the datasheets of the 13th generation, where gear is somewhat specified. See my wiki for doc references.

Technologicat commented 1 year ago

Ah, thank you!

According to vol. 2 of the Raptor Lake datasheet, section Scheduler Configuration (SC_GS_CFG_0_0_0_MCHBAR) — Offset E088h (pp. 154-156 according to the printed page numbers), seems that we should be able to read the gear mode from the 64-bit MMIO register at SC_GS_CFG_0_0_0_MCHBAR plus offset E088h.

If bit 31 is set, the MC is in GEAR2, and if bit 15 is set, the MC is in GEAR4:

this register is used for Scheduler configuration

31
Gear2 Mode (GEAR2):
Indicate that MC is working in Gear-2 (Qclk is half the data transfer clock of the DRAM)

15
Gear4 Mode (GEAR4):
Indicate that MC is working in Gear4 (Qclk is quarter the data transfer clock of the DRAM)

On a side note, OS access to this register is R - only the BIOS has RW access. Which is probably a good thing. :P

For comparison, I also checked vol. 2 of the Alder Lake datasheet, same section (pp. 137-139), and the offset and the bit numbers are the same. So we should be able to read the gear the same way in both gen 12 and gen 13.
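
A minimal Python sketch of that check, operating on a raw value that has already been read from the register at MCHBAR offset E088h (the read itself has to happen in the driver). The function name is hypothetical and the bit positions are the ones quoted above. Treating "neither bit set" as Gear 1, and giving GEAR4 priority if both bits were set, are assumptions.

```python
GEAR2_BIT = 31  # "Gear2 Mode (GEAR2)" per the datasheet excerpt
GEAR4_BIT = 15  # "Gear4 Mode (GEAR4)" per the datasheet excerpt


def gear_from_sc_gs_cfg(value: int) -> int:
    """Return 1, 2 or 4 for a raw SC_GS_CFG value (hypothetical helper)."""
    if value & (1 << GEAR4_BIT):
        return 4  # assumption: GEAR4 wins if both bits are set
    if value & (1 << GEAR2_BIT):
        return 2
    return 1      # assumption: neither bit set means Gear 1


print(gear_from_sc_gs_cfg(1 << 31))
```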

Do you want to have a go at implementing this (I can test it), or alternatively, care to point me at the relevant part of the source code so I can try?


EDIT: added the note that this register is used for Scheduler configuration, as per the docs.

Technologicat commented 1 year ago

Hmm, there's also a 32-bit MMIO register at Memory Controller BIOS Request (MC_BIOS_REQ_0_0_0_MCHBAR_PCU) — Offset 5E00h (pp. 202-203 in Raptor Lake docs, pp. 184-185 in Alder Lake docs):

This register allows BIOS to request Memory Controller clock frequency.

13:12
Gear Type (GEAR):
0h: Gear1 (Default) - DDR bus clock is the same as QCLK
1h: Gear2 - DDR PHY bus clock is double of QCLK
2h: Gear4 - DDR PHY bus clock is quad of QCLK

And then there's a 32-bit MMIO register at Memory Controller BIOS Data (MC_BIOS_DATA_0_0_0_MCHBAR_PCU) — Offset 5E04h (pp. 203-204; respectively pp. 186-187):

Memory Controller Frequency information for BIOS, during MRC flow.
Reflects the last frequency requested in MC_BIOS_REQ_0_0_0_MCHBAR_PCU.
In case of Dual MRC for System Agent SpeedStep, the value will change according to
the MRC requests.
Post MRC will hold the last MRC request and not the current memory frequency.

13:12
Gear Type (GEAR):
0h: Gear1 (Default) - DDR bus clock is the same as QCLK
1h: Gear2 - DDR PHY bus clock is double of QCLK
2h: Gear4 - DDR PHY bus clock is quad of QCLK

OS has R access to both of these registers, too.

Right now, I don't know which of these is best - maybe try all of them and see what they report?

cyring commented 1 year ago

@Technologicat I'm away from the lab for a week. However, you can investigate those registers in the driver, in function Query_ADL_IMC (indeed, the same query is used for Tiger Lake and Alder Lake).

You can peek at the register value by adding its address offset to the remap base address:

void Query_ADL_IMC(void __iomem *mchmap, unsigned short mc)
{   /*Source: 12th Generation Intel® Core Processor Datasheet Vol 2 */
    unsigned short cha;
    unsigned int value = 0;
    /* Debug only: peek the 32-bit register at offset 0x5E04 from the remap base */
    value = readl(mchmap+0x5E04);
    printk("Register=%x\n", value);

Next, rebuild, reload the driver, and print the kernel log to read the register output in hexadecimal.

make clean all
rmmod corefreqk
insmod ./corefreqk.ko
dmesg
Register=abcd1234

Technologicat commented 1 year ago

I'll also be away for a few days, so a short update for now:

Also, while this is strictly unrelated, I've babbled so much about my setup in this thread that other users with a CLEVO PD5x_7xPNP1_PNR1_PNN1_PNT1 experiencing random system crashes will likely end up here from search engines. :)

(More related to RAM, in general, is the random fact that the VRAM on the GPU runs on a 7 GHz clock rate. I hadn't realized GDDR was that fast.)

So I'll report that I tried underclocking the GPU (cores as well as VRAM) by 10%. This did it - it seems the crashes are gone.

My hypothesis is that the crashing is likely caused by the infamous transient power spikes of the RTX 30xx GPU series, briefly overwhelming the power supply capabilities of the laptop. The power brick is 200W, but there is also the battery subsystem to consider. (I have no idea whether the power always passes through the battery subsystem on this model.)

Note the GPU TDP is 125W, and for the i7-12700H CPU, 45W. The rest of the system also needs some power. So if the GPU draws much more power even for a short while, the system may brown-out and crash.

Also note that the crashes occur even with the GPU sustained power draw limited to 80W (sudo nvidia-smi --power-limit=80).

So if you ended up here from a search engine, and are experiencing similar issues on a similar laptop, here's what I did:

As a concluding side note, another solution for troubleshooting GPU power issues, seen on the internet, is:

nvidia-settings -a "[gpu:0]/GpuPowerMizerMode=1"

This disables performance scaling of the GPU, leading to a more predictable power draw. Note that in this mode, the GPU will consume more power when idle.

You can also change the setting in the GUI, as well as see the current value of the setting, by running just nvidia-settings.

Also note that changing the PowerMizer mode has mostly been suggested for "the GPU has fallen off the bus" errors, not for random system crashes.

I tried on both Adaptive and Prefer Maximum Performance, and for me, there was no difference in system stability. So I left the PowerMizer setting on Auto, which at least on my system uses the adaptive mode.

Technologicat commented 1 year ago

Ok, I'm back. Here's the raw data from registers 0x5E00 and 0x5E04 (sampled both out of curiosity):

[40705.433107] Register 0x5E00 = 0x39b81118
[40705.433108] Register 0x5E04 = 0x39b81118

Or in binary, 0x39b81118 -> 0011 1001 1011 1000 0001 0001 0001 1000. Decoding this according to the Alder Lake datasheet (pp. 186-187):

Grouping the bits:
[0][011 1][001 1011 100][0 00][01] [0001] [0001 1000]

31 Reserved
30:27 VDDQ_TX_ICCMAX = [011 1] = 7 [* 0.25 A] = 1.75 A.
26:17 VDDQ_TX_VOLTAGE = [001 1011 100] = 220 [* 5 mV] = 1.1 V.
16:14 Reserved
13:12 GEAR = [01] = Gear 2: DDR PHY bus clock is double of QCLK.
11:8 MC_PLL_REF = [0001] = MC frequency request for 100MHz Qclk granularity.
7:0 MC_PLL_RATIO = [0001 1000] = QCLK 24 [* MC_PLL_REF granularity, here 100 MHz] = 2.4 GHz.
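
The manual decode above can be reproduced with a short Python sketch. The field boundaries and step sizes are the ones listed above; the helper names are hypothetical. Two assumptions: an all-ones read is treated as "nothing readable", and only the 100 MHz reference case (MC_PLL_REF = 1) is converted to a QCLK frequency, since that is the only encoding decoded here.

```python
def field(value: int, hi: int, lo: int) -> int:
    """Extract bits hi..lo (inclusive) from value."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


GEAR_BY_CODE = {0: 1, 1: 2, 2: 4}  # bits 13:12 encoding quoted from the datasheet


def decode_mc_bios(value: int):
    """Decode a raw 32-bit MC_BIOS_REQ / MC_BIOS_DATA value into a dict."""
    if value == 0xFFFFFFFF:
        return None  # assumption: all ones means the register is not readable
    ref_code = field(value, 11, 8)
    ratio = field(value, 7, 0)
    return {
        "iccmax_a": field(value, 30, 27) * 0.25,  # 0.25 A per step
        "vddq_mv": field(value, 26, 17) * 5,      # 5 mV per step
        "gear": GEAR_BY_CODE.get(field(value, 13, 12)),
        "pll_ref_code": ref_code,
        "pll_ratio": ratio,
        # Only the 100 MHz granularity (code 1) is taken from the decode above.
        "qclk_mhz": ratio * 100 if ref_code == 1 else None,
    }


print(decode_mc_bios(0x39B81118))
```

For 0x39b81118 this yields 1.75 A, 1100 mV, Gear 2, ratio 24 and a 2400 MHz QCLK, in agreement with the manual decode.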

As for VDDQ_TX_ICCMAX, is there a shift needed? It's only 4 bits, but the datasheet says the max allowed is 32 * 0.25 A = 8 A. Or does this value simply saturate if the peak current exceeds 15 steps i.e. 3.75 A?


In unrelated news, I spoke too soon - the crashes weren't gone yet, just occurred much less often. No crash in up to 2h of gaming, then boom. It's still very rare to get the system to crash with anything other than Cyberpunk, but the fact it has happened twice with other loads tells me it's not the game. But because the crash occurs the most often with it, this specific game is an ideal test load.

Trying the Unigine Valley GPU benchmark, I noticed that when power-limited to 80W, the GPU clock rate jumps around a lot. Reading this comment on undervolting got me thinking, and I recalled P ∝ V² f from TI's report on CMOS power consumption. So if a power limit is enabled, it might be useful to slow down the clock rate to match?

Indeed, locking the maximum GPU clock rate at 1.785 GHz * (80W / 125W) ~ 1.1 GHz, where 125W is the default TDP and 1.785 GHz the default GPU clock rate, the GPU at full load stays just under the power limit, and does not need to throttle the clock rate, according to monitoring via nvtop. No need to touch the VRAM clock rate, it can run at the default 7 GHz.
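
The estimate above as a small Python sketch. It assumes power scales linearly with clock rate at a fixed voltage, which is only a rough first-order reading of P ∝ V² f; the function name is hypothetical.

```python
def scaled_clock_mhz(default_clock_mhz: float,
                     default_power_w: float,
                     power_limit_w: float) -> float:
    """Scale the clock by the same factor as the power limit (linear model)."""
    return default_clock_mhz * power_limit_w / default_power_w


# Default 1785 MHz at 125 W, limited to 80 W.
print(scaled_clock_mhz(1785, 125, 80))
```

This gives about 1142 MHz, which rounds to the ~1.1 GHz used above.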

Tested yesterday. Two hours of idling in-game, and one hour of gaming (later, in another session).

So far stable.

(And as a note for other gamers who happen upon this, 80W / 1.1 GHz on the RTX 3070 Ti mobile gives about 30-40 FPS, which to me is perfectly playable in a role-playing game. The much lower fan noise level is well worth the FPS hit. Your mileage may vary.)

cyring commented 1 year ago

As for VDDQ_TX_ICCMAX, is there a shift needed? It's only 4 bits, but the datasheet says the max allowed is 32 * 0.25 A = 8 A. Or does this value simply saturate if the peak current exceeds 15 steps i.e. 3.75 A?

Perhaps, if not null (i.e. the register value is not equal to 0xffffffff), we have to add one to the value before applying the formula.

... I noticed that when power-limited to 80W, the GPU clock rate jumps around a lot.

If this can help, CoreFreq is monitoring some hardware event bits. In the UI, press capital H to open the HOT window and check the GFX events. Just before your benchmark, you can clear an event, selecting and pressing Enter on its name.

Technologicat commented 1 year ago

One important thing I forgot to mention: upon loading the module, CoreFreq read the IMC registers 8 times.

Only the first read gives any useful results.

On second and further reads, the value in both registers is 0xffffffff.

If this can help, CoreFreq is monitoring some hardware event bits. In the UI, press capital H to open the HOT window and check the GFX events. Just before your benchmark, you can clear an event, selecting and pressing Enter on its name.

Ah, thank you!

It's very nice that CoreFreq runs in a terminal, so I can SSH in from another machine and run it over the SSH session when a fullscreen benchmark is running :)


Also, for NVIDIA GPU tuning, I forgot to mention that nvidia-smi -q -d SUPPORTED_CLOCKS is much better than a basic nvidia-smi -q, since it explicitly lists all supported VRAM/GPU-core clock rate pairs - so you can get a compatible pair of max clock rates to plug into the --lock-gpu-clocks and --lock-memory-clocks options.

From the SUPPORTED_CLOCKS data, I noticed that this GPU actually has just four available clock rates for the VRAM: 7001 MHz, 6001 MHz, 810 MHz, and 405 MHz. (The last two are obviously for power saving when idle.)

(I don't know why the extra "1". My first thoughts were that either someone at NVIDIA likes 2001: A Space Odyssey, or has set things up well in advance for a reference to the old Over 9000! internet meme, once future VRAM gens hit 9 GHz. But more likely it's due to rounding.)

cyring commented 1 year ago

... CoreFreq read the IMC registers 8 times.

Is this on the same channel, same controller ?

Because the driver loops over 8 possible controllers, 12 channels each, to account for the latest Zen architectures. https://github.com/cyring/CoreFreq/blob/34efe5d3e6c79e5fddf78350da6fe4b7de0a63a8/coretypes.h#L2000

Different controllers and different channels will most likely return the same timings, but register values may differ slightly. So it appears better to probe all of them when feasible. For example, on my Ryzen 3950X, tRDWR is not the same on channel 0 and channel 1. [Screenshot: 2023-03-02-122652_613x378_scrot]

cyring commented 1 year ago

Also, for NVIDIA GPU tuning

So forget my GFX events, those are for the Intel integrated iGPU.

Technologicat commented 1 year ago

Is this on the same channel, same controller ?

Thanks. Good catch. Printing the value of the mc parameter, too:

[54773.963777] corefreqk: Query_ADL_IMC: mc = 0, Register 0x5E00 = 0x39b81118
[54773.963778] corefreqk: Query_ADL_IMC: mc = 0, Register 0x5E04 = 0x39b81118
[54773.963858] corefreqk: Query_ADL_IMC: mc = 1, Register 0x5E00 = 0xffffffff
[54773.963859] corefreqk: Query_ADL_IMC: mc = 1, Register 0x5E04 = 0xffffffff
[54773.963971] corefreqk: Query_ADL_IMC: mc = 2, Register 0x5E00 = 0xffffffff
[54773.963982] corefreqk: Query_ADL_IMC: mc = 2, Register 0x5E04 = 0xffffffff
[54773.964108] corefreqk: Query_ADL_IMC: mc = 3, Register 0x5E00 = 0xffffffff
[54773.964120] corefreqk: Query_ADL_IMC: mc = 3, Register 0x5E04 = 0xffffffff
[54773.964234] corefreqk: Query_ADL_IMC: mc = 4, Register 0x5E00 = 0xffffffff
[54773.964245] corefreqk: Query_ADL_IMC: mc = 4, Register 0x5E04 = 0xffffffff
[54773.964357] corefreqk: Query_ADL_IMC: mc = 5, Register 0x5E00 = 0xffffffff
[54773.964370] corefreqk: Query_ADL_IMC: mc = 5, Register 0x5E04 = 0xffffffff
[54773.964448] corefreqk: Query_ADL_IMC: mc = 6, Register 0x5E00 = 0xffffffff
[54773.964470] corefreqk: Query_ADL_IMC: mc = 6, Register 0x5E04 = 0xffffffff
[54773.964579] corefreqk: Query_ADL_IMC: mc = 7, Register 0x5E00 = 0xffffffff
[54773.964599] corefreqk: Query_ADL_IMC: mc = 7, Register 0x5E04 = 0xffffffff

so once per controller only, as expected.
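
For anyone repeating this, a small Python sketch that filters such kernel log lines and keeps only the reads that are not all ones. It assumes the exact message format printed above, which comes from the ad-hoc debug print and not from stock CoreFreq; the function name is hypothetical.

```python
import re

# Matches e.g. "mc = 0, Register 0x5E00 = 0x39b81118"
LINE_RE = re.compile(
    r"mc = (\d+), Register (0x[0-9a-fA-F]+) = (0x[0-9a-fA-F]+)"
)


def valid_reads(log_text: str) -> dict:
    """Map controller index -> {register offset: value}, skipping all-ones reads."""
    result = {}
    for mc, reg, val in LINE_RE.findall(log_text):
        value = int(val, 16)
        if value == 0xFFFFFFFF:
            continue  # assumption: all ones means no controller at this index
        result.setdefault(int(mc), {})[int(reg, 16)] = value
    return result


sample = """\
[54773.963777] corefreqk: Query_ADL_IMC: mc = 0, Register 0x5E00 = 0x39b81118
[54773.963858] corefreqk: Query_ADL_IMC: mc = 1, Register 0x5E00 = 0xffffffff
"""
print(valid_reads(sample))
```

On the log above, only controller 0 would remain.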

So forget my GFX events, those are for the Intel integrated iGPU.

Ah, right, good point. I have that disabled for now.

cyring commented 1 year ago

Original issue covered

Technologicat commented 1 year ago

Yes, working correctly. Thank you!

P.S. Noticed that corefreq-cli -M already reports GEAR for me (see the printout further above). So indeed, seems no further changes are needed.

Technologicat commented 1 year ago

P.P.S. Final note for other gamers: again, the crashes were not yet gone, just occurred less often.

The real culprit turned out to be the CPU - most likely, a compatibility issue either with the Linux kernel or with some specific software (such as Cyberpunk).

I disabled the e-cores, and haven't had a single crash since then.

The idea to test this possibility came from reading some anecdotal reports on the internet about game instability with e-cores, and about PC crashes in video transcoding with e-cores enabled. Also, I got a semi-reproducible crash by merging Stable Diffusion checkpoints, making this easier to test. When the crash happened, the hw monitors (nvtop, corefreq-cli, htop) showed 0% GPU usage and over 800% CPU usage. The thread allocation policy is one thread per p-core first, then the e-cores, then the hyperthreads of p-cores. So if all cores are enabled, and a single process takes more than 600% CPU, it will have some of its threads running on e-cores.

Note that the BIOS setting for legacy game compatibility mode does nothing in Linux; instead, use the features provided by Linux to turn off individual cores. An easy GUI way is to create a profile in TUXEDO Control Center, and in that profile, set the number of logical cores to 12. Then the system will use the p-cores and their hyperthreads, as can be confirmed by lstopo (from the hwloc package). Then enable this profile by default both on mains and on battery.
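
Without TUXEDO Control Center, the same can be done through the CPU hotplug files in sysfs. A minimal Python sketch follows, assuming the e-cores are logical CPUs 12 to 19 on this i7-12700H. That numbering is an assumption, so check it with lstopo first. The function names are hypothetical, writing to these files needs root, and the sketch defaults to a dry run that only prints what it would do.

```python
from pathlib import Path


def cpu_online_paths(first: int, last: int,
                     root: str = "/sys/devices/system/cpu") -> list:
    """Return the sysfs 'online' control files for logical CPUs first..last."""
    return [Path(root) / f"cpu{n}" / "online" for n in range(first, last + 1)]


def set_offline(paths, dry_run: bool = True) -> None:
    """Write '0' to each control file; with dry_run, only print the plan."""
    for path in paths:
        if dry_run:
            print(f"would write 0 to {path}")
        else:
            path.write_text("0")  # needs root; write "1" to bring the CPU back


# Assumed e-core range; verify with lstopo before disabling anything.
set_offline(cpu_online_paths(12, 19), dry_run=True)
```

The setting does not persist across reboots, so it would have to be reapplied at startup.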