cyring / CoreFreq

CoreFreq : CPU monitoring and tuning software designed for 64-bit processors.
https://www.cyring.fr
GNU General Public License v2.0

AMD Ryzen 9950x freq incorrect #505

Open madoverlord40 opened 4 weeks ago

madoverlord40 commented 4 weeks ago

When single thread boosting occurs on 9950x, it goes all the way to 61 and seems to be off the chart while reporting 5676 in the upper left corner. I assume you have not yet added support for this cpu as it just released. I will keep using CoreFreq anyway hoping you will release an update soon to correct this.

Thanks.

cyring commented 4 weeks ago

@madoverlord40 Thank you for trying CoreFreq with your 9950X. This is indeed the first report I have received, and I will need a lot of traces to debug the current state of support.

Can you post the output of all the CLI options? List them with option -h

At least:

corefreq-cli -s

Like my 3950X


About the frequency issue, I need a screenshot of the CLI. Select Tools in the Menu and choose CPU Select. Select your best Core; you can press z to find it in the CPPC table (if available). Your best Core should show the max boosted frequency; please take a screenshot at that moment.

madoverlord40 commented 3 weeks ago
Processor                                  [AMD Ryzen 9 9950X 16-Core Processor]
|- Architecture                                             [Zen5/Granite Ridge]
|- Vendor ID                                                      [AuthenticAMD]
|- Microcode                                                        [0x0b40401a]
|- Signature                                                           [  BF_44]
|- Stepping                                                            [      0]
|- Online CPU                                                          [ 32/ 32]
|- Base Clock                                                          [ 70.487]
|- Frequency            (MHz)                      Ratio                        
                 Min   6202.86                    <  88 >                       
                 Max   4299.71                    <  61 >                       
|- Factory                                                             [100.000]
                       6100                       [  61 ]                       
|- Performance                                                                  
                 TGT   4299.71                    <  61 >                       
   |- CPPC                                                                      
                 Min   3453.87                    <  49 >                       
                 Max    352.44                    <   5 >                       
                 TGT      AUTO                    <   0 >                       
   |- Boost                                                            [ UNLOCK]
                 XFR   5286.53                    [  75 ]                       
                 CPB   5286.53                    [  75 ]                       
   |- P-State                                                                   
                 P1    6202.86                    <  88 >                       
|- Uncore                                                              [   LOCK]
                 CLK   1057.31                    [  15 ]                       
                 MEM   2114.61                    [  30 ]                       

Instruction Set Extensions                                                      
|- 3DNow!/Ext [N/N]          ADX [Y]          AES [Y]  AVX/AVX2 [Y/Y] 
|- AVX512-F     [Y]    AVX512-DQ [Y]  AVX512-IFMA [Y]   AVX512-PF [N] 
|- AVX512-ER    [N]    AVX512-CD [Y]    AVX512-BW [Y]   AVX512-VL [Y] 
|- AVX512-VBMI  [Y] AVX512-VBMI2 [Y]  AVX512-VNNI [Y]  AVX512-ALG [Y] 
|- AVX512-VPOP  [Y] AVX512-VNNIW [N] AVX512-FMAPS [N] AVX512-VP2I [Y] 
|- AVX512-BF16  [Y] AVX-VNNI-VEX [Y]    AVX-FP128 [N]   AVX-FP256 [N] 
|- BMI1/BMI2  [Y/Y]         CLWB [Y]      CLFLUSH [Y] CLFLUSH-OPT [Y] 
|- CLAC-STAC    [Y]         CMOV [Y]    CMPXCHG8B [Y]  CMPXCHG16B [Y] 
|- F16C         [Y]          FPU [Y]         FXSR [Y]   LAHF-SAHF [Y] 
|- MMX/Ext    [Y/Y] MON/MWAITX [Y/Y]        MOVBE [Y]   PCLMULQDQ [Y] 
|- POPCNT       [Y]       RDRAND [Y]       RDSEED [Y]      RDTSCP [Y] 
|- SEP          [Y]          SHA [Y]          SSE [Y]        SSE2 [Y] 
|- SSE3         [Y]        SSSE3 [Y]  SSE4.1/4A [Y/Y]      SSE4.2 [Y] 
|- SERIALIZE    [N]      SYSCALL [Y]        RDPID [Y]        UMIP [Y] 
|- VAES         [Y]   VPCLMULQDQ [Y]   PREFETCH/W [Y]       LZCNT [Y] 

Features                                                                        
|- 1 GB Pages Support                                      1GB-PAGES   [Capable]
|- 100 MHz multiplier Control                            100MHzSteps   [Missing]
|- Advanced Configuration & Power Interface                     ACPI   [Capable]
|- Advanced Programmable Interrupt Controller                   APIC   [Capable]
|- Advanced Virtual Interrupt Controller                        AVIC   [Missing]
|- APIC Timer Invariance                                        ARAT   [Capable]
|- LOCK prefix to read CR8                                    AltMov   [Capable]
|- Clear Zero Instruction                                     CLZERO   [Capable]
|- Core Multi-Processing                                  CMP Legacy   [Capable]
|- L1 Data Cache Context ID                                  CNXT-ID   [Missing]
|- Collaborative Processor Performance Control                  CPPC   [Capable]
|- Direct Cache Access                                           DCA   [Missing]
|- Debugging Extension                                            DE   [Capable]
|- Debug Store & Precise Event Based Sampling               DS, PEBS   [Missing]
|- CPL Qualified Debug Store                                  DS-CPL   [Missing]
|- 64-Bit Debug Store                                         DTES64   [Missing]
|- Fast Short REP MOVSB                                         FSRM   [Capable]
|- Fast-String Operation                                        ERMS   [Capable]
|- Fused Multiply Add                                           FMA4   [Missing]
|- Fused Multiply Add                                            FMA   [Capable]
|- Hardware Lock Elision                                         HLE   [Missing]
|- Hyper-Threading Technology                                    HTT   [Capable]
|- Hardware P-state control                                      HwP   [Capable]
|- Instruction Based Sampling                                    IBS   [Capable]
|- Instruction INVLPGB                                       INVLPGB   [Missing]
|- Instruction INVPCID                                       INVPCID   [Capable]
|- Long Mode 64 bits                                       IA64 | LM   [Capable]
|- LightWeight Profiling                                         LWP   [Missing]
|- Memory Bandwidth Enforcement                                  MBE   [Capable]
|- Machine-Check Architecture                                    MCA   [Capable]
|- Instruction MCOMMIT                                       MCOMMIT   [Missing]
|- Model Specific Registers                                      MSR   [Capable]
|- Memory Type Range Registers                                  MTRR   [Capable]
|- No-Execute Page Protection                                     NX   [Capable]
|- OS-Enabled Ext. State Management                          OSXSAVE   [Capable]
|- OS Visible Work-around                                       OSVW   [Capable]
|- Physical Address Extension                                    PAE   [Capable]
|- Page Attribute Table                                          PAT   [Capable]
|- Pending Break Enable                                          PBE   [Missing]
|- Process Context Identifiers                                  PCID   [Missing]
|- Perfmon and Debug Capability                                 PDCM   [Missing]
|- Page Global Enable                                            PGE   [Capable]
|- Page Size Extension                                           PSE   [Capable]
|- 36-bit Page Size Extension                                  PSE36   [Capable]
|- Processor Serial Number                                       PSN   [Missing]
|- Resource Director Technology/PQE                            RDT-A   [Capable]
|- Resource Director Technology/PQM                            RDT-M   [Capable]
|- Read Processor Register at User level                       RDPRU   [Capable]
|- Restricted Transactional Memory                               RTM   [Missing]
|- Safer Mode Extensions                                         SMX   [Missing]
|- Self-Snoop                                                     SS   [Missing]
|- Supervisor-Mode Access Prevention                            SMAP   [Capable]
|- Supervisor-Mode Execution Prevention                         SMEP   [Capable]
|- Trailing Bit Manipulation                                     TBM   [Missing]
|- Translation Cache Extension                                   TCE   [Capable]
|- Time Stamp Counter                                            TSC [Invariant]
|- Time Stamp Counter Deadline                          TSC-DEADLINE   [Missing]
|- TSX Force Abort MSR Register                            TSX-ABORT   [Missing]
|- TSX Suspend Load Address Tracking                       TSX-LDTRK   [Missing]
|- User-Mode Instruction Prevention                             UMIP   [Capable]
|- Virtual Mode Extension                                        VME   [Capable]
|- Virtual Machine Extensions                                    VMX   [Missing]
|- Write Back & Do Not Invalidate Cache                     WBNOINVD   [Capable]
|- Extended xAPIC Support                                     x2APIC   [  xAPIC]
|- AVIC controller for x2APIC                                 x2AVIC   [Missing]
|- XSAVE/XSTOR States                                          XSAVE   [Capable]
|- xTPR Update Control                                          xTPR   [Missing]
|- Extended Operation Support                                    XOP   [Missing]
Mitigation mechanisms                                                           
|- Indirect Branch Restricted Speculation                       IBRS   [Capable]
   |- IBRS Always-On preferred by processor                            [ Unable]
   |- IBRS preferred over software solution                            [Capable]
   |- IBRS provides same speculation limits                            [Capable]
|- Indirect Branch Prediction Barrier                           IBPB   [Capable]
|- Single Thread Indirect Branch Predictor                     STIBP   [ Enable]
|- Speculative Store Bypass Disable                             SSBD   [Capable]
   |- SSBD use VIRT_SPEC_CTRL register                                 [ Unable]
   |- SSBD not needed on this processor                                [ Unable]
|- No Branch Type Confusion                                   BTC_NO   [Capable]
|- BTC on Non-Branch instruction                            BTC-NOBR   [Capable]
|- Limited Early Redirect Window                            AGENPICK   [ Unable]
|- Arch - No Fast Predictive Store Forwarding                   PSFD   [Capable]
|- Arch - Enhanced Predictive Store Forwarding                  EPSF   [Capable]
|- Arch - Cross Processor Information Leak                XPROC_LEAK   [ Unable]
Security Features                                                               
|- CET Shadow Stack features                                  CET-SS   [Capable]
|- Secure Init and Jump with Attestation                      SKINIT   [Capable]
|- Secure Encrypted Virtualization                               SEV   [Missing]
|- SEV - Encrypted State                                      SEV-ES   [Missing]
|- SEV - Secure Nested Paging                                SEV-SNP   [Missing]
|- Guest Mode Execute Trap                                      GMET   [Capable]
|- Supervisor Shadow Stack                                       SSS   [Capable]
|- VM Permission Levels                                         VMPL   [Missing]
|- VMPL Supervisor Shadow Stack                             VMPL-SSS   [Missing]
|- Secure Memory Encryption                                      SME   [Capable]
|- Transparent SME                                              TSME   [ Enable]
|- Secure Multi-Key Memory Encryption                         SME-MK   [Missing]
|- DRAM Data Scrambling                                    Scrambler   [ Enable]

Technologies                                                                    
|- Instruction Cache Unit                                                       
   |- L1 IP Prefetcher                                          L1 HW IP   < ON>
|- Data Cache Unit                                                              
   |- L1 Prefetcher                                                L1 HW   < ON>
|- Cache Prefetchers                                                            
   |- L2 Prefetcher                                                L2 HW   < ON>
   |- L1 Stride Prefetcher                                     L1 Stride   < ON>
   |- L1 Region Prefetcher                                     L1 Region   < ON>
   |- L1 Burst Prefetch Mode                                    L1 Burst   < ON>
   |- L2 Stream HW Prefetcher                                  L2 Stream   < ON>
   |- L2 Up/Down Prefetcher                                   L2 Up/Down   < ON>
|- System Management Mode                                       SMM-Lock   [ ON]
|- Simultaneous Multithreading                                       SMT   [ ON]
|- PowerNow!                                                         CnQ   [ ON]
|- Core C-States                                                     CCx   [ ON]
|- Core Performance Boost                                            CPB   < ON>
|- Watchdog Timer                                                    WDT   < ON>
|- Virtualization                                                    SVM   [ ON]
   |- I/O MMU                                                      AMD-V   [ ON]
   |- Version                                                     [         0.1]
   |- Hypervisor                                                           [OFF]
   |- Vendor ID                                                   [         N/A]

Performance Monitoring                                                          
|- Version                                                        PM       [  2]
|- Counters:          General                   Fixed                           
|           {  6,  4, 16 } x 48 bits            3 x 64 bits                     
|- Enhanced Halt State                                           C1E       <OFF>
|- C2 UnDemotion                                                 C2U       < ON>
|- C3 UnDemotion                                                 C3U       < ON>
|- Core C6 State                                                 CC6       < ON>
|- Package C6 State                                              PC6       < ON>
|- Legacy Frequency ID control                                   FID       [OFF]
|- Legacy Voltage ID control                                     VID       [OFF]
|- P-State Hardware Coordination Feedback                MPERF/APERF       [ ON]
|- Core C-States                                                                
   |- C-States Base Address                                      BAR   [ 0x413 ]
|- ACPI Processor C-States                                      _CST   [      3]
|- MONITOR/MWAIT                                                                
   |- State index:    #0    #1    #2    #3    #4    #5    #6    #7              
   |- Sub C-State:     1     2     0     0     0     0     0     0              
   |- Monitor-Mwait Extensions                                   EMX   [Capable]
   |- Interrupt Break-Event                                      IBE   [Capable]
|- Core Cycles                                                         [Capable]
|- Instructions Retired                                                [Capable]
|- Reference Cycles                                                    [Capable]
|- Last Level Cache References                                         [Capable]
|- Global Time Stamp Counter                                           [Missing]
|- Data Fabric Performance Counter                                     [Capable]
|- Core Performance Counter                                            [Capable]
|- Processor Performance Control                                _PCT   [ Enable]
|- Performance Supported States                                 _PSS   [      2]
|- Performance Present Capabilities                             _PPC   [      0]
|- Continuous Performance Control                               _CPC   [Missing]

Power, Current & Thermal                                                        
|- Temperature Offset:Junction                                 TjMax [ 49: 95 C]
|- CPPC Energy Preference                                        EPP   <      0>
|- Digital Thermal Sensor                                        DTS   [Capable]
|- Power Limit Notification                                      PLN   [Missing]
|- Package Thermal Management                                    PTM   [Missing]
|- Thermal Monitor 1                                             TTP   [ Enable]
|- Thermal Monitor 2                                             HTC   [ Enable]
|- Thermal Design Power                                          TDP   [Missing]
   |- Minimum Power                                              Min   [Missing]
   |- Maximum Power                                              Max   [Missing]
|- Thermal Design Power                                      Package   [Disable]
   |- Power Limit                                                PL1   [    0 W]
   |- Time Window                                                TW1   [   0 ns]
   |- Power Limit                                                PL2   [    0 W]
   |- Time Window                                                TW2   [   0 ns]
|- Thermal Design Power                                         Core   [Disable]
   |- Power Limit                                                PL1   [    0 W]
   |- Time Window                                                TW1   [   0 ns]
|- Thermal Design Power                                       Uncore   [Disable]
   |- Power Limit                                                PL1   [    0 W]
   |- Time Window                                                TW1   [   0 ns]
|- Thermal Design Power                                         DRAM   [Disable]
   |- Power Limit                                                PL1   [    0 W]
   |- Time Window                                                TW1   [   0 ns]
|- Thermal Design Power                                     Platform   [Disable]
   |- Power Limit                                                PL1   [    0 W]
   |- Time Window                                                TW1   [   0 ns]
   |- Power Limit                                                PL2   [    0 W]
   |- Time Window                                                TW2   [   0 ns]
|- Package Power Tracking                                        PPT   [Missing]
|- Electrical Design Current                                     EDC   [Missing]
|- Thermal Design Current                                        TDC   [Missing]
|- Core Thermal Point                                                           
|- Package Thermal Point                                                        
   |- Thermal Monitor Trip                                     Limit   [  115 C]
   |- HTC Temperature Limit                                    Limit   [  127 C]
   |- HTC Temperature Hysteresis                           Threshold   [    2 C]
|- Units                                                                        
   |- Power                                               watt   [      Missing]
   |- Energy                                             joule   [  0.000015259]
   |- Window                                            second   [  0.000976562]
madoverlord40 commented 3 weeks ago

I also noticed that if I leave CoreFreq running idle and walk away for a while, the system is locked up when I come back and I have to power off/on. It boots right back into Linux Mint, and I tested this: I left the PC on for 1 hour without the program running; then, with the program running, less than 30 minutes later the mouse won't move. I see [Missing] in a lot of places above, so I am wondering if that is enough info for you. I will try to get you a screenshot of the best core boosting when I am able to catch it doing that.

cyring commented 3 weeks ago

The baseclock BCLK can be fixed to 100 MHz with a driver parameter

insmod build/corefreqk.ko AutoClock=0

Or (if installed)

modprobe corefreqk AutoClock=0

Apparently the base clock is wrongly updated to ~70 MHz. You can also try AutoClock=1 for a single estimation and see if it makes a difference to the frequency issue.
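The effect of a mis-estimated base clock can be checked with simple arithmetic. The sketch below assumes the displayed frequency is the ratio multiplied by the estimated BCLK, which agrees with the values in the `-s` output posted above; the function and constant names are only illustrative and are not part of CoreFreq.

```python
def freq_mhz(ratio: float, bclk_mhz: float) -> float:
    """Frequency in MHz, assuming frequency = ratio x base clock."""
    return ratio * bclk_mhz


ESTIMATED_BCLK = 70.487  # "Base Clock" value from the posted output
EXPECTED_BCLK = 100.0    # value the driver uses with AutoClock=0

# Ratios 61 and 88 are the ones shown in the posted output.
for ratio in (61, 88):
    wrong = freq_mhz(ratio, ESTIMATED_BCLK)
    fixed = freq_mhz(ratio, EXPECTED_BCLK)
    print(f"ratio {ratio}: {wrong:.2f} MHz at ~70 MHz, {fixed:.0f} MHz at 100 MHz")
```

With the ~70 MHz estimate, ratio 61 gives the 4299.71 MHz seen in the output, whereas a 100 MHz base clock gives the 6100 MHz listed under Factory.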


Motherboards commonly mitigate electromagnetic interference (EMI) with SPREAD SPECTRUM. I disable this feature in the BIOS for both the CPU and the chipset. As a result, the BCLK estimate is closer to 100 MHz.


(tbc)

cyring commented 3 weeks ago

The frozen system issue could be due to an access conflict on the SMU between CoreFreq and another Linux driver such as k10temp or amd_pstate. You have to unload them or blacklist them during Linux startup. It will help me if you can list all drivers with the command lsmod
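As a rough aid for reading an `lsmod` listing, the sketch below picks out the two drivers named above. The helper name and the sample listing are hypothetical, and drivers built into the kernel do not appear in `lsmod` at all, so an empty result does not prove that a driver is absent.

```python
CONFLICTING = ("k10temp", "amd_pstate")  # drivers named in the comment above


def loaded_conflicts(lsmod_text: str, names=CONFLICTING) -> list[str]:
    """Return the entries of `names` that appear as loaded modules."""
    loaded = set()
    for line in lsmod_text.splitlines()[1:]:  # skip the "Module Size Used by" header
        fields = line.split()
        if fields:
            loaded.add(fields[0])  # first column is the module name
    return [name for name in names if name in loaded]


# Hypothetical lsmod excerpt, for illustration only.
sample = """Module                  Size  Used by
k10temp                16384  0
amdgpu              17000000  30
"""
print(loaded_conflicts(sample))
```

On a real system the text would come from running `lsmod` or reading `/proc/modules`, whose first line is not a header and would need slightly different handling.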

Also have a look at the Wiki page on using CoreFreq as the Clock Source, CPU Freq and CPU Idle driver


(tbc)

madoverlord40 commented 3 weeks ago

I'm on a new install of Mint 22. All I have installed for additional drivers is LACT; in case you are not familiar with it, it is a kernel-driver-level tool for my Radeon 7900 XTX. For some reason, in Linux the GPU fans won't kick on, so I use LACT to force them on. I can still get you an lsmod report if you wish, but I don't think it will help you. Maybe LACT is conflicting with your software? I'm going to try out "insmod build/corefreqk.ko AutoClock=0" to see what the frequency does after that, and I'll report back.

madoverlord40 commented 3 weeks ago

Ok i just downloaded and ran passmarks performance test 11 for linux and ran a CPU benchmark. Here is a screen shot of what your software is reporting. Screenshot from 2024-08-18 00-13-36

Considering my CPU score was 70k, I don't think those values are correct. Here are my scores if it helps:

CPU Mark: 70841
Integer Math: 261614 Million Operations/s
Floating Point Math: 158501 Million Operations/s
Prime Numbers: 342 Million Primes/s
Sorting: 100243 Thousand Strings/s
Encryption: 51867 MB/s
Compression: 963377 KB/s
CPU Single Threaded: 4750 Million Operations/s
Physics: 4230 Frames/s
Extended Instructions (SSE): 74945 Million Matrices/s

I do have PBO max and curve optimizer -20 in the BIOS. I also used AutoClock=0 on the insmod.

madoverlord40 commented 3 weeks ago

Ok here i used AutoClock=1 during a CPU benchmark: Screenshot from 2024-08-18 00-34-59

madoverlord40 commented 3 weeks ago

Ok i just realized that passmarks performance test has it wrong too. Check this out:

PassMark PerformanceTest Linux (11.0.1002)
AMD Ryzen 9 9950X 16-Core Processor (x86_64)
16 cores @ 8007 MHz | 62.4 GiB RAM
Number of Processes: 32 | Test Iterations: 1 | Test Duration: Medium

Maybe the kernel is misreporting max speed?

madoverlord40 commented 3 weeks ago

Ok, I just installed the latest kernel from mainline kernels, 6.10.5, and now PassMark shows this:

AMD Ryzen 9 9950X 16-Core Processor (x86_64)
16 cores @ 5752 MHz | 62.4 GiB RAM
Number of Processes: 32 | Test Iterations: 1 | Test Duration: Medium

and your software shows this: Screenshot from 2024-08-18 00-48-53

madoverlord40 commented 3 weeks ago

I booted into windows 11 and ryzen master is showing this during passmark 11: image

cyring commented 3 weeks ago

We need to know what the reference base clock (BCLK) is. It should be displayed in one of your UEFI/BIOS screens.

A 70 MHz base clock rather than the usual 100 MHz is not impossible, but it would be a first in a long time.

I would also like to see the TSC, UCC and URC counters. Press c for that view or run corefreq-cli -c


Btw, the computed voltage Vcore is wrong. As shown by Ryzen Master, it has to peak at 1.3 V

Can you replace this line of source code ?

https://github.com/cyring/CoreFreq/blob/8ca842b164f196d6205a401d469c3274315227e0/x86_64/coretypes.h#L611

with this line:

#define VOLTAGE_FORMULA_AMD_1Ah VOLTAGE_FORMULA_AMD_ZEN4

Save file then rebuild CoreFreq

make -j clean
make -j

Make sure to fully unload CoreFreq, especially its driver (rmmod corefreqk), before restarting it for the voltage test in a single Core fully stressed which has to trigger the PBO max Vcore-frequency.

cyring commented 3 weeks ago

Btw for all your next screenshots please switch to the custom view, press y key. Or start with view selector option:

corefreq-cli -t custom
cyring commented 3 weeks ago

Also, have you noticed that some CPU temperatures reached 96°C, exceeding the 95°C TjMax?

cyring commented 3 weeks ago

Thanks to the @InstLatx64 MSR dump, I see that the TSC seems decorrelated from the base clock:

------[ MSR Registers / Logical CPU #0 ]------

CPU Clock (Normal): 5760 MHz
CPU Clock (TSC): 4321 MHz
CPU Multiplier: 57.3x
4321 (MHz) / 57.3 = 75.4

Somehow, as with ARM processors, the nominal frequency is not the same as the TSC frequency. The Windows tools above may take the frequency from other sources; I bet on the AMD PP tables.
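The division quoted from the MSR dump can be reproduced as follows. This is only a sketch of the arithmetic; the function name is illustrative, and treating "clock divided by multiplier" as an implied base clock is the assumption made in the comment above, not a documented property of the processor.

```python
def implied_bclk(clock_mhz: float, multiplier: float) -> float:
    """Base clock implied by a clock reading and a multiplier (clock / multiplier)."""
    return clock_mhz / multiplier


# Values copied from the MSR dump above.
TSC_MHZ = 4321.0
NORMAL_MHZ = 5760.0
MULTIPLIER = 57.3

print(round(implied_bclk(TSC_MHZ, MULTIPLIER), 1))     # base clock implied by the TSC
print(round(implied_bclk(NORMAL_MHZ, MULTIPLIER), 1))  # base clock implied by the "Normal" clock
```

The TSC reading implies roughly 75.4 MHz, while the "Normal" clock implies roughly 100.5 MHz, which is consistent with the TSC no longer tracking a 100 MHz base clock multiplied by the core ratio.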

madoverlord40 commented 3 weeks ago

The base clock in my BIOS is 100, you are correct. I didn't change that; however, I am running EXPO at 6000, but I assume that has nothing to do with the base clock. As for the temperature, yes, I saw that it hit 96 and that should be impossible. I wonder, however: I'm running PBO unlimited, so might that have pushed it past 95? I read somewhere that the sensors in the CPU only report up to 95, but the CPU could actually be above that if pegged at max at the time, and that sometimes other sensors can report higher. Just a thought, what do you think?

Ok ill go change the source code like you asked and ill report back. Thanks for your support!

cyring commented 3 weeks ago

> other sensors can report higher. Just a thought, what do you think?

True, some are die sensors, others are for the whole package. It's not easy to tell which sensor a reading comes from without the register specifications.

> Ok ill go change the source code like you asked and ill report back.

Great. Can't wait to see the results. Manufacturers always change the rules for their own reasons, I believe...


Please also let me know about the accuracy of the measured power in watts. Just press the w key and take two screenshots: one when the system is idle and one when it is fully stressed.

madoverlord40 commented 3 weeks ago

Ok so i removed the module, and recompiled with the changes below:

//#define VOLTAGE_FORMULA_AMD_1Ah VOLTAGE_FORMULA_AMD_19_61h
#define VOLTAGE_FORMULA_AMD_1Ah VOLTAGE_FORMULA_AMD_ZEN4

In header file coretypes.h

I started your software with corefreq-cli -t custom with the below screen shot during passmark single thread test:

Screenshot from 2024-08-18 10-40-26

Although I can't really tell whether your software is reporting incorrectly or whether it's just not hitting PBO max, being at 5.4 in single thread; I think it's supposed to be 5.7 in single thread.

cyring commented 3 weeks ago

> Ok so i removed the module, and recompiled with the changes

Thank you. To me, Vcore now appears in the range from a minimum of 0.8 V to a maximum of 1.3 V, and the current voltage looks to be per cluster (CCX or CCD). This could explain why the voltage of a single stressed CPU is repeated across its cluster.

The power measured per physical Core looks coherent. I'm astonished by the total of 217 W. The 9950X is said to be an efficient processor.

> Although i cant really tell if your software is not reporting correctly or if its just not hitting PBO max being at 5.4 single thread, i think its supposed to be 5.7 in single thread.

That's why I asked you to use the integrated CPU stress Tools. They have been specially designed to trigger the boosted frequencies.


I can see that the UI is glitching because of abnormally huge values among the absolute frequencies. So far I can't tell whether these regressions are part of this new architecture or whether the queried registers have unpublished changes.


What about the memory controller ?

Regards CyrIng

madoverlord40 commented 3 weeks ago

./build/corefreq-cli -m topology.txt

madoverlord40 commented 3 weeks ago

./build/corefreq-cli -k -n -B -n -M memory.txt

cyring commented 3 weeks ago

> ./build/corefreq-cli -m topology.txt

Thank you. Looks good to me

madoverlord40 commented 3 weeks ago

> > ./build/corefreq-cli -m topology.txt
>
> Thank you. Looks good to me

Ok cool, so you have what you need to update your software for 9950x?

cyring commented 3 weeks ago

> ./build/corefreq-cli -k -n -B -n -M memory.txt

Thanks.


Btw, in case you are not familiar with Markdown for posting output data:

  1. Type 3 backticks followed by a carriage return
  2. Type or copy/paste the output data
  3. Type 3 backticks again
cyring commented 3 weeks ago

> Ok cool, so you have what you need to update your software for 9950x?

Yes I have minor fixes to provide in next version

madoverlord40 commented 3 weeks ago

> > ./build/corefreq-cli -k -n -B -n -M memory.txt
>
> Thanks.
>
> * Memory size is wrongly computed.
>   I believe the rank count of `F5-6400J3239G32G`
>
> * `4800 MHz` rather than `6400` : does it have to be fixed ?
>
> * Can you tell if the timings are matching your BIOS values ?
>
> Btw, in case you are not familiar with Markdown to post the output data;
>
> 1. Type 3 anti-quotes followed by carriage return
>
> 2. Type or copy/past the output data
>
> 3. Type again 3 anti-quotes

The RAM is 6400 EXPO, which defaults to 4800 without EXPO on, but I'm running it at 6000 in the BIOS, the Ryzen sweet spot; otherwise the fabric clock ends up no longer 1:1:1 with the memory clock and UCLK. So it should be reading 3000 a channel.
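The "3000" figure follows from how DDR memory is rated: it transfers data twice per clock cycle, so the memory clock is half the advertised data rate. A minimal sketch of that conversion, with an illustrative function name:

```python
def ddr_clock_mhz(data_rate_mts: float) -> float:
    """Memory clock in MHz for a DDR data rate in MT/s (two transfers per clock)."""
    return data_rate_mts / 2


# Data rates mentioned in this thread: default, configured, and rated EXPO speed.
for rate in (4800, 6000, 6400):
    print(f"DDR5-{rate}: {ddr_clock_mhz(rate):.0f} MHz memory clock")
```

For the configured DDR5-6000 this gives the 3000 MHz memory clock expected here.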

cyring commented 3 weeks ago

> ... So it should be reading 3000 a channel.

The main issue is that the estimated BCLK of ~ 70 MHz is also messing with the DDR5 computed speed.

Can you please post the following:

corefreq-cli -c 3
madoverlord40 commented 3 weeks ago

./build/corefreq-cli -c 3

CPU Freq(MHz) Ratio  Turbo  C0(%)  C1(%)  C3(%)  C6(%)  C7(%)  Min TMP:TS  Max
000   39.63 ( 0.56)   0.92   0.95  99.05   0.00   0.00   0.00  32 / 36:680/ 95
001  166.16 ( 2.36)   3.86   3.40  96.60   0.00   0.00   0.00  32 / 36:680/ 95
002   67.90 ( 0.96)   1.58   1.50  98.50   0.00   0.00   0.00  32 / 36:680/ 95
003   14.74 ( 0.21)   0.34   0.47  99.53   0.00   0.00   0.00  32 / 36:680/ 95
004   45.99 ( 0.65)   1.07   1.08  98.92   0.00   0.00   0.00  32 / 36:687/ 95
005   19.18 ( 0.27)   0.45   0.51  99.49   0.00   0.00   0.00  32 / 36:680/ 95
006   10.83 ( 0.15)   0.25   0.32  99.68   0.00   0.00   0.00  32 / 36:680/ 95
007   52.32 ( 0.74)   1.22   0.93  99.07   0.00   0.00   0.00  32 / 36:680/ 95
008    0.45 ( 0.01)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 36:684/ 94
009   67.07 ( 0.95)   1.56   1.26  98.74   0.00   0.00   0.00  32 / 36:684/ 94
010  117.92 ( 1.67)   2.74   2.96  97.04   0.00   0.00   0.00  32 / 36:684/ 94
011   45.76 ( 0.65)   1.06   0.90  99.10   0.00   0.00   0.00  32 / 36:684/ 94
012   27.62 ( 0.39)   0.64   0.74  99.26   0.00   0.00   0.00  32 / 37:688/ 94
013    2.57 ( 0.04)   0.06   0.07  99.93   0.00   0.00   0.00  32 / 36:684/ 94
014    0.43 ( 0.01)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 37:688/ 94
015    0.55 ( 0.01)   0.01   0.02  99.98   0.00   0.00   0.00  32 / 36:684/ 94
016   18.10 ( 0.26)   0.42   0.48  99.52   0.00   0.00   0.00  32 / 36:680/ 95
017   54.33 ( 0.77)   1.26   1.19  98.81   0.00   0.00   0.00  32 / 36:683/ 95
018   19.90 ( 0.28)   0.46   0.44  99.56   0.00   0.00   0.00  32 / 36:687/ 95
019    0.50 ( 0.01)   0.01   0.02  99.98   0.00   0.00   0.00  32 / 36:687/ 95
020   54.28 ( 0.77)   1.26   1.17  98.83   0.00   0.00   0.00  32 / 36:680/ 95
021    2.23 ( 0.03)   0.05   0.08  99.92   0.00   0.00   0.00  32 / 36:683/ 95
022    1.18 ( 0.02)   0.03   0.04  99.96   0.00   0.00   0.00  32 / 36:680/ 95
023    1.46 ( 0.02)   0.03   0.03  99.97   0.00   0.00   0.00  32 / 36:683/ 95
024   33.25 ( 0.47)   0.77   0.65  99.35   0.00   0.00   0.00  32 / 36:684/ 94
025   10.87 ( 0.15)   0.25   0.25  99.75   0.00   0.00   0.00  32 / 36:684/ 94
026    8.32 ( 0.12)   0.19   0.16  99.84   0.00   0.00   0.00  32 / 36:684/ 94
027   19.10 ( 0.27)   0.44   0.43  99.57   0.00   0.00   0.00  32 / 36:684/ 94
028    0.27 ( 0.00)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 36:684/ 94
029   42.56 ( 0.60)   0.99   1.23  98.77   0.00   0.00   0.00  32 / 36:684/ 94
030    0.27 ( 0.00)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 36:684/ 94
031    5.75 ( 0.08)   0.13   0.18  99.82   0.00   0.00   0.00  32 / 36:684/ 94

    Averages:        Turbo  C0(%)  C1(%)  C3(%)  C6(%)  C7(%)    TjMax:    Pkg:
                      0.81   0.75  99.25   0.00   0.00   0.00      95 C    37 C

CPU Freq(MHz) Ratio  Turbo  C0(%)  C1(%)  C3(%)  C6(%)  C7(%)  Min TMP:TS  Max
000   22.71 ( 0.32)   0.53   0.58  99.42   0.00   0.00   0.00  32 / 36:687/ 95
001  132.91 ( 1.89)   3.09   2.81  97.19   0.00   0.00   0.00  32 / 36:687/ 95
002   60.41 ( 0.86)   1.40   1.41  98.59   0.00   0.00   0.00  32 / 36:687/ 95
003    2.78 ( 0.04)   0.06   0.09  99.91   0.00   0.00   0.00  32 / 36:687/ 95
004   45.99 ( 0.65)   1.07   1.08  98.92   0.00   0.00   0.00  32 / 36:687/ 95
005    2.27 ( 0.03)   0.05   0.07  99.93   0.00   0.00   0.00  32 / 36:687/ 95
006   41.42 ( 0.59)   0.96   0.74  99.26   0.00   0.00   0.00  32 / 36:687/ 95
007    4.87 ( 0.07)   0.11   0.16  99.84   0.00   0.00   0.00  32 / 36:687/ 95
008    0.38 ( 0.01)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 36:684/ 94
009   40.41 ( 0.57)   0.94   0.84  99.16   0.00   0.00   0.00  32 / 36:684/ 94
010  117.92 ( 1.67)   2.74   2.96  97.04   0.00   0.00   0.00  32 / 36:684/ 94
011   87.98 ( 1.25)   2.05   2.35  97.65   0.00   0.00   0.00  32 / 36:684/ 94
012   27.62 ( 0.39)   0.64   0.74  99.26   0.00   0.00   0.00  32 / 37:688/ 94
013    6.28 ( 0.09)   0.15   0.14  99.86   0.00   0.00   0.00  32 / 37:688/ 94
014    0.43 ( 0.01)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 37:688/ 94
015    0.34 ( 0.00)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 37:688/ 94
016   16.46 ( 0.23)   0.38   0.42  99.58   0.00   0.00   0.00  32 / 36:687/ 95
017   53.80 ( 0.76)   1.25   1.14  98.86   0.00   0.00   0.00  32 / 36:687/ 95
018   19.90 ( 0.28)   0.46   0.44  99.56   0.00   0.00   0.00  32 / 36:687/ 95
019    0.50 ( 0.01)   0.01   0.02  99.98   0.00   0.00   0.00  32 / 36:687/ 95
020   46.55 ( 0.66)   1.08   1.10  98.90   0.00   0.00   0.00  32 / 36:687/ 95
021    9.27 ( 0.13)   0.22   0.31  99.69   0.00   0.00   0.00  32 / 36:687/ 95
022    0.60 ( 0.01)   0.01   0.02  99.98   0.00   0.00   0.00  32 / 36:687/ 95
023    0.55 ( 0.01)   0.01   0.02  99.98   0.00   0.00   0.00  32 / 36:687/ 95
024   33.35 ( 0.47)   0.78   0.66  99.34   0.00   0.00   0.00  32 / 36:684/ 94
025   20.64 ( 0.29)   0.48   0.46  99.54   0.00   0.00   0.00  32 / 37:688/ 94
026    3.78 ( 0.05)   0.09   0.11  99.89   0.00   0.00   0.00  32 / 37:688/ 94
027   19.10 ( 0.27)   0.44   0.43  99.57   0.00   0.00   0.00  32 / 36:684/ 94
028    0.31 ( 0.00)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 36:684/ 94
029   25.53 ( 0.36)   0.59   0.67  99.33   0.00   0.00   0.00  32 / 37:688/ 94
030    0.29 ( 0.00)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 37:688/ 94
031    0.29 ( 0.00)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 37:688/ 94

    Averages:        Turbo  C0(%)  C1(%)  C3(%)  C6(%)  C7(%)    TjMax:    Pkg:
                      0.61   0.62  99.38   0.00   0.00   0.00      95 C    36 C

CPU Freq(MHz) Ratio  Turbo  C0(%)  C1(%)  C3(%)  C6(%)  C7(%)  Min TMP:TS  Max
000   19.59 ( 0.28)   0.46   0.49  99.51   0.00   0.00   0.00  32 / 36:683/ 95
001  115.28 ( 1.64)   2.68   2.60  97.40   0.00   0.00   0.00  32 / 36:683/ 95
002   50.17 ( 0.71)   1.17   1.28  98.72   0.00   0.00   0.00  32 / 36:683/ 95
003    2.83 ( 0.04)   0.07   0.09  99.91   0.00   0.00   0.00  32 / 36:683/ 95
004   48.27 ( 0.68)   1.12   1.27  98.73   0.00   0.00   0.00  32 / 36:683/ 95
005   25.91 ( 0.37)   0.60   0.72  99.28   0.00   0.00   0.00  32 / 36:683/ 95
006   59.88 ( 0.85)   1.39   1.05  98.95   0.00   0.00   0.00  32 / 36:683/ 95
007    0.81 ( 0.01)   0.02   0.03  99.97   0.00   0.00   0.00  32 / 36:683/ 95
008    0.26 ( 0.00)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 37:688/ 94
009   31.21 ( 0.44)   0.73   0.64  99.36   0.00   0.00   0.00  32 / 37:688/ 94
010   20.91 ( 0.30)   0.49   0.42  99.58   0.00   0.00   0.00  32 / 37:688/ 94
011   26.93 ( 0.38)   0.63   0.82  99.18   0.00   0.00   0.00  32 / 37:688/ 94
012    3.65 ( 0.05)   0.08   0.11  99.89   0.00   0.00   0.00  32 / 37:688/ 94
013    6.55 ( 0.09)   0.15   0.16  99.84   0.00   0.00   0.00  32 / 37:688/ 94
014    4.36 ( 0.06)   0.10   0.14  99.86   0.00   0.00   0.00  32 / 37:688/ 94
015    0.30 ( 0.00)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 37:688/ 94
016   11.19 ( 0.16)   0.26   0.31  99.69   0.00   0.00   0.00  32 / 36:683/ 95
017   53.36 ( 0.76)   1.24   1.37  98.63   0.00   0.00   0.00  32 / 36:683/ 95
018   12.66 ( 0.18)   0.29   0.34  99.66   0.00   0.00   0.00  32 / 36:683/ 95
019    0.26 ( 0.00)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 36:683/ 95
020   42.75 ( 0.61)   0.99   1.04  98.96   0.00   0.00   0.00  32 / 36:683/ 95
021    2.86 ( 0.04)   0.07   0.09  99.91   0.00   0.00   0.00  32 / 36:683/ 95
022    0.88 ( 0.01)   0.02   0.02  99.98   0.00   0.00   0.00  32 / 36:687/ 95
023    0.57 ( 0.01)   0.01   0.02  99.98   0.00   0.00   0.00  32 / 36:687/ 95
024   31.07 ( 0.44)   0.72   0.62  99.38   0.00   0.00   0.00  32 / 37:688/ 94
025    5.65 ( 0.08)   0.13   0.13  99.87   0.00   0.00   0.00  32 / 37:688/ 94
026   11.89 ( 0.17)   0.28   0.30  99.70   0.00   0.00   0.00  32 / 37:688/ 94
027   10.26 ( 0.15)   0.24   0.33  99.67   0.00   0.00   0.00  32 / 37:688/ 94
028    0.25 ( 0.00)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 37:688/ 94
029    0.25 ( 0.00)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 37:688/ 94
030    0.23 ( 0.00)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 37:688/ 94
031    0.25 ( 0.00)   0.01   0.01  99.99   0.00   0.00   0.00  32 / 37:688/ 94

    Averages:        Turbo  C0(%)  C1(%)  C3(%)  C6(%)  C7(%)    TjMax:    Pkg:
                      0.44   0.45  99.55   0.00   0.00   0.00      95 C    37 C

madoverlord40 commented 3 weeks ago

Ok so using your turbo feature on core 5, ryzen master shows it to be a fast core with a white dot? Screenshot from 2024-08-18 18-26-36

madoverlord40 commented 3 weeks ago

As for matching RAM timings, I checked in the BIOS and booted into Windows 11; this is what Ryzen Master shows, which matches the BIOS: image

madoverlord40 commented 3 weeks ago

Also, while I was in Windows 11, I ran Cinebench 23 to see what Ryzen Master would show for an all-core workload. You mentioned 217 W earlier, but look at what Ryzen Master says the wattage is at full load. image

Score if you are interested: image

madoverlord40 commented 3 weeks ago

Cinebench 23 running single core, here is Ryzen Master: image

score if you are interested: image

madoverlord40 commented 3 weeks ago

Don't forget I have a curve optimizer of -25. I am not sure how that affects the core voltage, but it should lower it quite a bit?

cyring commented 3 weeks ago

Ok so using your turbo feature on core 5, ryzen master shows it to be a fast core with a white dot?

Great! This time we are reaching the PBO frequency. Even with a quarter frequency above as XFR used to be.

I don't know how RM derives its Core score, but CoreFreq decodes the CPPC table and translates a register into a per-CPU maximum frequency capability, which appears to tell the same story as RM's dots or stars. Press z to open the CPPC table window.

cyring commented 3 weeks ago

Also, while I was in Windows 11, I ran Cinebench 23 to see what Ryzen Master would show for an all-core workload. You mentioned 217 W earlier, but look at what Ryzen Master says the wattage is at full load.

RM displays PPT as a percentage of 1000 W

Really!!!

Give Tools > Conics > "Hyperboloid of two sheets" a try and see if the 9950X is consuming more power than that.

cyring commented 3 weeks ago

score if you are interested: image

This is such an IPC improvement from AMD. Here is my 3950X score from CineBench v20 image

cyring commented 3 weeks ago

Factor from Ryzen 1700X

3950X 9950X
521/378 2287/959
x 1.38 x 2.38
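
The factors can be re-derived from the score pairs quoted above. This is only a small arithmetic check, assuming each pair is (CPU score, Ryzen 1700X score) taken from the same benchmark version; the dictionary layout is purely illustrative.

```python
# (CPU score, Ryzen 1700X reference score), as quoted in the comment above
scores = {
    "3950X": (521, 378),
    "9950X": (2287, 959),
}

# Speed-up factor of each CPU over the 1700X, rounded to two decimals
factors = {cpu: round(score / ref, 2) for cpu, (score, ref) in scores.items()}
print(factors)
```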

cyring commented 3 weeks ago
  • F5-6400J3239G32G

Indeed I'm supposed to decode this DIMM geometry : Dual Rank

cyring commented 3 weeks ago

As for matching RAM timings, I checked in the BIOS and booted into Windows 11; this is what Ryzen Master shows, which matches the BIOS: image

CoreFreq is matching on Pri, Sec, and Ter Timings

RFC1 and RFC2 differ because I only know how to decode the default values

cyring commented 3 weeks ago

@madoverlord40

You can now pull commit 0e37a134988bb00db8a2cb1934eefb5b0737ef67 from develop branch

Can you please take this Core cycles view screenshot ?

2024-08-19-095050_644x940_scrot

madoverlord40 commented 3 weeks ago

Also, while I was in Windows 11, I ran Cinebench 23 to see what Ryzen Master would show for an all-core workload. You mentioned 217 W earlier, but look at what Ryzen Master says the wattage is at full load.

RM displays PPT as a percentage of 1000 W

Really!!!

That's what my mainboard does when I put it on PBO max: it sets the power limits to 1k.

madoverlord40 commented 3 weeks ago

Factor from Ryzen 1700X

3950X 9950X
521/378 2287/959
x 1.38 x 2.38

I also have an Intel 14900KF setup, and the 9950X is still not quite there: the Intel's single-core result, when pushed to 320 W, is still higher than the 9950X at PBO max and 230 W. All-core, however, the 9950X is slightly ahead of the 14900KS. Where the 9950X truly shines, I think, is the watt-to-frequency comparison: it does basically the same job as the Intel 14900K at 100 W less. However, at stock settings the 9950X wins; Intel has to be pushed to 320+ W and bang off 100 °C in order to outperform it. Especially after that Intel microcode update, where they basically nerfed all P-cores to 5.4 GHz during all-core workloads. But it seems they have to do that now in order to keep their chips from degrading.

madoverlord40 commented 3 weeks ago

@madoverlord40

You can now pull commit 0e37a13 from develop branch

Can you please take this Core cycles view screenshot ?

2024-08-19-095050_644x940_scrot

OK, I'll pull your latest from the development branch, compile it, and then show you what the cycles are.

madoverlord40 commented 3 weeks ago

OK, here is the core cycles view, after compiling the latest from develop. Screenshot from 2024-08-19 07-56-57

cyring commented 3 weeks ago

OK, here is the core cycles view, after compiling the latest from develop. Screenshot from 2024-08-19 07-56-57

I was not expecting the TSC to be equal to the CPU Base Clock of 4.3 GHz. I'm now back to square one and I don't see what's going on with the BCLK estimation.

madoverlord40 commented 3 weeks ago

OK, here is the core cycles view, after compiling the latest from develop. Screenshot from 2024-08-19 07-56-57

I was not expecting the TSC to be equal to the CPU Base Clock of 4.3 GHz. I'm now back to square one and I don't see what's going on with the BCLK estimation.

At this point I think it's a kernel-level issue. The kernel is probably not reporting the CPU correctly. Ryzen Master in Windows 11 shows BCLK at 100. When I pass AutoClock=0 to your software, it correctly shows a base clock of 100 but then thinks my max boost is 8 GHz. So maybe you should adjust how the multipliers work, using a base clock of 100 when a 9950X is detected. Maybe a near-future kernel release will report the CPU differently and make it easier for you.

cyring commented 3 weeks ago

At this point I think it's a kernel-level issue. The kernel is probably not reporting the CPU correctly.

The estimate is computed over a one-second interval between two TSC reads. CoreFreq relies on the kernel only for the delay; the TSC is read directly from the CPU register.

A kernel issue would have to be in its own clock source, but you would have noticed if one second were not accurate.
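
As a rough illustration of that estimation (a sketch only, not CoreFreq's code), assume the BCLK is obtained by dividing the measured TSC frequency by the ratio the TSC is believed to tick at. The function name and the counter values are hypothetical; the 4.3 GHz TSC and the ratio of 61 come from this thread, and 43 is simply what a 4.3 GHz base clock implies at 100 MHz.

```python
def estimate_bclk_mhz(tsc_start: int, tsc_end: int, interval_s: float, base_ratio: int) -> float:
    """Estimate BCLK as (TSC ticks per second) / (assumed base ratio), in MHz."""
    tsc_hz = (tsc_end - tsc_start) / interval_s
    return tsc_hz / base_ratio / 1e6


# Illustrative counter values: a TSC ticking at 4.3 GHz, sampled one second apart
t0, t1 = 1_000_000_000, 5_300_000_000

print(round(estimate_bclk_mhz(t0, t1, 1.0, 43), 2))  # 100.0
print(round(estimate_bclk_mhz(t0, t1, 1.0, 61), 2))  # 70.49
```

Under this model the two TSC reads can be perfectly accurate and the estimate still lands near 70 MHz if the divisor is wrong, which would point at the ratio decoding rather than at the kernel's delay.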

Can you however try the following build option and tell if it makes a difference ?

make -j clean
make -j DELAY_TSC=0

Also, just in case of a kernel issue, can you post the output of /proc/cpuinfo (first CPU will be enough)

madoverlord40 commented 3 weeks ago

At this point I think it's a kernel-level issue. The kernel is probably not reporting the CPU correctly.

The estimate is computed over a one-second interval between two TSC reads. CoreFreq relies on the kernel only for the delay; the TSC is read directly from the CPU register.

A kernel issue would have to be in its own clock source, but you would have noticed if one second were not accurate.

Can you however try the following build option and tell if it makes a difference ?

make -j clean
make -j DELAY_TSC=0

Also, just in case of a kernel issue, can you post the output of /proc/cpuinfo (first CPU will be enough)

Oh! OK then, I didn't realize you actually talk directly to the CPU, that's cool. Here is the output for core 0 only:

processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 26
model       : 68
model name  : AMD Ryzen 9 9950X 16-Core Processor
stepping    : 0
microcode   : 0xb40401a
cpu MHz     : 5720.447
cache size  : 1024 KB
physical id : 0
siblings    : 32
core id     : 0
cpu cores   : 16
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 16
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx_vnni avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload vgif v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid bus_lock_detect movdiri movdir64b overflow_recov succor smca fsrm avx512_vp2intersect flush_l1d amd_lbr_pmc_freeze
bugs        : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
bogomips    : 8599.98
TLB size    : 192 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]
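
As a cross-check, the family and model reported here can be mapped back to the `BF_44` signature that CoreFreq printed earlier. The sketch below assumes the signature layout is extended family, base family, underscore, extended model, base model; that layout is inferred from this single data point, and the function name is hypothetical.

```python
def signature(display_family: int, display_model: int) -> str:
    """Split the /proc/cpuinfo family and model back into CPUID nibbles."""
    base_family = min(display_family, 0xF)
    ext_family = display_family - base_family  # non-zero only when family > 0xF
    ext_model = display_model >> 4
    base_model = display_model & 0xF
    return f"{ext_family:X}{base_family:X}_{ext_model:X}{base_model:X}"


print(signature(26, 68))  # BF_44
```

Family 26 (0x1A) and model 68 (0x44) therefore agree with what CoreFreq decoded, so the kernel and CoreFreq identify the same part.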

cyring commented 3 weeks ago

So maybe you should adjust how the multipliers work

This is what also puzzles me. The minimum ratio is shown as 88, which is impossible. Ratios are computed from the P-State registers. The AMD formula for the Coefficient Of Frequency (COF) could have been changed by the manufacturer, making my function inaccurate at extracting the ratio.

https://github.com/cyring/CoreFreq/blob/8ca842b164f196d6205a401d469c3274315227e0/x86_64/corefreqk.c#L8110
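
For reference, the formula documented for earlier Zen generations (Family 17h) is CoreCOF = (CpuFid / CpuDfsId) × 200 MHz. The sketch below applies it to example field values; the values are illustrative, not read from a 9950X, the function names are hypothetical, and whether the same field layout still holds on Family 1Ah is exactly the open question here.

```python
def core_cof_mhz(fid: int, did: int) -> float:
    """Family 17h-style CoreCOF: (CpuFid / CpuDfsId) * 200 MHz."""
    if did == 0:
        raise ValueError("a divisor ID of 0 means the clock is off")
    return fid / did * 200.0


def ratio(fid: int, did: int, bclk_mhz: float = 100.0) -> float:
    """Multiplier shown to the user: core frequency divided by BCLK."""
    return core_cof_mhz(fid, did) / bclk_mhz


print(ratio(0x88, 8))  # 34.0
```

If the register fields were widened or the scaling constant changed, the same raw bits fed through this formula would produce a plausible-looking but wrong ratio.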

madoverlord40 commented 3 weeks ago

So maybe you should adjust how the multipliers work

This is what also puzzles me. The minimum ratio is shown as 88, which is impossible. Ratios are computed from the P-State registers. The AMD formula for the Coefficient Of Frequency (COF) could have been changed by the manufacturer, making my function inaccurate at extracting the ratio.

https://github.com/cyring/CoreFreq/blob/8ca842b164f196d6205a401d469c3274315227e0/x86_64/corefreqk.c#L8110

That's why I was thinking it could be a kernel issue: even though you are talking to the CPU, doesn't it need to go through the OS to do that? What if the kernel is misreporting the registers or the P-State values? Sorry if this has nothing to do with it; I'm just trying to help troubleshoot.