cyring / CoreFreq

CoreFreq : CPU monitoring and tuning software designed for 64-bit processors.
https://www.cyring.fr
GNU General Public License v2.0

Unraid crash on installation - Intel(R) Celeron(R) CPU J3455 #477

Closed ich777 closed 4 months ago

ich777 commented 4 months ago

I've linked two posts from the Unraid forums here and here (same user).

I think this is a QNAP system running Unraid.

cyring commented 4 months ago

image

The J3455 crashed when reading MSR 0x64c (the value in register RCX). I have to check whether this MSR, MSR_TURBO_ACTIVATION_RATIO, is allowed on this architecture; perhaps this model is an exception. Meanwhile, can you post the CPUID signature of your processor, such as the output of lscpu?

cyring commented 4 months ago
Qnap kernel: smpboot: CPU0: Intel(R) Celeron(R) CPU J3455 @ 1.50GHz (family: 0x6, model: 0x5c, stepping: 0x9)

CoreFreq will identify this family as Goldmont (06_5Ch).

But the architecture specification says the MSR is allowed!

2024-02-20-205322_754x98_scrot 2024-02-20-205422_761x376_scrot

and the crash happened here: https://github.com/cyring/CoreFreq/blob/6bbbb752d0efcfcf42daffbcc78ad38640acbade/x86_64/corefreqk.c#L9990
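
For reference, the 06_5C notation follows from how CPUID leaf 1 packs family, model and stepping into EAX. The sketch below is illustrative only and is not CoreFreq's code: the names `cpu_signature`, `decode_signature` and `format_signature` are hypothetical, and the raw value 0x000506C9 used in the note after it is an assumption built from the fields in the kernel line (family 0x6, model 0x5c, stepping 0x9).

```c
#include <stdint.h>
#include <stdio.h>

/* Decoded fields of CPUID leaf 1, register EAX (hypothetical type name). */
struct cpu_signature {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* Split the raw EAX value into display family, display model and stepping. */
static struct cpu_signature decode_signature(uint32_t eax)
{
    unsigned base_model  = (eax >> 4)  & 0xF;   /* bits 7:4   */
    unsigned base_family = (eax >> 8)  & 0xF;   /* bits 11:8  */
    unsigned ext_model   = (eax >> 16) & 0xF;   /* bits 19:16 */
    unsigned ext_family  = (eax >> 20) & 0xFF;  /* bits 27:20 */
    struct cpu_signature s;

    s.stepping = eax & 0xF;                     /* bits 3:0   */

    /* The extended family is only added when the base family is 0xF. */
    s.family = (base_family == 0xF) ? base_family + ext_family : base_family;

    /* The extended model is only prepended for base family 0x6 or 0xF. */
    if (base_family == 0x6 || base_family == 0xF)
        s.model = (ext_model << 4) | base_model;
    else
        s.model = base_model;

    return s;
}

/* Render the signature in the "FF_MM" form used in this thread. */
static void format_signature(struct cpu_signature s, char *buf, size_t len)
{
    snprintf(buf, len, "%02X_%02X", s.family, s.model);
}
```

Calling `decode_signature(0x000506C9)` and passing the result to `format_signature` yields the string `06_5C` with stepping 9, which matches the values reported by the kernel.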

EDIT:

cyring commented 4 months ago

2024-02-20-211644_642x124_scrot

As specified by ARK for the Celeron J3455, the processor is not capable of Intel Turbo Boost Technology; in that case there is no reason to access MSR_TURBO_ACTIVATION_RATIO.

ich777 commented 4 months ago

This is lscpu from the Diagnostics.zip: lscpu.txt

cyring commented 4 months ago

> This is lscpu from the Diagnostics.zip: lscpu.txt

Thank you.

Fix commit 91ed011ef1f99b416b57735ddcf8fc66a35aea7f is ready in branch develop.

Can you please provide a test version to the user?

ich777 commented 4 months ago

I've compiled the package and let the user know here.

ich777 commented 4 months ago

See his comment here.

BTW most Unraid users don't have a GitHub account.

cyring commented 4 months ago

> See his comment here.
>
> BTW most Unraid users don't have a GitHub account.

For another test, I have commented out the crashing instructions in this archive: CoreFreq_Goldmont.tar.gz

ich777 commented 4 months ago

I'll build him this version and post it on the forums.

cyring commented 4 months ago

Programming notes:

  1. Turbo Boost Technology

    • Checking CPUID leaf 0x6 for IDA is not enough to determine Turbo capability. That bit can be set because of LFM support.
    • If LFM, then we should read MSR IA32_MISC_ENABLE (0x1A0) and check the status of bit 38. 2024-02-21-173951_760x68_scrot 2024-02-21-174009_752x238_scrot
    • If the bit value is one, then don't access MSR_TURBO_ACTIVATION_RATIO.
  2. DMI string

    • If testing for turbo is not enough to avoid a crash, my assumption is that accessing MSR_TURBO_ACTIVATION_RATIO triggers a firmware implementation that is not implemented or not activated on the QNAP.
    • I will then fetch the Manufacturer string and conditionally blacklist MSR_TURBO_ACTIVATION_RATIO.
  3. Remaining question

    • Does a Goldmont exist with MSR_TURBO_ACTIVATION_RATIO support?
    • I have not yet received a CoreFreq execution report from another Goldmont (architecture 06_5C) running on a desktop motherboard. Can anyone provide one?
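
The turbo test in point 1 can be sketched as a pure decision function. This is a minimal illustration under stated assumptions, not the code of the fix commit: the function name `may_read_turbo_activation_ratio` and the macro names are hypothetical, and the caller is assumed to have already obtained CPUID.06H:EAX and the IA32_MISC_ENABLE value by its own means.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSR addresses as quoted in this thread. */
#define MSR_IA32_MISC_ENABLE        0x1A0
#define MSR_TURBO_ACTIVATION_RATIO  0x64C

/* Bit positions (macro names are illustrative). */
#define CPUID_06_EAX_IDA_BIT            1   /* CPUID.06H:EAX[1], IDA / Turbo */
#define MISC_ENABLE_TURBO_DISABLE_BIT   38  /* IA32_MISC_ENABLE[38]          */

/*
 * Decide whether MSR_TURBO_ACTIVATION_RATIO may be accessed.
 * cpuid_06_eax : EAX returned by CPUID leaf 0x6
 * misc_enable  : 64-bit value read from IA32_MISC_ENABLE
 */
static bool may_read_turbo_activation_ratio(uint32_t cpuid_06_eax,
                                            uint64_t misc_enable)
{
    bool ida_reported   = (cpuid_06_eax >> CPUID_06_EAX_IDA_BIT) & 1u;
    bool turbo_disabled = (misc_enable >> MISC_ENABLE_TURBO_DISABLE_BIT) & 1u;

    /* The IDA bit alone is not trusted: bit 38 must also be clear. */
    return ida_reported && !turbo_disabled;
}
```

With the IDA bit set and bit 38 set, the function returns false, so the read of MSR 0x64C would be skipped. The DMI-based blacklist of point 2 would be an additional condition on top of this check.
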
ich777 commented 4 months ago

Package built and uploaded to the forums.

cyring commented 4 months ago

> Package built and uploaded to the forums.

Thank you very much

cyring commented 4 months ago
Processor                              [Intel(R) Celeron(R) CPU J3455 @ 1.50GHz]
|- Architecture                                                  [Atom/Goldmont]
|- Vendor ID                                                      [GenuineIntel]
|- Microcode                                                        [0x00000048]
|- Signature                                                           [  06_5C]
|- Stepping                                                            [      9]
|- Online CPU                                                          [  4/  4]
|- Base Clock                                                          [ 99.842]
|- Frequency            (MHz)                      Ratio                        
                 Min    798.73                    <   8 >                       
                 Max   1497.62                    <  15 >                       
|- Factory                                                             [100.000]
                       1500                       [  15 ]                       
|- Performance                                                                  
   |- P-State                                                                   
                 TGT   2296.20                    <  23 >                       
|- Turbo Boost                                                         [   LOCK]
                  1C   2296.36                    <  23 >                       
                  2C   2196.51                    <  22 >                       
                  3C   2196.51                    <  22 >                       
                  4C   2196.51                    <  22 >                       
|- Uncore                                                              [   LOCK]
|- TDP                                                           Level [  0:0  ]
   |- Programmable                                                     [   LOCK]
   |- Configuration                                                    [   LOCK]
   |- Turbo Activation                                                 [   LOCK]
               Turbo      AUTO                    [   0 ]                       

Instruction Set Extensions                                                      
|- 3DNow!/Ext [N/N]          ADX [N]          AES [Y]  AVX/AVX2 [N/N] 
|- AMX-BF16     [N]     AMX-TILE [N]     AMX-INT8 [N]    AMX-FP16 [N] 
|- AVX512-F     [N]    AVX512-DQ [N]  AVX512-IFMA [N]   AVX512-PF [N] 
|- AVX512-ER    [N]    AVX512-CD [N]    AVX512-BW [N]   AVX512-VL [N] 
|- AVX512-VBMI  [N] AVX512-VBMI2 [N]  AVX512-VNNI [N]  AVX512-ALG [N] 
|- AVX512-VPOP  [N] AVX512-VNNIW [N] AVX512-FMAPS [N] AVX512-VP2I [N] 
|- AVX512-BF16  [N] AVX-VNNI-VEX [N] AVX-VNN-INT8 [N] AVX-NE-CONV [N] 
|- AVX-IFMA     [N]    CMPccXADD [N]      MOVDIRI [N]   MOVDIR64B [N] 
|- BMI1/BMI2  [N/N]         CLWB [N]      CLFLUSH [Y] CLFLUSH-OPT [Y] 
|- CLAC-STAC    [Y]         CMOV [Y]    CMPXCHG8B [Y]  CMPXCHG16B [Y] 
|- F16C         [N]          FPU [Y]         FXSR [Y]   LAHF-SAHF [Y] 
|- ENQCMD       [N]         GFNI [N]        OSPKE [N]     WAITPKG [N] 
|- MMX/Ext    [Y/N] MON/MWAITX [N/N]        MOVBE [Y]   PCLMULQDQ [Y] 
|- POPCNT       [Y]       RDRAND [Y]       RDSEED [Y]      RDTSCP [Y] 
|- SEP          [Y]          SHA [Y]          SSE [Y]        SSE2 [Y] 
|- SSE3         [Y]        SSSE3 [Y]  SSE4.1/4A [Y/N]      SSE4.2 [Y] 
|- SERIALIZE    [N]      SYSCALL [Y]        RDPID [N]         SGX [N] 
|- VAES         [N]   VPCLMULQDQ [N]   PREFETCH/W [Y]       LZCNT [N] 

Features                                                                        
|- 1 GB Pages Support                                      1GB-PAGES   [Capable]
|- Advanced Configuration & Power Interface                     ACPI   [Capable]
|- Advanced Programmable Interrupt Controller                   APIC   [Capable]
|- APIC Timer Invariance                                        ARAT   [Capable]
|- Core Multi-Processing                                  CMP Legacy   [Missing]
|- L1 Data Cache Context ID                                  CNXT-ID   [Missing]
|- Direct Cache Access                                           DCA   [Missing]
|- Debugging Extension                                            DE   [Capable]
|- Debug Store & Precise Event Based Sampling               DS, PEBS   [Capable]
|- CPL Qualified Debug Store                                  DS-CPL   [Capable]
|- 64-Bit Debug Store                                         DTES64   [Capable]
|- Fast Short REP CMPSB                                         FSRC   [Missing]
|- Fast Short REP MOVSB                                         FSRM   [Missing]
|- Fast Short REP STOSB                                         FSRS   [Missing]
|- Fast Zero-length REP MOVSB                                   FZRM   [Missing]
|- Fast-String Operation                                        ERMS   [Capable]
|- Fused Multiply Add                                            FMA   [Missing]
|- Flexible Return and Event Delivery                           FRED   [Missing]
|- Hardware Feedback Interface                                   HFI   [Missing]
|- Hardware Lock Elision                                         HLE   [Missing]
|- Hyper-Threading Technology                                    HTT   [Capable]
|- History Reset                                              HRESET   [Missing]
|- Hybrid part processor                                      HYBRID   [Missing]
|- Instruction Based Sampling                                    IBS   [Missing]
|- Instruction INVPCID                                       INVPCID   [Missing]
|- Long Mode 64 bits                                       IA64 | LM   [Capable]
|- Linear Address Space Separation                              LASS   [Missing]
|- Linear Address Masking                                        LAM   [Missing]
|- Load Kernel GS segment register                              LKGS   [Missing]
|- LightWeight Profiling                                         LWP   [Missing]
|- Machine-Check Architecture                                    MCA   [Capable]
|- Memory Protection Extensions                                  MPX   [Capable]
|- Model Specific Registers                                      MSR   [Capable]
|- Memory Type Range Registers                                  MTRR   [Capable]
|- No-Execute Page Protection                                     NX   [Capable]
|- OS-Enabled Ext. State Management                          OSXSAVE   [Capable]
|- Physical Address Extension                                    PAE   [Capable]
|- Page Attribute Table                                          PAT   [Capable]
|- Pending Break Enable                                          PBE   [Capable]
|- Platform Configuration                                    PCONFIG   [Missing]
|- Process Context Identifiers                                  PCID   [Missing]
|- Perfmon and Debug Capability                                 PDCM   [Capable]
|- Page Global Enable                                            PGE   [Capable]
|- Page Size Extension                                           PSE   [Capable]
|- 36-bit Page Size Extension                                  PSE36   [Capable]
|- Processor Serial Number                                       PSN   [Missing]
|- Write Data to a Processor Trace Packet                    PTWRITE   [Missing]
|- PREFETCHIT0/1 Instructions                              PREFETCHI   [Missing]
|- Resource Director Technology/PQE                            RDT-A   [Capable]
|- Resource Director Technology/PQM                            RDT-M   [Missing]
|- Restricted Transactional Memory                               RTM   [Missing]
|- Safer Mode Extensions                                         SMX   [Missing]
|- Self-Snoop                                                     SS   [Capable]
|- Supervisor-Mode Access Prevention                            SMAP   [Capable]
|- Supervisor-Mode Execution Prevention                         SMEP   [Capable]
|- Thread Director                                                TD   [Missing]
|- Time Stamp Counter                                            TSC [Invariant]
|- Time Stamp Counter Deadline                          TSC-DEADLINE   [Capable]
|- TSX Force Abort MSR Register                            TSX-ABORT   [Missing]
|- TSX Suspend Load Address Tracking                       TSX-LDTRK   [Missing]
|- User-Mode Instruction Prevention                             UMIP   [Missing]
|- Virtual Mode Extension                                        VME   [Capable]
|- Virtual Machine Extensions                                    VMX   [Capable]
|- Write Back & Do Not Invalidate Cache                     WBNOINVD   [Missing]
|- Extended xAPIC Support                                     x2APIC   [ x2APIC]
|- Execution Disable Bit Support                              XD-Bit   [Capable]
|- XSAVE/XSTOR States                                          XSAVE   [Capable]
|- xTPR Update Control                                          xTPR   [Capable]
Mitigation mechanisms                                                           
|- Indirect Branch Restricted Speculation                       IBRS   [Capable]
|- Indirect Branch Prediction Barrier                           IBPB   [Capable]
|- Single Thread Indirect Branch Predictor                     STIBP   [Capable]
|- Speculative Store Bypass Disable                             SSBD   [ Unable]
|- Writeback & invalidate the L1 data cache                L1D-FLUSH   [ Unable]
|- Hypervisor - No flush L1D on VM entry            L1DFL_VMENTRY_NO   [ Enable]
|- Arch - Buffer Overwriting                                MD-CLEAR   [Capable]
|- Arch - No Rogue Data Cache Load                           RDCL_NO   [ Enable]
|- Arch - Enhanced IBRS                                     IBRS_ALL   [ Enable]
|- Arch - Return Stack Buffer Alternate                         RSBA   [Capable]
|- Arch - No Speculative Store Bypass                         SSB_NO   [ Enable]
|- Arch - No Microarchitectural Data Sampling                 MDS_NO   [ Enable]
|- Arch - No TSX Asynchronous Abort                           TAA_NO   [Capable]
|- Arch - No Page Size Change MCE                     PSCHANGE_MC_NO   [ Enable]
|- Arch - STLB QoS                                              STLB   [ Unable]
|- Arch - Functional Safety Island                              FuSa   [ Unable]
|- Arch - RSM in CPL0 only                                       RSM   [ Unable]
|- Arch - Split Locked Access Exception                         SPLA   [ Unable]
|- Arch - Snoop Filter QoS Mask                         SNOOP_FILTER   [ Unable]
|- Arch - No Fast Predictive Store Forwarding                   PSFD   [ Unable]
|- Arch - Data Operand Independent Timing Mode                 DOITM   [ Unable]
|- Arch - Not affected by SBDR or SSDP                  SBDR_SSDP_NO   [Capable]
|- Arch - No Fill Buffer Stale Data Propagator              FBSDP_NO   [Capable]
|- Arch - No Primary Stale Data Propagator                   PSDP_NO   [Capable]
|- Arch - Overwrite Fill Buffer values                      FB_CLEAR   [Capable]
|- Arch - Special Register Buffer Data Sampling                SRBDS   [ Unable]
   |- RDRAND and RDSEED mitigation                             RNGDS   [ Unable]
   |- Restricted Transactional Memory                            RTM   [ Unable]
   |- Verify Segment for Writing instruction                    VERW   [ Unable]
|- Arch - Restricted RSB Alternate                             RRSBA   [Capable]
|- Arch - No Branch Target Injection                          BHI_NO   [Capable]
|- Arch - Legacy xAPIC Disable                             XAPIC_DIS   [ Unable]
|- Arch - No Post-Barrier Return Stack Buffer               PBRSB_NO   [Capable]
|- Arch - IPRED disabled for CPL3                        IPRED_DIS_U   [ Unable]
|- Arch - IPRED disabled for CPL0/1/2                    IPRED_DIS_S   [ Unable]
|- Arch - RRSBA disabled for CPL3                        RRSBA_DIS_U   [ Unable]
|- Arch - RRSBA disabled for CPL0/1/2                    RRSBA_DIS_S   [ Unable]
|- Arch - Data Dependent Prefetcher CPL3                  DDPD_U_DIS   [ Unable]
|- Arch - BHI disabled for CPL0/1/2                        BHI_DIS_S   [ Unable]
|- No MXCSR Configuration Dependent Timing                   MCDT_NO   [ Unable]
|- Overclocking                                                                 
   |- Overclocking Utilized                                 UTILIZED   [Capable]
   |- Undervolt Protection                                       UVP   [Capable]
   |- Overclocking Secure Status                            UNLOCKED   [Capable]
Security Features                                                               
|- CPUID Key Locker                                               KL   [Missing]
|- AES Key Locker instructions                                AESKLE   [Missing]
|- CET Shadow Stack features                                  CET-SS   [Missing]
|- CET Indirect Branch Tracking                              CET-IBT   [Missing]
|- CET Supervisor Shadow Stack                               CET-SSS   [Missing]
|- AES Wide Key Locker instructions                          WIDE_KL   [Missing]
|- Software Guard SGX1 Extensions                               SGX1   [Missing]
|- Software Guard SGX2 Extensions                               SGX2   [Missing]

Technologies                                                                    
|- Data Cache Unit                                                              
   |- L1 Prefetcher                                                L1 HW   < ON>
   |- L1 IP Prefetcher                                          L1 HW IP   < ON>
   |- L1 Next Page Prefetcher                                     L1 NPP   < ON>
   |- L1 Scrubbing                                          L1 Scrubbing   <OFF>
|- Cache Prefetchers                                                            
   |- L2 Prefetcher                                                L2 HW   < ON>
   |- L2 Adjacent Cache Line Prefetcher                         L2 HW CL   < ON>
   |- L2 Adaptive Multipath Probability                           L2 AMP   <OFF>
   |- L2 Next Line Prefetcher                                     L2 NLP   <OFF>
   |- LLC Streamer                                                   LLC   <OFF>
|- System Management Mode                                       SMM-Dual   [OFF]
|- Hyper-Threading                                                   HTT   [OFF]
|- SpeedStep                                                        EIST   < ON>
|- Dynamic Acceleration                                              IDA   [ ON]
|- Turbo Boost                                                     TURBO   < ON>
|- Energy Efficiency Optimization                                    EEO   <OFF>
|- Race To Halt Optimization                                         R2H   <OFF>
|- Watchdog Timer                                                    TCO   <OFF>
|- Virtualization                                                    VMX   [ ON]
   |- I/O MMU                                                       VT-d   [OFF]
   |- Version                                                     [         N/A]
   |- Hypervisor                                                           [OFF]
   |- Vendor ID                                                   [         N/A]

Performance Monitoring                                                          
|- Version                                                        PM       [  4]
|- Counters:          General                   Fixed                           
|           {  4,  0,  0 } x 48 bits            3 x 48 bits                     
|- Enhanced Halt State                                           C1E       < ON>
|- C1 Auto Demotion                                              C1A       <OFF>
|- C3 Auto Demotion                                              C3A       <OFF>
|- C1 UnDemotion                                                 C1U       <OFF>
|- C3 UnDemotion                                                 C3U       <OFF>
|- C6 Core Demotion                                              CC6       <OFF>
|- C6 Module Demotion                                            MC6       <OFF>
|- Legacy Frequency ID control                                   FID       [OFF]
|- Legacy Voltage ID control                                     VID       [OFF]
|- P-State Hardware Coordination Feedback                MPERF/APERF       [ ON]
|- Hardware Duty Cycling                                         HDC       [OFF]
|- Package C-States                                                             
   |- Configuration Control                                   CONFIG   [   LOCK]
   |- Lowest C-State                                           LIMIT   <     C3>
   |- I/O MWAIT Redirection                                  IOMWAIT   < Enable>
   |- Max C-State Inclusion                                    RANGE   <    UNS>
|- Core C-States                                                                
   |- C-States Base Address                                      BAR   [ 0x414 ]
|- ACPI Processor C-States                                      _CST   [      3]
|- MONITOR/MWAIT                                                                
   |- State index:    #0    #1    #2    #3    #4    #5    #6    #7              
   |- Sub C-State:     0     2     0     2     4     2     1     1              
   |- Monitor-Mwait Extensions                                   EMX   [Missing]
   |- Interrupt Break-Event                                      IBE   [Missing]
|- Core Cycles                                                         [Capable]
|- Instructions Retired                                                [Capable]
|- Reference Cycles                                                    [Capable]
|- Last Level Cache References                                         [Capable]
|- Last Level Cache Misses                                             [Capable]
|- Branch Instructions Retired                                         [Capable]
|- Branch Mispredicts Retired                                          [Capable]
|- Top-down slots Counter                                              [Capable]
|- Processor Performance Control                                _PCT   [ Enable]
|- Performance Supported States                                 _PSS   [      9]
|- Performance Present Capabilities                             _PPC   [      0]

Power, Current & Thermal                                                        
|- Temperature Offset:Junction                                 TjMax <  0:105 C>
|- Clock Modulation                                             ODCM   <Disable>
   |- DutyCycle                                                        [  0.00%]
|- Power Management                                         PWR MGMT   [   LOCK]
   |- Energy Policy                                        Bias Hint   [      0]
|- Digital Thermal Sensor                                        DTS   [Capable]
|- Power Limit Notification                                      PLN   [Capable]
|- Package Thermal Management                                    PTM   [Capable]
|- Thermal Monitor 1                                             TM1   [ Enable]
|- Thermal Monitor 2                                             TM2   [Capable]
|- Thermal Design Power                                          TDP   [   10 W]
   |- Minimum Power                                              Min   [Missing]
   |- Maximum Power                                              Max   [Missing]
|- Thermal Design Power                                      Package   < Enable>
   |- Power Limit                                                PL1   <   10 W>
   |- Time Window                                                TW1   <    8 s>
   |- Power Limit                                                PL2   <   25 W>
   |- Time Window                                                TW2   < 976 us>
|- Thermal Design Power                                         Core   [Disable]
   |- Power Limit                                                PL1   [    0 W]
   |- Time Window                                                TW1   [ 976 us]
|- Thermal Design Power                                       Uncore   [Disable]
   |- Power Limit                                                PL1   [    0 W]
   |- Time Window                                                TW1   [ 976 us]
|- Thermal Design Power                                         DRAM   <Disable>
   |- Power Limit                                                PL1   <    0 W>
   |- Time Window                                                TW1   < 976 us>
|- Thermal Design Power                                     Platform   [Disable]
   |- Power Limit                                                PL1   [    0 W]
   |- Time Window                                                TW1   [ 976 us]
   |- Power Limit                                                PL2   [    0 W]
   |- Time Window                                                TW2   [ 976 us]
|- Electrical Design Current                                     EDC   [Missing]
|- Thermal Design Current                                        TDC   [Missing]
|- Core Thermal Point                                                           
   |- DTS Threshold #1                                     Threshold   [Missing]
   |- DTS Threshold #2                                     Threshold   [Missing]
|- Package Thermal Point                                                        
   |- DTS Threshold #1                                     Threshold   [Missing]
   |- DTS Threshold #2                                     Threshold   [Missing]
|- Units                                                                        
   |- Power                                               watt   [  0.000003906]
   |- Energy                                             joule   [  0.000000061]
   |- Window                                            second   [  0.000976562]

CPU Pkg  Apic  Core/Thread  Caches      (w)rite-Back (i)nclusive              
 #   ID   ID    ID     ID  L1-Inst Way  L1-Data Way      L2  Way      L3  Way 
000:BSP    0     0      0    32768  8w    24576  6w  1048576 16w        0  0  
001:  0    2     1      0    32768  8w    24576  6w  1048576 16w        0  0  
002:  0    4     2      0    32768  8w    24576  6w  1048576 16w        0  0  
003:  0    6     3      0    32768  8w    24576  6w  1048576 16w        0  0  

                            GenuineIntel  [   0]                           
cyring commented 4 months ago
[ 0] American Megatrends Inc.                                                   
[ 1] QY47AR54                                                                   
[ 2] 09/21/2017                                                                 
[ 3] Default string                                                             
[ 4] Default string                                                             
[ 5] Default string                                                             
[ 6] D---u---s---n-                                                             
[ 7] Default string                                                             
[ 8] Default string                                                             
[ 9] AMI Corporation                                                            
[10] Aptio CRB                                                                  
[11] Default string                                                             
[12] D---u---s---n-                                                             
[13] Number Of Devices:2\Maximum Capacity:8388608 kilobytes                     
[14] A1_DIMM0\A1_BANK0                                                          
[15] A1_DIMM1\A1_BANK1                                                          
[16]                                                                            
[17]                                                                            
[18] Kingston                                                                   
[19] Kingston                                                                   
[20]                                                                            
[21]                                                                            
[22] A1_AssetTagNum0                                                            
[23] A1_AssetTagNum1                                                            
[24]                                                                            
[25]                                                                            
cyring commented 4 months ago

I'm preparing a fix for the Vcore

Voltage = VID / 8192

Range would be from 0.8V up to 1.2V
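
The conversion above amounts to a single division. A minimal sketch follows, assuming the raw VID has already been extracted as an unsigned integer; where that value is read from is outside this example, and the helper names `vid_to_volts` and `vcore_in_expected_range` are hypothetical.

```c
#include <stdint.h>

/* Convert a raw VID into volts using Voltage = VID / 8192. */
static double vid_to_volts(uint32_t vid)
{
    return (double)vid / 8192.0;
}

/* Sanity check against the range quoted above (0.8 V up to 1.2 V). */
static int vcore_in_expected_range(double volts)
{
    return volts >= 0.8 && volts <= 1.2;
}
```

For example, a VID of 8192 converts to exactly 1.0 V, which falls inside the expected range, while a VID of 4096 converts to 0.5 V and would be flagged as out of range.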

cyring commented 4 months ago

@ich777 Hello,

Can you please make a plugin package for the user based on the following archive:

CoreFreq_Goldmont_IMC.tar.gz

For a future decoder, this will attempt to map the memory controller and dump a range of registers.

--- DEVICE  (DUMP) ---
...
--- MCHBAR (START) ---
...
--- MCHBAR  (STOP) ---

Thank you

ich777 commented 4 months ago

Sure, it will take me a few hours.

cyring commented 4 months ago
ich777 commented 4 months ago

@cyring can you give me a little heads up when you release the next version of CoreFreq, so that I can compile a new package for Unraid based on the new release, please?

cyring commented 4 months ago

> @cyring can you give me a little heads up when you release the next version of CoreFreq, so that I can compile a new package for Unraid based on the new release, please?

Sure. I will ask the user whether the latest code is stable.

cyring commented 4 months ago

@ich777 According to the user, the J3455 is now working fine, so we can proceed with creating the plugin. Regards

ich777 commented 4 months ago

@cyring do you still push Releases here to GitHub, or at least Tags? My build toolchain is based on Tags. All packages for Unraid are based on CoreFreq 1.96.5 1.97.0 currently. I can of course change that to always use the master branch.