cyring / CoreFreq

CoreFreq : CPU monitoring and tuning software designed for 64-bit processors.
https://www.cyring.fr
GNU General Public License v2.0

[Intel/Atom/Airmont] Missing IMC of Pentium(R) CPU N3700 #366

Closed cyring closed 1 year ago

cyring commented 1 year ago

@svmlegacy Hello,

Referring to the Wiki page, corefreq-cli -m does not output the IMC data.

EDIT: Branch develop_Airmont is an attempt to decode Airmont IMC using the SLM/Bay Trail SoC decoder.

Please feel free to test and post the IMC output of this branch.

svmlegacy commented 1 year ago

Unfortunately this laptop is a family member's and is not currently available for testing... I'll dig through some of my untested laptops and see if any are Airmont/Braswell like this one.

The laptop which created the dump is an Asus X540SA, with the memory directly attached to the motherboard.

cyring commented 1 year ago

> Unfortunately this laptop is a family member's and is not currently available for testing... I'll dig through some of my untested laptops and see if any are Airmont/Braswell like this one.
>
> The laptop which created the dump is an Asus X540SA, with the memory directly attached to the motherboard.

Would it help if I provided a bootable ISO that includes the Airmont branch?

svmlegacy commented 1 year ago

I was able to recreate the issue with an Airmont/Cherry Trail CPU, and the new branch seems to be outputting something. Please check for coherency. The BIOS does not provide current speeds or timings.

Master Branch Develop Branch

cyring commented 1 year ago

> I was able to recreate the issue with an Airmont/Cherry Trail CPU, and the new branch seems to be outputting something. Please check for coherency. The BIOS does not provide current speeds or timings.
>
> Master Branch Develop Branch

Thank you for those results.

Good that you have this Cherry Trail platform. According to Intel ARK, there are things to solve:

and to fix:

cyring commented 1 year ago

@svmlegacy Hello,

Can you please test those registers ?

## MSR_IA32_PLATFORM_ID
rdmsr -ax 0x00000017

## MSR_THERM2_CTL
rdmsr -ax 0x0000019d

## MSR_MISC_PWR_MGMT
rdmsr -ax 0x000001aa

## MSR_IA32_THERM_CONTROL
rdmsr -ax 0x0000019a

## MSR_TURBO_RATIO_LIMIT
rdmsr -ax 0x000001ad

svmlegacy commented 1 year ago

I'm not surprised if the core is hotter, the box is passively cooled and warm to touch, while the Braswell chip was actively cooled.

Burst appears to be working, as I do catch the CPU above 1.44 GHz in /proc/cpuinfo

Also: image

## MSR_IA32_PLATFORM_ID
rdmsr -ax 0x00000017

f090041752
f090041752
f090041449
f090041449

## MSR_THERM2_CTL
rdmsr -ax 0x0000019d

62d
62d
62d
62d

## MSR_MISC_PWR_MGMT
rdmsr -ax 0x000001aa

rdmsr: CPU 0 cannot read MSR 0x000001aa

## MSR_IA32_THERM_CONTROL
rdmsr -ax 0x0000019a

0
0
0
0

## MSR_TURBO_RATIO_LIMIT
rdmsr -ax 0x000001ad

0
0
0
0

cyring commented 1 year ago

> Burst appears to be working, as I do catch the CPU above 1.44 GHz in /proc/cpuinfo

Thank you. Commit 69e370a51f39c168cb390259da119ccccfbd8ad0 on the branch now allows those MSRs. Perhaps the Burst frequency ratio will show up? EDIT: It is listed among the Turbo Boost ratios in the Processor window.

cyring commented 1 year ago

|- Clock Modulation                                             ODCM   <Disable>
   |- DutyCycle                                                        [  0.00%]

Clock Modulation is now also permitted. You should be able to apply a DutyCycle percentage without crashing the processor. Please let me know whether it works safely. It is supposed to lower the frequency by the selected percentage.
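For reference, a minimal sketch of how the duty cycle maps onto the register, assuming the layout the Intel SDM gives for IA32_CLOCK_MODULATION (0x19a, the same address read above as MSR_IA32_THERM_CONTROL): bit 4 enables modulation and bits 3:1 select a step of 12.5%. The function names are made up for illustration and are not CoreFreq's, and the optional 6.25% extension in bit 0 is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout assumed from the Intel SDM for IA32_CLOCK_MODULATION (0x19a). */
#define ODCM_ENABLE_BIT  (1u << 4)   /* on-demand clock modulation enable */
#define ODCM_DUTY_SHIFT  1           /* duty cycle code lives in bits 3:1 */
#define ODCM_DUTY_MASK   0x7u

/* Duty cycle in hundredths of a percent (1250 means 12.50%).
 * Returns 0 when modulation is disabled. */
unsigned int odcm_duty_centipercent(uint64_t msr)
{
    unsigned int code = (unsigned int)(msr >> ODCM_DUTY_SHIFT) & ODCM_DUTY_MASK;

    if ((msr & ODCM_ENABLE_BIT) == 0)
        return 0;
    return code * 1250u;             /* 12.5% per step */
}

/* Build a register value that enables modulation at step 1..7. */
uint64_t odcm_encode(unsigned int step)
{
    assert(step >= 1 && step <= 7);
    return ODCM_ENABLE_BIT | ((uint64_t)step << ODCM_DUTY_SHIFT);
}

/* Rough effective clock in MHz for a nominal clock under modulation. */
unsigned int odcm_effective_mhz(unsigned int nominal_mhz, uint64_t msr)
{
    unsigned int duty = odcm_duty_centipercent(msr);

    return duty ? (nominal_mhz * duty) / 10000u : nominal_mhz;
}
```

With the value 0 read from 0x19a earlier in this thread, odcm_duty_centipercent() returns 0, which is consistent with the `<Disable>` and 0.00% shown above.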

cyring commented 1 year ago

Platform codename

Airmont is the common CPUID signature 06_4C.

Stepping 3 is assigned to both the Atom x5-Z8300 and the Pentium N3700.

Fortunately, we can maintain a brands table in the CoreFreq driver to split these into two platform codenames: Cherry Trail and Braswell.
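A rough sketch of that idea, not the actual CoreFreq table: the struct, the table and the function names are hypothetical, and the two brand fragments are simply the processors named in this thread. The signature 0x406C3 used in the note below is what the CPUID encoding rule gives for 06_4C stepping 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table: a fragment of the CPUID brand string and the
 * platform codename it should select. */
struct brand_codename {
    const char *fragment;
    const char *codename;
};

static const struct brand_codename airmont_brands[] = {
    { "x5-Z8300", "Cherry Trail" },
    { "N3700",    "Braswell"     },
};

/* Display model from CPUID.1:EAX; the extended model field is combined
 * with the base model for family 6 (and 15) parts. */
unsigned int cpuid_display_model(uint32_t eax)
{
    unsigned int model = (eax >> 4) & 0xFu;
    unsigned int ext_model = (eax >> 16) & 0xFu;

    return (ext_model << 4) | model;
}

/* Pick a codename from the brand string; fall back to the common
 * architecture name when no fragment matches. */
const char *airmont_codename(const char *brand)
{
    size_t i;

    assert(brand != NULL);
    for (i = 0; i < sizeof(airmont_brands) / sizeof(airmont_brands[0]); i++) {
        if (strstr(brand, airmont_brands[i].fragment) != NULL)
            return airmont_brands[i].codename;
    }
    return "Airmont";
}
```

Here cpuid_display_model(0x406C3) gives 0x4C for both chips, so only the brand string lookup tells them apart.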

The enhancement should also cover the PCI host bridge ID 8086:2280, which according to lspci differs only in revision number: rev 22 versus rev 21.

So far the driver has no code to query the revision, and I wonder whether it is worth the effort to report the true platform codename in the IMC output rather than a common codename.
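For what it's worth, querying the revision is cheap. Below is a user-space sketch, assuming the standard PCI configuration header layout (vendor ID at offset 0x00, device ID at 0x02, revision ID at 0x08) and a sysfs config file; the struct and function names are made up for illustration. Inside a kernel driver the same value should be available as the revision member of struct pci_dev.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Standard PCI configuration header offsets. */
#define PCI_VENDOR_ID_OFF    0x00
#define PCI_DEVICE_ID_OFF    0x02
#define PCI_REVISION_ID_OFF  0x08

struct pci_ident {
    uint16_t vendor;
    uint16_t device;
    uint8_t  revision;
};

/* Decode the identification fields from the first bytes of a config
 * header; multi-byte fields are little-endian. */
struct pci_ident pci_ident_from_config(const uint8_t *cfg)
{
    struct pci_ident id;

    assert(cfg != NULL);
    id.vendor = (uint16_t)(cfg[PCI_VENDOR_ID_OFF] | (cfg[PCI_VENDOR_ID_OFF + 1] << 8));
    id.device = (uint16_t)(cfg[PCI_DEVICE_ID_OFF] | (cfg[PCI_DEVICE_ID_OFF + 1] << 8));
    id.revision = cfg[PCI_REVISION_ID_OFF];
    return id;
}

/* Read the header from a sysfs config file, for example
 * /sys/bus/pci/devices/0000:00:00.0/config (assumed host bridge address).
 * Returns 0 on success, -1 on failure. */
int pci_ident_read(const char *path, struct pci_ident *out)
{
    uint8_t cfg[16];
    size_t n;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    n = fread(cfg, 1, sizeof(cfg), fp);
    fclose(fp);
    if (n != sizeof(cfg))
        return -1;
    *out = pci_ident_from_config(cfg);
    return 0;
}
```

A driver could then compare the revision against 0x21 or 0x22, although the mapping from revision to codename is exactly what remains uncertain here.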

svmlegacy commented 1 year ago

Just a quick update on this - I'm currently experimenting to get this system stable in Linux. I've been suffering random crashes, especially when the iGPU is loaded. Once it's stable I'll be able to test the duty cycle modulation. I haven't seen it lock up in a tty console, but I can't be certain it is stable. I may have to try alternative distros, or older kernels, to get there.

On platform codename - To me, Cherry Trail and Braswell are effectively the same silicon, just with different target markets. I don't think it's worth the effort to split the codename without fully understanding whether the Host Bridge revisions within each codename are unique. (There seem to be some Cherry Trail chips with rev 20, for example.)

cyring commented 1 year ago

Found these frequency ratio, turbo ratio, and voltage VID registers in the kernel. They are apparently undocumented in the SDM.

#define MSR_ATOM_CORE_RATIOS 0x0000066a

#define MSR_ATOM_CORE_VIDS 0x0000066b

#define MSR_ATOM_CORE_TURBO_RATIOS 0x0000066c

#define MSR_ATOM_CORE_TURBO_VIDS 0x0000066d

Can you please rdmsr these four MSRs?
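If msr-tools is not at hand, the same read can be done with a few lines of C through the Linux msr driver, which exposes one device per CPU and uses the file offset as the register address. This is only a sketch: read_msr is a name chosen here, and it needs root plus the msr module loaded.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Read one 64-bit MSR on one CPU through /dev/cpu/N/msr.
 * Returns 0 on success, -1 if the device cannot be opened or read. */
int read_msr(int cpu, uint32_t reg, uint64_t *value)
{
    char path[64];
    ssize_t n;
    int fd;

    assert(value != NULL);
    snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    /* The msr driver interprets the file offset as the MSR address. */
    n = pread(fd, value, sizeof(*value), (off_t)reg);
    close(fd);
    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

Calling read_msr(0, 0x66a, &v) on CPU 0 would be the counterpart of rdmsr -p 0 0x66a; a return of -1 corresponds to the "cannot read MSR" case seen with 0x1aa.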

svmlegacy commented 1 year ago

I seem to be having a stretch of stability after setting a max c-state of 1, similar to Bay Trail. See this: https://bugzilla.kernel.org/show_bug.cgi?id=109051

#define [MSR_ATOM_CORE_RATIOS](https://elixir.bootlin.com/linux/latest/C/ident/MSR_ATOM_CORE_RATIOS) 0x0000066a

# rdmsr -ax 0x0000066a
120602
120602
120602
120602

#define [MSR_ATOM_CORE_VIDS](https://elixir.bootlin.com/linux/latest/C/ident/MSR_ATOM_CORE_VIDS) 0x0000066b

# rdmsr -ax 0x0000066b
442d2d
442d2d
442d2d
442d2d

#define [MSR_ATOM_CORE_TURBO_RATIOS](https://elixir.bootlin.com/linux/latest/C/ident/MSR_ATOM_CORE_TURBO_RATIOS) 0x0000066c

# rdmsr -ax 0x0000066c
14141717
14141717
14141717
14141717

#define [MSR_ATOM_CORE_TURBO_VIDS](https://elixir.bootlin.com/linux/latest/C/ident/MSR_ATOM_CORE_TURBO_VIDS) 0x0000066d

# rdmsr -ax 0x0000066d
49495252
49495252
49495252
49495252

cyring commented 1 year ago

> I seem to be having a stretch of stability after setting a max c-state of 1, similar to Bay Trail. See this: https://bugzilla.kernel.org/show_bug.cgi?id=109051

Well done! In the past I investigated a Silvermont/Bay_Trail crash; the issue occurred well after resuming from S3.

cyring commented 1 year ago

MSR_ATOM_CORE_RATIOS

# rdmsr -ax 0x0000066a
120602

Frequency

What is new here are the Turbo Boost frequency ratios.
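As a worked example of the decoding, here is a small sketch. The byte positions follow what the Linux intel_pstate driver does for these Atom MSRs, as far as I recall it (minimum ratio in bits 14:8 and maximum ratio in bits 22:16 of MSR_ATOM_CORE_RATIOS, top turbo ratio in bits 6:0 of MSR_ATOM_CORE_TURBO_RATIOS). The 80 MHz bus clock is an assumption, picked because it reproduces the 1.44 GHz mentioned earlier; the function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Take byte number `index` (0 = least significant) and keep its low
 * 7 bits, the field width assumed for these Atom ratio/VID registers. */
unsigned int atom_field(uint64_t msr, unsigned int index)
{
    assert(index < 8);
    return (unsigned int)(msr >> (8 * index)) & 0x7Fu;
}

/* MSR_ATOM_CORE_RATIOS (0x66a): byte 1 = minimum, byte 2 = maximum. */
unsigned int atom_min_ratio(uint64_t msr) { return atom_field(msr, 1); }
unsigned int atom_max_ratio(uint64_t msr) { return atom_field(msr, 2); }

/* MSR_ATOM_CORE_TURBO_RATIOS (0x66c): byte 0 = highest turbo ratio. */
unsigned int atom_turbo_ratio(uint64_t msr) { return atom_field(msr, 0); }

/* Convert a ratio to MHz for a bus clock given in kHz. */
unsigned int ratio_to_mhz(unsigned int ratio, unsigned int bclk_khz)
{
    return (ratio * bclk_khz) / 1000u;
}
```

With the values posted above, 0x120602 decodes to a minimum ratio of 6 and a maximum of 18, and 0x14141717 to a top turbo ratio of 23. At an assumed 80 MHz bus clock that is 480 MHz, 1440 MHz and 1840 MHz respectively. The remaining bytes of 0x14141717 (0x17, 0x14, 0x14) look like further turbo steps, but how they map to active core counts is a guess.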

cyring commented 1 year ago

Are those four MSR_ATOM registers writable? Can you wrmsr them with the same values as read? Perhaps also check whether modified values are taken into account, especially the Turbo ratios.

cyring commented 1 year ago

Everything is now part of the latest commit, 8521699cc8f3d5be002abecefd40f705bb2c600a. You should now see the architecture codename and the Turbo Boost ratios. Feel free to refresh your CLI outputs.

svmlegacy commented 1 year ago

With my system all four MSRs are not writable.

Refreshed output: https://gist.github.com/svmlegacy/deb7288b6dbef976a0002a0ed29ae783#file-cherry-trail-develop-2022-11-06

cyring commented 1 year ago

> With my system all four MSRs are not writable.
>
> Refreshed output: https://gist.github.com/svmlegacy/deb7288b6dbef976a0002a0ed29ae783#file-cherry-trail-develop-2022-11-06

Thank you. There are many things for me to process in your output.

To help with temperature, you could perhaps limit the C-states to C1E and make the kernel idle with the HLT instruction.

See the Readme to register CoreFreq as the CPU-Idle driver. Next, select HALT as the route in [Settings], and C1E as the limit in [Kernel].

cyring commented 1 year ago

> With my system all four MSRs are not writable.

OK, commit 5c5b0a7c41dafcd6323b500b9b32044247d891dc makes them read-only ratios.

cyring commented 1 year ago

@svmlegacy I don't get a C-States Base Address within the output, can you read this MSR ?

rdmsr -ax 0xE4

EDIT: It may also be a BIOS option.

cyring commented 1 year ago

Keep this Intel® Atom™ Z8000 Processor Series Datasheet (Volume 2 of 2)

cyring commented 1 year ago

@svmlegacy Can you please dump some registers? At this source code line in function SLM_PTR, add the debugging code below:

/* Error Correcting Code */
    TIMING(mc, cha).ECC = \
              RO(Proc)->Uncore.MC[mc].SLM.BIOS_CFG.EFF_ECC_EN
            | RO(Proc)->Uncore.MC[mc].SLM.BIOS_CFG.ECC_EN;
/* Debugging Code */
    printf( "IMC(%d:%d)\tDRP[%x]\tDTR0[%x]\tBIOS[%x]\n", mc, cha,
        RO(Proc)->Uncore.MC[mc].SLM.DRP.value,
        RO(Proc)->Uncore.MC[mc].SLM.DTR0.value,
        RO(Proc)->Uncore.MC[mc].SLM.BIOS_CFG.value );

and post the traces that the corefreqd daemon should then output.

cyring commented 1 year ago

> Keep this Intel® Atom™ Z8000 Processor Series Datasheet (Volume 2 of 2)

@svmlegacy What you will notice in the specs is that the IMC registers differ from those of Silvermont/BYT; some bits are shifted by just a few positions. For now I cannot confirm the previous Timings without a reference coming from your BIOS, memtest, or SPD. The Z8000's IMC decoder has to be programmed.

cyring commented 1 year ago

@svmlegacy Hello, I would like to merge the current developments into master, but it would be nice to finalize the Airmont IMC first. Is your platform still available?

svmlegacy commented 1 year ago

@cyring

Thanks for your patience... Life was just being life for a bit. Should be able to catch back up this weekend!

svmlegacy commented 1 year ago

> @svmlegacy I don't get a C-States Base Address within the output, can you read this MSR ?
>
> rdmsr -ax 0xE4
>
> EDIT: It may also be a BIOS option.

Result:

# rdmsr -ax 0xE4
20000
20000
20000
20000
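For what it's worth, a small sketch of how this value splits into fields, assuming the layout the Intel SDM documents for MSR_PMG_IO_CAPTURE_BASE (0xE4): the LVL_2 base address in bits 15:0 and the C-state range in bits 18:16. The struct and function names are invented for this example, and the meaning of each range code on Airmont is not decoded here.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of MSR_PMG_IO_CAPTURE_BASE (0xE4), assumed layout:
 * bits 15:0 = LVL_2 base address, bits 18:16 = C-state range code. */
struct pmg_io_capture {
    uint16_t     lvl2_base;
    unsigned int cstate_range;
};

struct pmg_io_capture pmg_io_capture_decode(uint64_t msr)
{
    struct pmg_io_capture d;

    d.lvl2_base = (uint16_t)(msr & 0xFFFFu);
    d.cstate_range = (unsigned int)(msr >> 16) & 0x7u;
    return d;
}
```

Under that layout, 0x20000 gives a base address of 0 and a range code of 2, which would explain why no C-States Base Address shows up in the output.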

> @svmlegacy Can you please dump some registers? At this source code line in function SLM_PTR, add the debugging code below:
>
> /* Error Correcting Code */
>   TIMING(mc, cha).ECC = \
>             RO(Proc)->Uncore.MC[mc].SLM.BIOS_CFG.EFF_ECC_EN
>           | RO(Proc)->Uncore.MC[mc].SLM.BIOS_CFG.ECC_EN;
> /* Debugging Code */
>   printf( "IMC(%d:%d)\tDRP[%x]\tDTR0[%x]\tBIOS[%x]\n", mc, cha,
>       RO(Proc)->Uncore.MC[mc].SLM.DRP.value,
>       RO(Proc)->Uncore.MC[mc].SLM.DTR0.value,
>       RO(Proc)->Uncore.MC[mc].SLM.BIOS_CFG.value );
>
> and post the traces that the corefreqd daemon should then output.

Result:

# ./corefreqd
CoreFreq Daemon 1.93.0  Copyright (C) 2015-2022 CYRIL INGENIERIE
IMC(0:0)    DRP[58091]  DTR0[1344c630]  BIOS[468f0010]

> Keep this Intel® Atom™ Z8000 Processor Series Datasheet (Volume 2 of 2)
>
> @svmlegacy What you will notice in the specs is that the IMC registers differ from those of Silvermont/BYT; some bits are shifted by just a few positions. For now I cannot confirm the previous Timings without a reference coming from your BIOS, memtest, or SPD. The Z8000's IMC decoder has to be programmed.

Unfortunately the BIOS is very limited and does not provide the information. The memory is soldered directly onto the mainboard, as well. I'll see about getting Memtest running on this machine and posting the results.

cyring commented 1 year ago

@svmlegacy FYI, commit ff64902ede51b63996391edfe9b85060fb6f2380 adds the specs of the Z8000 IMC registers.

cyring commented 1 year ago

On-going IMC developments moved to #395