Closed uzername123 closed 3 years ago
According to register RCX, read or write access to MSR_SNB_EP_PMON_GLOBAL_CTRL (0x00000c00) is the issue.
Sorry, the Uncore PMU of Xeon E5 (06_2D) is programmed with another set of MSR registers I have not implemented yet.
Can you please pull and test the develop branch? I have just commented out that Uncore part of the software.
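For readers following the diagnosis, here is a minimal sketch of how the faulting MSR access can be read out of an oops dump. It relies on two facts: the byte marked with <..> in the Code: line is the one at RIP, and WRMSR/RDMSR (opcodes 0F 30 / 0F 32) take the MSR address from RCX. The two sample lines are copied from the log posted in this thread; the function name is only illustrative and not part of CoreFreq.

```python
import re

# Two lines copied from the kernel log posted in this issue.
OOPS = """\
[ 270.193393] RAX: 0000000020000000 RBX: ffff9f83a550d000 RCX: 0000000000000c00
[ 270.193393] Code: c2 48 c1 ea 20 0f 30 30 c9 0f 32 48 c1 e2 20 89 c0 48 09 c2 48 89 d0 48 89 96 38 01 00 00 48 0d 00 00 00 20 48 89 c2 48 c1 ea 20 <0f> 30 5d c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 66
"""

# x86 two-byte opcodes of the MSR instructions.
MSR_OPCODES = {(0x0F, 0x30): "WRMSR", (0x0F, 0x32): "RDMSR"}


def faulting_msr_access(oops: str):
    """Return (instruction, msr_address), or None if the fault is not an MSR access.

    Hypothetical helper for illustration only.
    """
    rcx = re.search(r"\bRCX:\s*([0-9a-fA-F]+)", oops)
    # The byte wrapped in <..> sits at RIP; the next byte completes the opcode.
    code = re.search(r"Code:.*?<([0-9a-fA-F]{2})>\s+([0-9a-fA-F]{2})", oops)
    if not rcx or not code:
        return None
    opcode = (int(code.group(1), 16), int(code.group(2), 16))
    name = MSR_OPCODES.get(opcode)
    if name is None:
        return None
    return name, int(rcx.group(1), 16)


if __name__ == "__main__":
    insn, msr = faulting_msr_access(OOPS)
    print(f"{insn} on MSR 0x{msr:08x}")
```

On the dump from this issue, this reports a WRMSR on 0x00000c00, which matches the address of MSR_SNB_EP_PMON_GLOBAL_CTRL named above.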
Compiled dev branch, seems all worked ok on E5-2689. At least no crash/kerneldump :)
Great!
But we are missing the Uncore frequency in the view "Package cycles" and probably other things I could notice if you share corefreq-cli outputs or screenshots.
Your Xeon is specified with Uncore and IMC box registers which depend on its topology, especially on multi-socket systems. Are you OK to test new code? This may take time, about 5-15 working days, with multiple test rounds and a few crashes.
OK. I have several types of Xeon CPUs that I can crash almost freely :) They are 2x L56xx, 2x E56xx, 2x X56xx, single-CPU E5-26xx systems, and single and dual E5-26xx v2. Which system should I use for testing/crashing?
Hello,
Thanks for your help.
I would like to focus on Xeon SandyBridge EP processors (as I have already programmed for Westmere, a W3690 single socket)
Later, dual- or quad-socket systems will serve to improve the topology established by CoreFreq and the various per-cluster measurements.
As a starter, any SNB or IVB with exactly the same CPUID signature as in this issue, CPUID 06_2D, will be just fine to implement the Uncore MSR.
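To check whether a machine matches that signature, the family_model notation used here (06_2D, 06_3E) can be derived from CPUID leaf 1 EAX with the usual Intel display family/model rule. The sketch below assumes the EAX value has already been obtained by some other means; the sample values and the function name are illustrative, and the stepping nibble in them does not affect the result.

```python
def family_model(eax: int) -> str:
    """Format CPUID leaf 1 EAX as the 'FF_MM' DisplayFamily_DisplayModel string."""
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is added only when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model is prepended only for base family 0x6 or 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return f"{family:02X}_{model:02X}"


if __name__ == "__main__":
    # Illustrative EAX values: one SandyBridge/EP, one IvyBridge/EP.
    for eax in (0x000206D7, 0x000306E4):
        print(f"EAX=0x{eax:08X} -> {family_model(eax)}")
```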
About the kernel choice, CentOS if not too old can do the job, but preferably a 5.X kernel version. Unloading, black-listing, the Linux and/or Vendor drivers will provide CoreFreq a full R/W access to registers. Especially blacklist the nmi_watchdog. Thus a bare-metal distribution will be perfect. My favorite being ArchLinux.
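A quick way to compare a test machine against these recommendations is sketched below. It only reports the kernel major/minor version, whether the NMI watchdog is currently enabled (read from /proc/sys/kernel/nmi_watchdog, where Linux normally exposes it), and whether a few build tools are on the PATH. The helper names are hypothetical and nothing here is part of CoreFreq.

```python
import platform
import re
import shutil


def kernel_major_minor(release: str):
    """Extract (major, minor) from a release string such as '3.10.0-957.27.2.el7.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def nmi_watchdog_enabled(path: str = "/proc/sys/kernel/nmi_watchdog"):
    """Return True/False from the sysctl file, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read().strip() == "1"
    except OSError:
        return None


def missing_tools(tools=("gcc", "make", "git")):
    """List the build tools that are not found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    major, minor = kernel_major_minor(platform.release())
    print(f"kernel {major}.{minor} (5.x preferred): {'ok' if major >= 5 else 'old'}")
    print(f"nmi_watchdog enabled: {nmi_watchdog_enabled()}")
    print(f"missing tools: {missing_tools() or 'none'}")
```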
Of course: gcc, libc, kernel-headers, make, git as the Compilation prerequisites.
Any code editor of your choice: nano, vi
Regards, Cyril
Btw, in the past, I programmed IvyBridge/EP, which shares the same Uncore code in the driver as your SNB/EP.
IVB/EP is CPUID 06_3E
Screenshot in Wiki https://gist.github.com/cyring/4c1a1f895e53ece642a52c368bdbaf3b
So your Xeon 06_2D is really the Processor I need to work on.
Hello,
For your tests, the latest develop branch has new code for your Xeon 06_2D.
Can you please give it a try?
make clean all
sync your FS
corefreqd -d
UNCORE counter is giving cycles
corefreq-cli -s
(06_3E) code.
If you have such processor, please provide me some outputs for verification.
Thank you
OK. I compiled the dev branch already. Yes, my main workstation is a dual 2680v2 and I also have a uniprocessor 2650v2. I will check everything mentioned ASAP tomorrow. My working OS is CentOS 7.6 with a 3.10.x kernel (a requirement), so I need to install/compile 7.9 with a recent 5.x kernel (I prefer not to use 8.2/8.3). Btw, is it possible to use MSR registers to unlock Turbo Boost modes on different CPUs (say, modify the core counts/frequencies for boost mode)?
I believe you can program the MSR directly to alter the Turbo frequency. You will need to refer to the Intel SDM specifications to locate the tables associated with the CPUID. The bit layout and its programming logic differ per architecture. For example, with or without a semaphore bit to finalize the read-modify-write operation.
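As a rough illustration of that read-modify-write pattern, here is a sketch that goes through the Linux msr driver (/dev/cpu/N/msr, root required, msr module loaded). The field layout is an assumption made for the example only: one 8-bit ratio per count of active cores, and an optional semaphore bit whose position is passed in by the caller. The real register address, layout and semaphore rules must be taken from the SDM tables for the exact CPUID, and writing wrong values can hang the machine.

```python
import os
import struct


def read_msr(cpu: int, address: int) -> int:
    """Read a 64-bit MSR via the Linux msr driver (file offset = MSR address)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, address))[0]
    finally:
        os.close(fd)


def write_msr(cpu: int, address: int, value: int) -> None:
    """Write a 64-bit MSR via the Linux msr driver."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_WRONLY)
    try:
        os.pwrite(fd, struct.pack("<Q", value), address)
    finally:
        os.close(fd)


def set_field(value: int, shift: int, width: int, field: int) -> int:
    """Replace `width` bits at `shift` in `value`, leaving all other bits untouched."""
    if field < 0 or field >> width:
        raise ValueError("field does not fit in the given width")
    mask = ((1 << width) - 1) << shift
    return (value & ~mask) | (field << shift)


def set_turbo_ratio(value: int, active_cores: int, ratio: int, semaphore_bit=None) -> int:
    """Modify step of the read-modify-write, under the ASSUMED layout:
    bits [8*(n-1)+7 : 8*(n-1)] hold the ratio for n active cores (n = 1..8)."""
    if not 1 <= active_cores <= 8:
        raise ValueError("assumed layout only covers 1 to 8 active cores")
    new_value = set_field(value, 8 * (active_cores - 1), 8, ratio)
    if semaphore_bit is not None:
        # Some architectures need a semaphore bit set to finalize the update.
        new_value |= 1 << semaphore_bit
    return new_value
```

A full update would then be: value = read_msr(cpu, address), compute set_turbo_ratio(value, ...), and write_msr(cpu, address, new_value), with the address and semaphore position taken from the SDM.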
Here it is on the 2689 single-CPU system, CentOS 7.6, kernel 3.10.0-967.27.2
Thanks a lot for your test.
No more crash but apparently UNCORE: remains at zero, which means that its counter is not started. Thus there's more work to do.
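To make the "remains at zero" symptom concrete: a frequency shown from a free-running cycle counter is just the counter delta over the sampling interval, so a counter that was never started yields a delta of 0 and a reading of 0 MHz. The sketch below is generic arithmetic, not CoreFreq's code, and the 48-bit counter width is an assumption used only to handle wrap-around.

```python
def uncore_mhz(prev: int, curr: int, interval_s: float, width_bits: int = 48) -> float:
    """Convert two samples of a cycle counter into MHz.

    width_bits is an ASSUMED counter width, used so that a wrapped
    counter still gives the right delta.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    delta = (curr - prev) % (1 << width_bits)
    return delta / interval_s / 1e6


if __name__ == "__main__":
    # A counter that is not started never moves: the reading stays at zero.
    print(uncore_mhz(0, 0, 1.0))
    # A running counter: 3e9 cycles in one second.
    print(uncore_mhz(1_000, 3_000_001_000, 1.0))
```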
Please let me know if you wish to continue testing the code.
Regards
centos 7.6/kernel 3.10.0-957.27 - module crashed on insmod
MB is GA-x79-ud3 v1.0, cpu is Xeon e5-2689
===================
[ 270.192343] CoreFreq(0:8): Processor [ 06_2D] Architecture [SandyBridge/eXtreme.EP] SMT [16/16]
[ 270.192446] general protection fault: 0000 [#1] SMP
[ 270.192498] Modules linked in: corefreqk(OE+) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables devlink ip6table_filter ip6_tables iptable_filter dm_mirror dm_region_hash dm_log dm_mod nvidia_drm(POE) nvidia_modeset(POE) vfat fat nvidia(POE) iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul snd_hda_codec_hdmi ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel joydev pcspkr sg snd_hda_codec drm_kms_helper snd_hda_core i2c_i801 lpc_ich snd_hwdep snd_seq syscopyarea snd_seq_device sysfillrect
[ 270.193334] sysimgblt fb_sys_fops snd_pcm drm snd_timer mei_me mei snd soundcore drm_panel_orientation_quirks ioatdma dca tpm_infineon nfsd auth_rpcgss nfs_acl lockd grace binfmt_misc sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mxm_wmi crct10dif_pclmul crct10dif_common crc32c_intel ahci serio_raw e1000e libahci libata ptp pps_core wmi
[ 270.193393] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Tainted: P OE ------------ 3.10.0-957.27.2.el7.x86_64 #1
[ 270.193393] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./X79-UD3, BIOS F20 03/19/2014
[ 270.193393] task: ffffffffb2418480 ti: ffffffffb2400000 task.ti: ffffffffb2400000
[ 270.193393] RIP: 0010:[] [] Start_Uncore_SandyBridge_EP+0x5d/0x70 [corefreqk]
[ 270.193393] RSP: 0018:ffff9f83af203f58 EFLAGS: 00010046
[ 270.193393] RAX: 0000000020000000 RBX: ffff9f83a550d000 RCX: 0000000000000c00
[ 270.193393] RDX: 0000000000000000 RSI: ffff9f83a5509000 RDI: 0000000000000000
[ 270.193393] RBP: ffff9f83af203f58 R08: ffff9f83a5509000 R09: 000000000000003a
[ 270.193393] R10: 000000acb9ba665c R11: 0000000ad657b7b8 R12: 0000000ab38fe39e
[ 270.193393] R13: 000000009a3c7b50 R14: 00000000003e4692 R15: 0000009e5d79e054
[ 270.193393] FS: 0000000000000000(0000) GS:ffff9f83af200000(0000) knlGS:0000000000000000
[ 270.193393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 270.193393] CR2: 00007f074b896000 CR3: 0000000023610000 CR4: 00000000000607f0
[ 270.193393] Call Trace:
[ 270.193393]
[ 270.193393] [] Start_SandyBridge_EP+0x19f/0x270 [corefreqk]
[ 270.193393] [] flush_smp_call_function_queue+0x63/0x130
[ 270.193393] [] generic_smp_call_function_single_interrupt+0x13/0x30
[ 270.193393] [] smp_call_function_single_interrupt+0x2d/0x40
[ 270.193393] [] call_function_single_interrupt+0x162/0x170
[ 270.193393]
[ 270.193393] [] ? hrtimer_start_range_ns+0x1ed/0x3c0
[ 270.193393] [] ? cpuidle_enter_state+0x54/0xd0
[ 270.193393] [] ? cpuidle_enter_state+0x4d/0xd0
[ 270.193393] [] cpuidle_idle_call+0xde/0x230
[ 270.193393] [] arch_cpu_idle+0xe/0xc0
[ 270.193393] [] cpu_startup_entry+0x14a/0x1e0
[ 270.193393] [] rest_init+0x77/0x80
[ 270.193393] [] start_kernel+0x44b/0x46c
[ 270.193393] [] ? repair_env_string+0x5c/0x5c
[ 270.193393] [] ? early_idt_handler_array+0x120/0x120
[ 270.193393] [] x86_64_start_reservations+0x24/0x26
[ 270.193393] [] x86_64_start_kernel+0x154/0x177
[ 270.193393] [] start_cpu+0x5/0x14
[ 270.193393] Code: c2 48 c1 ea 20 0f 30 30 c9 0f 32 48 c1 e2 20 89 c0 48 09 c2 48 89 d0 48 89 96 38 01 00 00 48 0d 00 00 00 20 48 89 c2 48 c1 ea 20 <0f> 30 5d c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 66
[ 270.193393] RIP [] Start_Uncore_SandyBridge_EP+0x5d/0x70 [corefreqk]
[ 270.193393] RSP