Bogus instruction count values reported

h1z1 commented 4 years ago

tbh Not sure if this is a bug with corefreq or the platform? Seems pretty random bouncing around different cores.

one

cyring commented 4 years ago

So I'm checking my Zen2 and getting this:
INST MSR counter returns by Whitehaven seems to report zero and probably lower bits length values. Probably with this platform, register is not a full 64-bits counter. I'm going back to the PPR specs for an answer; please let me know if you also find some errata or some platform specifics ?
FYI, here is the code reading the counter registers https://github.com/cyring/CoreFreq/blob/1887c7b554288e3e9e1f359c657c94a68b832be7/corefreqk.c#L8610

cyring commented 4 years ago

So I'm going back to the " PPR for AMD Family 17h Model 01h, Revision B1 Processors " for the specification of the Instructions Retired Performance Counter

which is a SMT 64-bits register ( _thread[1:0] )

Can you give me the raw values, after you have started the CoreFreq driver (which will activate this counter in the HWCR register)
```
modprobe msr
rdmsr -aX 0xC00000E9
```
After above operation, I will also need a dump of CPUID leaf 80000008

h1z1 commented 4 years ago

Sorry for the late reply

two

Just confirming the same output after the rdmsr

three

And fwiw

[root@wacko CoreFreq]# ./corefreq-cli 
[root@wacko CoreFreq]# rdmsr -aX 0xC00000E9 | xargs
1C6 1C6 1C6 F6BC 17B68 1C6 3234 1C6 E4507 1C6 150D 1C6 14237E D3048 1C6 1C6 401F23 1C7 1C6 1C6 1C6 1C6 1C6 1C6 1C6 1C6 1C6 1C6 1C6 1C6 1C6 CB5

The initial read was all zeros

cyring commented 4 years ago

Can you stress one Core per CCD. Use the CoreFreq Tools and Topology windows. In the same time, get the raw counters elapsed by one second interval:

rdmsr -aX 0xC00000E9 
sleep 1
rdmsr -aX 0xC00000E9

B/c I want the highest counter value to guess its bits width but also to understand its sampling period. I suspect EPYC and TR v1 are counting differently than Zen v?

According to your CPUID, the INST counter is capable. Please also post the output of corefreq-cli -s

Thank you

h1z1 commented 4 years ago

Hmm... turboing a core in corefreq causes that core to report accurately. Not entirely sure why the units changed from scientific form..

# while true; do c=$(rdmsr -ad 0xC00000E9 |tail -1); sleep 1; l=$(rdmsr -ad 0xC00000E9 |tail -1);  echo $((l - c));    done
10776668920
10779641649
10776669139
10775183308
10775979373
10775464271
10776362897
10777289285
10777553372
^C

disabled turbo on CPU 31

# while true; do c=$(rdmsr -ad 0xC00000E9 |tail -1); sleep 1; l=$(rdmsr -ad 0xC00000E9 |tail -1);  echo $((l - c));    done
-2896
0
-2885
0
0
0
0
0
0
0
0
0
#

Guessing this might help? You can see where it settled down after the core was turboed

# while true; do ./corefreq-cli -j | jq -c '.["Cpu"][25]["FlipFlop"][0]["State"]'  ; sleep 1; done
{"IPS":0,"IPC":5e-06,"CPI":216882,"Turbo":0.000104,"C0":0.000121,"C3":0,"C6":0,"C7":0,"C1":0.999879,"MC6":0}
{"IPS":0,"IPC":5e-06,"CPI":216882,"Turbo":0.000104,"C0":0.000121,"C3":0,"C6":0,"C7":0,"C1":0.999879,"MC6":0}
{"IPS":0,"IPC":0,"CPI":0,"Turbo":0.000184,"C0":0.000209,"C3":0,"C6":0,"C7":0,"C1":0.999791,"MC6":0}
{"IPS":0,"IPC":0,"CPI":0,"Turbo":0.000184,"C0":0.000209,"C3":0,"C6":0,"C7":0,"C1":0.999791,"MC6":0}
{"IPS":0,"IPC":1e-06,"CPI":943884,"Turbo":0.000224,"C0":0.000263,"C3":0,"C6":0,"C7":0,"C1":0.999737,"MC6":0}
{"IPS":0,"IPC":1e-06,"CPI":943884,"Turbo":0.000224,"C0":0.000263,"C3":0,"C6":0,"C7":0,"C1":0.999737,"MC6":0}
{"IPS":5133771863.251443,"IPC":46430731932135.11,"CPI":0,"Turbo":9.6e-05,"C0":0.000111,"C3":0,"C6":0,"C7":0,"C1":0.99989,"MC6":0}
{"IPS":5133771863.251443,"IPC":46430731932135.11,"CPI":0,"Turbo":9.6e-05,"C0":0.000111,"C3":0,"C6":0,"C7":0,"C1":0.99989,"MC6":0}
{"IPS":0.2494,"IPC":2.990874,"CPI":0.33435,"Turbo":0.08337,"C0":0.083387,"C3":0,"C6":0,"C7":0,"C1":0.916613,"MC6":0}
{"IPS":0.2494,"IPC":2.990874,"CPI":0.33435,"Turbo":0.08337,"C0":0.083387,"C3":0,"C6":0,"C7":0,"C1":0.916613,"MC6":0}
{"IPS":2.994897,"IPC":2.994896,"CPI":0.333901,"Turbo":1,"C0":1,"C3":0,"C6":0,"C7":0,"C1":0,"MC6":0}
{"IPS":2.994897,"IPC":2.994896,"CPI":0.333901,"Turbo":1,"C0":1,"C3":0,"C6":0,"C7":0,"C1":0,"MC6":0}
{"IPS":2.995003,"IPC":2.995003,"CPI":0.33389,"Turbo":1,"C0":1,"C3":0,"C6":0,"C7":0,"C1":0,"MC6":0}
{"IPS":2.995003,"IPC":2.995003,"CPI":0.33389,"Turbo":1,"C0":1,"C3":0,"C6":0,"C7":0,"C1":0,"MC6":0}
{"IPS":2.994931,"IPC":2.994931,"CPI":0.333897,"Turbo":1,"C0":1,"C3":0,"C6":0,"C7":0,"C1":0,"MC6":0}
{"IPS":2.994931,"IPC":2.994931,"CPI":0.333897,"Turbo":1,"C0":1,"C3":0,"C6":0,"C7":0,"C1":0,"MC6":0}
{"IPS":2.994943,"IPC":2.994943,"CPI":0.333896,"Turbo":1,"C0":1,"C3":0,"C6":0,"C7":0,"C1":0,"MC6":0}
{"IPS":2.994943,"IPC":2.994943,"CPI":0.333896,"Turbo":1,"C0":1,"C3":0,"C6":0,"C7":0,"C1":0,"MC6":0}
{"IPS":2.993228,"IPC":2.993228,"CPI":0.334087,"Turbo":1,"C0":1,"C3":0,"C6":0,"C7":0,"C1":0,"MC6":0}
{"IPS":2.993228,"IPC":2.993228,"CPI":0.334087,"Turbo":1,"C0":1,"C3":0,"C6":0,"C7":0,"C1":0,"MC6":0}
{"IPS":2.993031,"IPC":2.993031,"CPI":0.334109,"Turbo":1,"C0":1,"C3":0,"C6":0,"C7":0,"C1":0,"MC6":0}
{"IPS":2.993031,"IPC":2.993031,"CPI":0.334109,"Turbo":1,"C0":1,"C3":0,"C6":0,"C7":0,"C1":0,"MC6":0}

cyring commented 4 years ago

I'm checking code and don't understand why counter is read as zero Beside computing IPC and CPI where a test is made to protect a division by zero at: https://github.com/cyring/CoreFreq/blob/1887c7b554288e3e9e1f359c657c94a68b832be7/corefreqd.c#L1011 It would mean URC is dropping to zero too ! Can you please show the Core cycles view 2020-10-29-010712_644x550_scrot I have not found among previous issues a full report of CoreFreq with Zen gen 1 beside one Epyc/Napple . I don't know if the same happens with 1700X , 1800X and so on

h1z1 commented 4 years ago

Both turbo and non-turbo core-cycles appear OK

four six

Maybe related... noticed the cli was really slow (much of it not rendering at all) after running overnight. corefreqd was taking 200% CPU. Still at 100% after a restart.

USER      PR  NI    RES    SHR S  P  %CPU %MEM     TIME+ COMMAND               
root      20   0   1.3m   1.1m S 16 100.1  0.0  12:52.11 corefreqd-cmgr

Hmm.. some fuckery somewhere.. would appear the weird values in corefreq coincides with negative values being reported even with rdmsr

# rd() { rdmsr -ad -p $1 0xC00000E9 |head -24 | head -1; } ;
#  { for i in {1..30}; do c=$(rd 31); sleep 0.1; l=$(rd 31);  printf  "last:%s current:%s diff:%s\n" "$l" "$c" $((l - c));    done  ; echo; }
last:369 current:369 diff:0
last:369 current:369 diff:0
last:3264 current:369 diff:2895
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:3264 diff:-2895
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:369 diff:0
last:368 current:369 diff:-1
last:369 current:3253 diff:-2884
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:3253 diff:-2884
last:13965 current:369 diff:13596
last:3253 current:369 diff:2884
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:3253 diff:-2884
last:369 current:369 diff:0
last:2069 current:369 diff:1700
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:3264 diff:-2895
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:369 diff:0

Not sure how that is possible... it's not supposed to reset!

I stand corrected, it IS volatile.

Volatile - Indicates that a register field value may be modified by hardware, firmware, or microcode
when fetching the first instruction and/or might have read or write side effects. No read may
depend on the results of a previous read and no write may be omitted based on the value of a
previous read or write.

What is the point of it then.. single user debugging? Can't infer anything from them.

cyring commented 4 years ago

Both turbo and non-turbo core-cycles appear OK

OK this is at least a good news

Maybe related... noticed the cli was really slow (much of it not rendering at all) after running overnight. corefreqd was taking 200% CPU. Still at 100% after a restart.

Please confirm, if source code is the latest. No mix with all version. Clean rebuild is:

make clean all

USER      PR  NI    RES    SHR S  P  %CPU %MEM     TIME+ COMMAND               
root      20   0   1.3m   1.1m S 16 100.1  0.0  12:52.11 corefreqd-cmgr

CPU is high in the Child process which manages the Stress threads (aka Tools in the Cli) I'm reviewing the code again ...

Hmm.. some fuckery somewhere.. would appear the weird values in corefreq coincides with negative values being reported even with rdmsr

# rd() { rdmsr -ad -p $1 0xC00000E9 |head -24 | head -1; } ;
#  { for i in {1..30}; do c=$(rd 31); sleep 0.1; l=$(rd 31);  printf  "last:%s current:%s diff:%s\n" "$l" "$c" $((l - c));    done  ; echo; }
last:369 current:369 diff:0
last:369 current:369 diff:0
last:3264 current:369 diff:2895
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:3264 diff:-2895
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:369 diff:0
last:368 current:369 diff:-1
last:369 current:3253 diff:-2884
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:3253 diff:-2884
last:13965 current:369 diff:13596
last:3253 current:369 diff:2884
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:3253 diff:-2884
last:369 current:369 diff:0
last:2069 current:369 diff:1700
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:3264 diff:-2895
last:369 current:369 diff:0
last:369 current:369 diff:0
last:369 current:369 diff:0

Not sure how that is possible... it's not supposed to reset!

How can the Instructions fall down to zero. Can a CPU do that !

cyring commented 4 years ago

I stand corrected, it IS volatile.

Volatile - Indicates that a register field value may be modified by hardware, firmware, or microcode
when fetching the first instruction and/or might have read or write side effects. No read may
depend on the results of a previous read and no write may be omitted based on the value of a
previous read or write.

What is the point of it then.. single user debugging? Can't infer anything from them.

You mean I should have declare some variables volatile in the code ?

EDIT: Volatile, true, when programming machine code. https://github.com/cyring/CoreFreq/blob/1887c7b554288e3e9e1f359c657c94a68b832be7/corefreqk.h#L96

But in a script, using rdmsr, there are too much instructions involved to store a counter and have an out of order side effect.

cyring commented 4 years ago

Are you getting the same issue when booting with the CoreFreq Live Image

Notes: Image embeds an outdated version of CoreFreq but you can update to latest version with command corefreq_online_update next corefreq_build_run

h1z1 commented 4 years ago

Please confirm, if source code is the latest. No mix with all version.
# cat .git/refs/heads/develop 
1887c7b554288e3e9e1f359c657c94a68b832be7
You mean I should have declare some variables volatile in the code ?

I would hope the value is always a number at least, but it's not monotonic. It's an unreliable value. Except while in turbo?

Re the CPU use of corefreqd it went away after reloading the kernel module.

cyring commented 4 years ago

Just in case: is the microcode the latest available ?

h1z1 commented 4 years ago

AFAIK.. Unless it's a BIOS//AGESA issue.

F11e BIOS
cat /sys/devices/system/cpu/cpu0/microcode/version 
0x8001137

cyring commented 4 years ago

AFAIK.. Unless it's a BIOS//AGESA issue.
F11e BIOS
cat /sys/devices/system/cpu/cpu0/microcode/version 
0x8001137

One of these BIOS versions ?

Update AGESA 1.1.0.1a

Please try CoreFreq with a stable BIOS

h1z1 commented 4 years ago

Correct, the bios is running F11e which has AGESA 1.1.01a. There were reports of problems with F12

For shits'n giggles I updated that box to F12, no diff.

cyring commented 4 years ago

Correct, the bios is running F11e which has AGESA 1.1.01a. There were reports of problems with F12

For shits'n giggles I updated that box to F12, no diff.

I've asked other Zen1 users for the same test; so far no answer.

At least I would like to know if a simple Ryzen 1000 also has an issue with the instructions counter. Because I'm stuck now.

h1z1 commented 4 years ago

I would expect them to based on the documentation from AMD. Maybe they aren't as erratic though. There's no indication the value is even wrapping, it's just bogus / random.

cyring commented 4 years ago

I would expect them to based on the documentation from AMD. Maybe they aren't as erratic though. There's no indication the value is even wrapping, it's just bogus / random.

Among all my recent searches, I read s/o is rather programming one of the general PMC counters to count the retired instructions, rather than using the IRPerf register, because it is buggy. Unfortunately I did not bookmark that page.

Anyway this issue can't be unique to Threadripper but shared among the whole architecture, including EPYC. Do you ?

h1z1 commented 4 years ago

Agreed. Might be worth looking at how perf is counting cycles? My guess is they're discarding bogus values but how do you interpolate anything if you can't trust the value at all.

cyring commented 4 years ago

Confirmed the same issue here

Beside asking help from the Manufacturer for a new microcode, feel free to close or not this GitHub issue.

Regards, CyrIng

h1z1 commented 4 years ago

@cyring can you comfirm you were testing with and without corefreqk being loaded? Just noticed something odd in that rdmsr works fine without the corefreqk module.

That isn't to say corefreq is at fault but something during the setup may be a trigger? May even be something stupid like xfr itself.

cyring commented 4 years ago

@cyring can you comfirm you were testing with and without corefreqk being loaded? Just noticed something odd in that rdmsr works fine without the corefreqk module.

That isn't to say corefreq is at fault but something during the setup may be a trigger? May even be something stupid like xfr itself.

It's very good hint ! I bet I have messed up with some registers. MSR_K7_HWCR is the register where I'm activating the Instruction counter: https://github.com/cyring/CoreFreq/blob/933df839bf494061a9488bb600a2fff18bdf6731/corefreqk.c#L8544 Please post the output of this register as soon as counters are OK

rdmsr -aX 0xc0010015

Thank you

EDIT: there's a slight variation between EPYC and Ryzen: EPYC is a per CCD register

EPYC
Ryzen

h1z1 commented 4 years ago

# rdmsr -aX 0xc0010015 | uniq -c
     32 49000011
# modprobe -r corefreqk
# rdmsr -aX 0xc0010015 | uniq -c
     32 9000011
#

Plot thickens: disabling C6 works around it too (CC6_Enable=0 PC6_Enable=0)

Early ryzen had issues with C6 but afaik those surfaced as hard lockups, I don't see any reference to other instabilities though I've long suspected there was other errata.

cyring commented 4 years ago

Do you confirm when the HWCR has the value 0x9000011 you are able to read good counters ? If true, it means the bit 30 for IRPerfEn does not have to be set !

h1z1 commented 4 years ago

Correct though I'm assuming that bit should have flipped back on exit?

This is such a cluster fuck because even more odd though is the read doesn't change at all after.

# # lsmod |grep corefreq
# 
# ./wtftest.sh 1
last:18568480 current:18568480 diff:0
last:18568480 current:18568480 diff:0
last:18568480 current:18568480 diff:0
last:18568480 current:18568480 diff:0
last:18568480 current:18568480 diff:0
last:18568480 current:18568480 diff:0
last:18568480 current:18568480 diff:0
last:18568480 current:18568480 diff:0
last:18568480 current:18568480 diff:0
last:18568480 current:18568480 diff:0
last:18568480 current:18568480 diff:0
# ./c6 enable
# ./wtftest.sh 1
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
last:0 current:0 diff:0
#

cyring commented 4 years ago

When inserting the CoreFreq kernel module Control Registers are stored into a Save Area
When removing the module Registers are restored from this area

Do you confirm no other Zen software is changing registers on behalf of CoreFreq ?
Can you provide me an access to debug online ?

cyring commented 3 years ago

Are we facing those Errata !

2020-11-10-131100_747x413_scrot

2020-11-10-130845_743x510_scrot

Can you disable CC6 and check again ? 2020-11-10-131312_644x550_scrot

h1z1 commented 3 years ago

Holyshit that is some serious erata. No fix from them?? Yikes. Weird thing is it doesn't appear load related at all given it surfaces on cpu's that are IDLE. If it is overflowing doesn't that throw turbo completely out the window too? What a clusterfuck.

cyring commented 3 years ago

Except if disabling CC6 makes things better (see previous comment), feel free to close the issue. Cyril

cyring commented 3 years ago

AMD Errata

cyring commented 2 years ago

News

Disabling Global C-States in this EPYC BIOS is fixing the bogus Instructions counter.
The trade-off is a temperature increase while idling.

2021-11-22-233630_644x550_scrot

cyring / CoreFreq

Bogus instruction count values reported #206