klauspost / cpuid

CPU feature identification for Go
MIT License

Process hangs for an hour on init #103

Closed zacaway closed 2 years ago

zacaway commented 2 years ago

I found this because it affects minio, which uses cpuid, but it is easily reproducible with just the example program. Running the example program on my machine hangs for about an hour, after which the program continues normally. The cause seems to be this for-loop, which iterates about 4 billion times on my machine.

Here is my CPU spec (from /proc/cpuinfo) if it helps:

processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 25
model       : 80
model name  : AMD Ryzen 7 5700G with Radeon Graphics
stepping    : 0
microcode   : 0xa50000c
cpu MHz     : 3792.874
cache size  : 512 KB
physical id : 0
siblings    : 4
core id     : 0
cpu cores   : 4
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cr8_legacy abm sse4a misalignsse 3dnowprefetch bpext ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr arat umip pku ospke vaes vpclmulqdq rdpid
bugs        : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips    : 7585.74
TLB size    : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : AuthenticAMD
cpu family  : 25
model       : 80
model name  : AMD Ryzen 7 5700G with Radeon Graphics
stepping    : 0
microcode   : 0xa50000c
cpu MHz     : 3792.874
cache size  : 512 KB
physical id : 0
siblings    : 4
core id     : 2
cpu cores   : 4
apicid      : 2
initial apicid  : 2
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cr8_legacy abm sse4a misalignsse 3dnowprefetch bpext ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr arat umip pku ospke vaes vpclmulqdq rdpid
bugs        : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips    : 7608.95
TLB size    : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management:

processor   : 2
vendor_id   : AuthenticAMD
cpu family  : 25
model       : 80
model name  : AMD Ryzen 7 5700G with Radeon Graphics
stepping    : 0
microcode   : 0xa50000c
cpu MHz     : 3792.874
cache size  : 512 KB
physical id : 0
siblings    : 4
core id     : 4
cpu cores   : 4
apicid      : 4
initial apicid  : 4
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cr8_legacy abm sse4a misalignsse 3dnowprefetch bpext ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr arat umip pku ospke vaes vpclmulqdq rdpid
bugs        : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips    : 7618.22
TLB size    : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management:

processor   : 3
vendor_id   : AuthenticAMD
cpu family  : 25
model       : 80
model name  : AMD Ryzen 7 5700G with Radeon Graphics
stepping    : 0
microcode   : 0xa50000c
cpu MHz     : 3792.874
cache size  : 512 KB
physical id : 0
siblings    : 4
core id     : 6
cpu cores   : 4
apicid      : 6
initial apicid  : 6
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cr8_legacy abm sse4a misalignsse 3dnowprefetch bpext ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr arat umip pku ospke vaes vpclmulqdq rdpid
bugs        : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips    : 7608.25
TLB size    : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management:

klauspost commented 2 years ago

@zacaway Are you running in a VM?

zacaway commented 2 years ago

Yes, I am running under Xen.

klauspost commented 2 years ago

It must have a buggy implementation of its cpuid emulation, then.

Is it possible for you to insert this:

@@ -848,6 +858,11 @@ func (c *CPUInfo) cacheSize() {
            if typ == 0 {
                return
            }
+           fmt.Printf("EAX: 0x%08x, EBX: 0x%08x, ECX: 0x%08x; ", eax, ebx, ecx)
+           fmt.Println("type", typ, "size:", size)
+           if i > 100 {
+               return
+           }

            switch level {
            case 1:

I need a secondary break condition.

The docs specify that when the low 4 bits of EAX are 0, there are no more caches to enumerate (page 33).
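For illustration, a secondary break condition of the kind described above could look roughly like the following. This is only a sketch, not the code that ended up in the library: `queryFunc`, `enumerate`, and `maxRepeats` are hypothetical names, and the CPUID call is replaced by a plain function so the loop can be run without special hardware. It stops on the documented condition (type field is 0) and additionally gives up when the same registers come back many times in a row.

```go
package main

import "fmt"

// queryFunc stands in for a CPUID call on the cache-topology leaf.
// It is a hypothetical hook, not part of the cpuid package.
type queryFunc func(subleaf uint32) (eax, ebx, ecx uint32)

// maxRepeats is an arbitrary cap chosen for this illustration.
const maxRepeats = 100

// enumerate walks subleaves until the type field (low 4 bits of EAX) is 0,
// or until identical registers have been returned maxRepeats times in a row.
// It returns the number of entries accepted before stopping.
func enumerate(query queryFunc) int {
	var prev [3]uint32
	repeats, accepted := 0, 0
	for i := uint32(0); ; i++ {
		eax, ebx, ecx := query(i)
		if eax&15 == 0 {
			return accepted // primary condition: no more caches
		}
		cur := [3]uint32{eax, ebx, ecx}
		if i > 0 && cur == prev {
			repeats++
			if repeats >= maxRepeats {
				return accepted // secondary condition: emulation looks stuck
			}
		} else {
			repeats = 0
		}
		prev = cur
		accepted++
	}
}

func main() {
	// Emulates an environment that returns the same entry for every subleaf.
	stuck := func(uint32) (uint32, uint32, uint32) {
		return 0x00004121, 0x01c0003f, 0x0000003f
	}
	fmt.Println("entries accepted before giving up:", enumerate(stuck))
}
```

Comparing against the previous result rather than only capping `i` keeps the loop correct on machines that legitimately report many distinct entries, while still bounding the work when the answers never change.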

klauspost commented 2 years ago

(I will need the output to look at a fix)

zacaway commented 2 years ago

Hi, here is the output (from running cmd/cpuid/main.go with that patch):

EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
EAX: 0x00004121, EBX: 0x01c0003f, ECX: 0x0000003f; type 1 size: 32768
Name: AMD Ryzen 7 5700G with Radeon Graphics
Vendor String: AuthenticAMD
Vendor ID: AMD
PhysicalCores: 16
Threads Per Core: 2
Logical Cores: 32
CPU Family 25 Model: 80
Features: ADX,AESNI,AVX,AVX2,BMI1,BMI2,CLMUL,CLZERO,CMOV,CMPXCHG8,CX16,ERMS,F16C,FMA3,FXSR,FXSROPT,HTT,HYPERVISOR,LAHF,LZCNT,MMX,MMXEXT,MOVBE,MSR_PAGEFLUSH,NX,OSXSAVE,POPCNT,RDRAND,RDSEED,RDTSCP,SCE,SEV,SEV_64BIT,SEV_ALTERNATIVE,SEV_DEBUGSWAP,SEV_ES,SEV_RESTRICTED,SHA,SME,SSE,SSE2,SSE3,SSE4,SSE42,SSE4A,SSSE3,VTE,X87,XGETBV1,XSAVE,XSAVEC,XSAVEOPT,XSAVES
Microarchitecture level: 3
Cacheline bytes: 64
L1 Instruction Cache: 32768 bytes
L1 Data Cache: 32768 bytes
L2 Cache: 524288 bytes
L3 Cache: -1 bytes

Thanks

klauspost commented 2 years ago

OK, this is buggy, unless you have an endless level 1 data cache :)

It seems to ignore the ECX value passed in, which should select a different cache level on each call.
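To check by hand that every line of the output describes the same level 1 data cache, the reported registers can be decoded. The sketch below assumes the usual field layout for this kind of cache-topology leaf (type in EAX bits 3:0, level in EAX bits 7:5, line size, partitions and ways in EBX, sets in ECX, each stored minus one); `cacheEntry` and `decode` are hypothetical names for illustration, not the library's API.

```go
package main

import "fmt"

// cacheEntry holds the fields decoded from one cache-topology subleaf.
type cacheEntry struct {
	Type, Level                       uint32
	LineSize, Partitions, Ways, Sets int
}

// decode splits the raw registers into fields, assuming the layout
// described in the text above (an assumption of this sketch).
func decode(eax, ebx, ecx uint32) cacheEntry {
	return cacheEntry{
		Type:       eax & 15,       // 0 means "no more caches"
		Level:      (eax >> 5) & 7, // cache level
		LineSize:   int(ebx&0xfff) + 1,
		Partitions: int((ebx>>12)&0x3ff) + 1,
		Ways:       int((ebx>>22)&0x3ff) + 1,
		Sets:       int(ecx) + 1,
	}
}

// Size returns the cache size in bytes.
func (c cacheEntry) Size() int {
	return c.LineSize * c.Partitions * c.Ways * c.Sets
}

func main() {
	// Register values taken from the output posted above.
	e := decode(0x00004121, 0x01c0003f, 0x0000003f)
	fmt.Printf("type %d, level %d, %d-byte lines, %d ways, %d sets, size %d bytes\n",
		e.Type, e.Level, e.LineSize, e.Ways, e.Sets, e.Size())
}
```

With these inputs the decoded size is 64 * 1 * 8 * 64 = 32768 bytes, which agrees with the `size: 32768` printed on every line of the posted output.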

zacaway commented 2 years ago

Great, thanks. Yes, I don't think I have an infinite cache unfortunately :)

klauspost commented 2 years ago

Thanks for the help.

You can try the latest cpuid release: https://github.com/klauspost/cpuid/releases/tag/v2.0.14. It should catch your case.

I will upstream if you can confirm it works.

zacaway commented 2 years ago

I've tested v2.0.14 and can confirm it is working. Many thanks!