Closed by tmds 4 years ago
Tagging subscribers to this area: @tannergooding, @pgovind. See info in area-owners.md if you want to be subscribed.
It looks like the test failures we see in Json test suites (https://github.com/dotnet/runtime/issues/41582) are also related to vectorization.
@tannergooding I think this may have regressed in https://github.com/dotnet/runtime/pull/40167.
Cc @jeffhandley
What's the easiest way to try and repro the failure for RedHat? I'm a bit surprised we're seeing a failure for RedHat and not any other Unix distro. Is there something unique about what the build is doing or the hardware the machine runs on?
I guess the capabilities of the CPU aren't properly understood. If you look at the tests that fail, they may give you an idea of which capabilities they use.
There is nothing special about our CI setup. It uses VMs that run the build and the tests.
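To line up what a machine reports against the capabilities the failing tests rely on, a small script can pull the vector-related flags out of `cpuid` output. The following is only a rough sketch in Python. It assumes the tool prints one `label = true/false` field per line, and the function names and the list of flags in `WANTED` are hypothetical choices for illustration, not something taken from the runtime or its tests.

```python
# Sketch: summarize vector-related feature flags from cpuid-style text.
# Assumption: each field is on its own line as "label = true" or "label = false".

# Hypothetical list of flags worth checking; adjust to what the failing tests use.
WANTED = ("SSE2", "SSE4.1", "SSE4.2", "AVX", "AVX2", "FMA", "BMI1", "BMI2", "POPCNT", "LZCNT")


def parse_feature_flags(text):
    """Map the first token of each boolean field (trailing ':' removed) to True/False.

    Only the first occurrence of a token is kept, so with a multi-CPU dump
    the result describes the first CPU that reports that field.
    """
    flags = {}
    for line in text.splitlines():
        label, sep, value = line.rpartition("=")
        value = value.strip()
        if not sep or value not in ("true", "false"):
            continue  # not a boolean field (e.g. hex values, strings, headers)
        tokens = label.split()
        if not tokens:
            continue
        flags.setdefault(tokens[0].rstrip(":"), value == "true")
    return flags


def report(text, wanted=WANTED):
    """Return {flag: True/False/None}; None means the flag was not found in the text."""
    flags = parse_feature_flags(text)
    return {name: flags.get(name) for name in wanted}


if __name__ == "__main__":
    sample = """\
      SSE2 extensions                          = true
      AVX: advanced vector extensions          = true
      AVX2: advanced vector extensions 2       = true
      AVX512F: AVX-512 foundation instructions = false
      XCR0 supported: AVX state                = true
    """
    for name, value in report(sample, ("SSE2", "AVX", "AVX2", "AVX512F", "FMA")).items():
        print(f"{name:8} {value}")
```

Running the same summary on a passing machine and on the failing VM and diffing the two results is one way to narrow down which capability differs. Keying on the first token means fields such as `XCR0 supported: AVX state` all collapse under `XCR0`, so OS-level state support would need a separate check.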
This is the output of cpuid in the VM:
CPU 0: vendor_id = "GenuineIntel" version information (1/eax): processor type = primary processor (0) family = 0x6 (6) model = 0xd (13) stepping id = 0x2 (2) extended family = 0x0 (0) extended model = 0x3 (3) (family synth) = 0x6 (6) (model synth) = 0x3d (61) (simple synth) = Intel Core (unknown type) (Broadwell-U/Y C0) {Haswell}, 14nm miscellaneous (1/ebx): process local APIC physical ID = 0x0 (0) maximum IDs for CPUs in pkg = 0x0 (0) CLFLUSH line size = 0x8 (8) brand index = 0x0 (0) brand id = 0x00 (0): unknown feature information (1/edx): x87 FPU on chip = true VME: virtual-8086 mode enhancement = true DE: debugging extensions = true PSE: page size extensions = true TSC: time stamp counter = true RDMSR and WRMSR support = true PAE: physical address extensions = true MCE: machine check exception = true CMPXCHG8B inst. = true APIC on chip = true SYSENTER and SYSEXIT = true MTRR: memory type range registers = true PTE global bit = true MCA: machine check architecture = true CMOV: conditional move/compare instr = true PAT: page attribute table = true PSE-36: page size extension = true PSN: processor serial number = false CLFLUSH instruction = true DS: debug store = false ACPI: thermal monitor and clock ctrl = false MMX Technology = true FXSAVE/FXRSTOR = true SSE extensions = true SSE2 extensions = true SS: self snoop = true hyper-threading / multi-core supported = false TM: therm. 
monitor = false IA64 = false PBE: pending break event = false feature information (1/ecx): PNI/SSE3: Prescott New Instructions = true PCLMULDQ instruction = true DTES64: 64-bit debug store = false MONITOR/MWAIT = false CPL-qualified debug store = false VMX: virtual machine extensions = false SMX: safer mode extensions = false Enhanced Intel SpeedStep Technology = false TM2: thermal monitor 2 = false SSSE3 extensions = true context ID: adaptive or shared L1 data = false SDBG: IA32_DEBUG_INTERFACE = false FMA instruction = true CMPXCHG16B instruction = true xTPR disable = false PDCM: perfmon and debug = false PCID: process context identifiers = true DCA: direct cache access = false SSE4.1 extensions = true SSE4.2 extensions = true x2APIC: extended xAPIC support = true MOVBE instruction = true POPCNT instruction = true time stamp counter deadline = true AES instruction = true XSAVE/XSTOR states = true OS-enabled XSAVE/XSTOR = true AVX: advanced vector extensions = true F16C half-precision convert instruction = true RDRAND instruction = true hypervisor guest status = true cache and TLB information (2): 0x7d: L2 cache: 2M, 8-way, 64 byte lines 0x30: L1 cache: 32K, 8-way, 64 byte lines 0x2c: L1 data cache: 32K, 8-way, 64 byte lines processor serial number = 0003-06D2-0000-0000-0000-0000 deterministic cache parameters (4): --- cache 0 --- cache type = data cache (1) cache level = 0x1 (1) self-initializing cache level = true fully associative cache = false maximum IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x8 (8) number of sets = 0x40 (64) WBINVD/INVD acts on lower caches = true inclusive to lower caches = false complex cache indexing = false number of sets (s) = 64 (size synth) = 32768 (32 KB) --- cache 1 --- cache type = instruction cache (2) cache level = 0x1 (1) self-initializing cache level = true fully associative cache = false maximum 
IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x8 (8) number of sets = 0x40 (64) WBINVD/INVD acts on lower caches = true inclusive to lower caches = false complex cache indexing = false number of sets (s) = 64 (size synth) = 32768 (32 KB) --- cache 2 --- cache type = unified cache (3) cache level = 0x2 (2) self-initializing cache level = true fully associative cache = false maximum IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x10 (16) number of sets = 0x1000 (4096) WBINVD/INVD acts on lower caches = true inclusive to lower caches = false complex cache indexing = false number of sets (s) = 4096 (size synth) = 4194304 (4 MB) MONITOR/MWAIT (5): smallest monitor-line size (bytes) = 0x0 (0) largest monitor-line size (bytes) = 0x0 (0) enum of Monitor-MWAIT exts supported = true supports intrs as break-event for MWAIT = true number of C0 sub C-states using MWAIT = 0x0 (0) number of C1 sub C-states using MWAIT = 0x0 (0) number of C2 sub C-states using MWAIT = 0x0 (0) number of C3 sub C-states using MWAIT = 0x0 (0) number of C4 sub C-states using MWAIT = 0x0 (0) number of C5 sub C-states using MWAIT = 0x0 (0) number of C6 sub C-states using MWAIT = 0x0 (0) number of C7 sub C-states using MWAIT = 0x0 (0) Thermal and Power Management Features (6): digital thermometer = false Intel Turbo Boost Technology = false ARAT always running APIC timer = false PLN power limit notification = false ECMD extended clock modulation duty = false PTM package thermal management = false HWP base registers = false HWP notification = false HWP activity window = false HWP energy performance preference = false HWP package level request = false HDC base registers = false Intel Turbo Boost Max Technology 3.0 = false HWP capabilities = false HWP 
PECI override = false flexible HWP = false IA32_HWP_REQUEST MSR fast access mode = false HW_FEEDBACK = false ignoring idle logical processor HWP req = false digital thermometer thresholds = 0x0 (0) hardware coordination feedback = false ACNT2 available = false performance-energy bias capability = false performance capability reporting = false energy efficiency capability reporting = false size of feedback struct (4KB pages) = 0x0 (0) index of CPU's row in feedback struct = 0x0 (0) extended feature flags (7): FSGSBASE instructions = true IA32_TSC_ADJUST MSR supported = false SGX: Software Guard Extensions supported = false BMI1 instructions = true HLE hardware lock elision = true AVX2: advanced vector extensions 2 = true FDP_EXCPTN_ONLY = false SMEP supervisor mode exec protection = true BMI2 instructions = true enhanced REP MOVSB/STOSB = true INVPCID instruction = true RTM: restricted transactional memory = true RDT-CMT/PQoS cache monitoring = false deprecated FPU CS/DS = false MPX: intel memory protection extensions = false RDT-CAT/PQE cache allocation = false AVX512F: AVX-512 foundation instructions = false AVX512DQ: double & quadword instructions = false RDSEED instruction = true ADX instructions = true SMAP: supervisor mode access prevention = true AVX512IFMA: fused multiply add = false PCOMMIT instruction = false CLFLUSHOPT instruction = false CLWB instruction = false Intel processor trace = false AVX512PF: prefetch instructions = false AVX512ER: exponent & reciprocal instrs = false AVX512CD: conflict detection instrs = false SHA instructions = false AVX512BW: byte & word instructions = false AVX512VL: vector length = false PREFETCHWT1 = false AVX512VBMI: vector byte manipulation = false UMIP: user-mode instruction prevention = false PKU protection keys for user-mode = false OSPKE CR4.PKE and RDPKRU/WRPKRU = false WAITPKG instructions = false AVX512_VBMI2: byte VPCOMPRESS, VPEXPAND = false CET_SS: CET shadow stack = false GFNI: Galois Field New Instructions = 
false VAES instructions = false VPCLMULQDQ instruction = false AVX512_VNNI: neural network instructions = false AVX512_BITALG: bit count/shiffle = false TME: Total Memory Encryption = false AVX512: VPOPCNTDQ instruction = false 5-level paging = false BNDLDX/BNDSTX MAWAU value in 64-bit mode = 0x0 (0) RDPID: read processor D supported = false CLDEMOTE supports cache line demote = false MOVDIRI instruction = false MOVDIR64B instruction = false ENQCMD instruction = false SGX_LC: SGX launch config supported = false AVX512_4VNNIW: neural network instrs = false AVX512_4FMAPS: multiply acc single prec = false fast short REP MOV = false AVX512_VP2INTERSECT: intersect mask regs = false VERW md-clear microcode support = false SERIALIZE = false hybrid part = false TSXLDTRK: TSX suspend load addr tracking = false PCONFIG instruction = false CET_IBT: CET indirect branch tracking = false IBRS/IBPB: indirect branch restrictions = false STIBP: 1 thr indirect branch predictor = false L1D_FLUSH: IA32_FLUSH_CMD MSR = false IA32_ARCH_CAPABILITIES MSR = false IA32_CORE_CAPABILITIES MSR = false SSBD: speculative store bypass disable = false Direct Cache Access Parameters (9): PLATFORM_DCA_CAP MSR bits = 0 Architecture Performance Monitoring Features (0xa/eax): version ID = 0x0 (0) number of counters per logical processor = 0x0 (0) bit width of counter = 0x0 (0) length of EBX bit vector = 0x0 (0) Architecture Performance Monitoring Features (0xa/ebx): core cycle event not available = false instruction retired event not available = false reference cycles event not available = false last-level cache ref event not available = false last-level cache miss event not avail = false branch inst retired event not available = false branch mispred retired event not avail = false Architecture Performance Monitoring Features (0xa/edx): number of fixed counters = 0x0 (0) bit width of fixed counters = 0x0 (0) anythread deprecation = false XSAVE features (0xd/0): XCR0 lower 32 bits valid bit field mask = 
0x00000007 XCR0 upper 32 bits valid bit field mask = 0x00000000 XCR0 supported: x87 state = true XCR0 supported: SSE state = true XCR0 supported: AVX state = true XCR0 supported: MPX BNDREGS = false XCR0 supported: MPX BNDCSR = false XCR0 supported: AVX-512 opmask = false XCR0 supported: AVX-512 ZMM_Hi256 = false XCR0 supported: AVX-512 Hi16_ZMM = false IA32_XSS supported: PT state = false XCR0 supported: PKRU state = false XCR0 supported: CET_U state = false XCR0 supported: CET_S state = false IA32_XSS supported: HDC state = false bytes required by fields in XCR0 = 0x00000340 (832) bytes required by XSAVE/XRSTOR area = 0x00000340 (832) XSAVE features (0xd/1): XSAVEOPT instruction = true XSAVEC instruction = false XGETBV instruction = false XSAVES/XRSTORS instructions = false SAVE area size in bytes = 0x00000000 (0) IA32_XSS lower 32 bits valid bit field mask = 0x00000000 IA32_XSS upper 32 bits valid bit field mask = 0x00000000 AVX/YMM features (0xd/2): AVX/YMM save state byte size = 0x00000100 (256) AVX/YMM save state byte offset = 0x00000240 (576) supported in IA32_XSS or XCR0 = XCR0 (user state) 64-byte alignment in compacted XSAVE = false hypervisor_id = "KVMKVMKVM " hypervisor features (0x40000001/eax): kvmclock available at MSR 0x11 = true delays unnecessary for PIO ops = true mmu_op = false kvmclock available at MSR 0x4b564d00 = true async pf enable available by MSR = true steal clock supported = true guest EOI optimization enabled = true guest spinlock optimization enabled = true guest TLB flush optimization enabled = false async PF VM exit enable available by MSR = false guest send IPI optimization enabled = false host HLT poll disable at MSR 0x4b564d05 = false guest sched yield optimization enabled = false stable: no guest per-cpu warps expected = true hypervisor features (0x40000001/edx): realtime hint: no unbound preemption = true extended feature flags (0x80000001/edx): SYSCALL and SYSRET instructions = true execution disable = true 1-GB large page 
support = true RDTSCP = true 64-bit extensions technology available = true Intel feature flags (0x80000001/ecx): LAHF/SAHF supported in 64-bit mode = true LZCNT advanced bit manipulation = true 3DNow! PREFETCH/PREFETCHW instructions = true brand = "Intel Core Processor (Broadwell)" L1 TLB/cache information: 2M/4M pages & L1 TLB (0x80000005/eax): instruction # entries = 0xff (255) instruction associativity = 0x1 (1) data # entries = 0xff (255) data associativity = 0x1 (1) L1 TLB/cache information: 4K pages & L1 TLB (0x80000005/ebx): instruction # entries = 0xff (255) instruction associativity = 0x1 (1) data # entries = 0xff (255) data associativity = 0x1 (1) L1 data cache information (0x80000005/ecx): line size (bytes) = 0x40 (64) lines per tag = 0x1 (1) associativity = 0x2 (2) size (KB) = 0x40 (64) L1 instruction cache information (0x80000005/edx): line size (bytes) = 0x40 (64) lines per tag = 0x1 (1) associativity = 0x2 (2) size (KB) = 0x40 (64) L2 TLB/cache information: 2M/4M pages & L2 TLB (0x80000006/eax): instruction # entries = 0x0 (0) instruction associativity = L2 off (0) data # entries = 0x0 (0) data associativity = L2 off (0) L2 TLB/cache information: 4K pages & L2 TLB (0x80000006/ebx): instruction # entries = 0x200 (512) instruction associativity = 4-way (4) data # entries = 0x200 (512) data associativity = 4-way (4) L2 unified cache information (0x80000006/ecx): line size (bytes) = 0x40 (64) lines per tag = 0x1 (1) associativity = 16-way (8) size (KB) = 0x200 (512) L3 cache information (0x80000006/edx): line size (bytes) = 0x0 (0) lines per tag = 0x0 (0) associativity = L2 off (0) size (in 512KB units) = 0x0 (0) RAS Capability (0x80000007/ebx): MCA overflow recovery support = false SUCCOR support = false HWA: hardware assert support = false scalable MCA support = false Advanced Power Management Features (0x80000007/ecx): CmpUnitPwrSampleTimeRatio = 0x0 (0) Advanced Power Management Features (0x80000007/edx): TS: temperature sensing diode = false FID: 
frequency ID control = false VID: voltage ID control = false TTP: thermal trip = false TM: thermal monitor = false STC: software thermal control = false 100 MHz multiplier control = false hardware P-State control = false TscInvariant = false CPB: core performance boost = false read-only effective frequency interface = false processor feedback interface = false APM power reporting = false connected standby = false RAPL: running average power limit = false Physical Address and Linear Address Size (0x80000008/eax): maximum physical address bits = 0x2e (46) maximum linear (virtual) address bits = 0x30 (48) maximum guest physical address bits = 0x0 (0) Extended Feature Extensions ID (0x80000008/ebx): CLZERO instruction = false instructions retired count support = false always save/restore error pointers = false RDPRU instruction = false memory bandwidth enforcement = false WBNOINVD instruction = false IBPB: indirect branch prediction barrier = false IBRS: indirect branch restr speculation = false STIBP: 1 thr indirect branch predictor = false STIBP always on preferred mode = false ppin processor id number supported = false SSBD: speculative store bypass disable = false virtualized SSBD = false SSBD fixed in hardware = false Size Identifiers (0x80000008/ecx): number of CPU cores = 0x1 (1) ApicIdCoreIdSize = 0x0 (0) performance time-stamp counter size = 0x0 (0) Feature Extended Size (0x80000008/edx): RDPRU instruction max input support = 0x0 (0) SVM Secure Virtual Machine (0x8000000a/eax): SvmRev: SVM revision = 0x0 (0) SVM Secure Virtual Machine (0x8000000a/edx): nested paging = false LBR virtualization = false SVM lock = false NRIP save = false MSR based TSC rate control = false VMCB clean bits support = false flush by ASID = false decode assists = false SSSE3/SSE5 opcode set disable = false pause intercept filter = false pause filter threshold = false AVIC: AMD virtual interrupt controller = false virtualized VMLOAD/VMSAVE = false virtualized global interrupt flag 
(GIF) = false GMET: guest mode execute trap = false guest Spec_ctl support = false NASID: number of address space identifiers = 0x0 (0): (multi-processing synth) = none (multi-processing method) = Intel leaf 1/4 (APIC widths synth): CORE_width=0 SMT_width=0 (APIC synth): PKG_ID=0 CORE_ID=0 SMT_ID=0 (uarch synth) = Intel Broadwell {Haswell}, 14nm (synth) = Intel Core (unknown type) (Broadwell-U/Y C0) {Haswell}, 14nm CPU 1: vendor_id = "GenuineIntel" version information (1/eax): processor type = primary processor (0) family = 0x6 (6) model = 0xd (13) stepping id = 0x2 (2) extended family = 0x0 (0) extended model = 0x3 (3) (family synth) = 0x6 (6) (model synth) = 0x3d (61) (simple synth) = Intel Core (unknown type) (Broadwell-U/Y C0) {Haswell}, 14nm miscellaneous (1/ebx): process local APIC physical ID = 0x1 (1) maximum IDs for CPUs in pkg = 0x0 (0) CLFLUSH line size = 0x8 (8) brand index = 0x0 (0) brand id = 0x00 (0): unknown feature information (1/edx): x87 FPU on chip = true VME: virtual-8086 mode enhancement = true DE: debugging extensions = true PSE: page size extensions = true TSC: time stamp counter = true RDMSR and WRMSR support = true PAE: physical address extensions = true MCE: machine check exception = true CMPXCHG8B inst. = true APIC on chip = true SYSENTER and SYSEXIT = true MTRR: memory type range registers = true PTE global bit = true MCA: machine check architecture = true CMOV: conditional move/compare instr = true PAT: page attribute table = true PSE-36: page size extension = true PSN: processor serial number = false CLFLUSH instruction = true DS: debug store = false ACPI: thermal monitor and clock ctrl = false MMX Technology = true FXSAVE/FXRSTOR = true SSE extensions = true SSE2 extensions = true SS: self snoop = true hyper-threading / multi-core supported = false TM: therm. 
monitor = false IA64 = false PBE: pending break event = false feature information (1/ecx): PNI/SSE3: Prescott New Instructions = true PCLMULDQ instruction = true DTES64: 64-bit debug store = false MONITOR/MWAIT = false CPL-qualified debug store = false VMX: virtual machine extensions = false SMX: safer mode extensions = false Enhanced Intel SpeedStep Technology = false TM2: thermal monitor 2 = false SSSE3 extensions = true context ID: adaptive or shared L1 data = false SDBG: IA32_DEBUG_INTERFACE = false FMA instruction = true CMPXCHG16B instruction = true xTPR disable = false PDCM: perfmon and debug = false PCID: process context identifiers = true DCA: direct cache access = false SSE4.1 extensions = true SSE4.2 extensions = true x2APIC: extended xAPIC support = true MOVBE instruction = true POPCNT instruction = true time stamp counter deadline = true AES instruction = true XSAVE/XSTOR states = true OS-enabled XSAVE/XSTOR = true AVX: advanced vector extensions = true F16C half-precision convert instruction = true RDRAND instruction = true hypervisor guest status = true cache and TLB information (2): 0x7d: L2 cache: 2M, 8-way, 64 byte lines 0x30: L1 cache: 32K, 8-way, 64 byte lines 0x2c: L1 data cache: 32K, 8-way, 64 byte lines processor serial number = 0003-06D2-0000-0000-0000-0000 deterministic cache parameters (4): --- cache 0 --- cache type = data cache (1) cache level = 0x1 (1) self-initializing cache level = true fully associative cache = false maximum IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x8 (8) number of sets = 0x40 (64) WBINVD/INVD acts on lower caches = true inclusive to lower caches = false complex cache indexing = false number of sets (s) = 64 (size synth) = 32768 (32 KB) --- cache 1 --- cache type = instruction cache (2) cache level = 0x1 (1) self-initializing cache level = true fully associative cache = false maximum 
IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x8 (8) number of sets = 0x40 (64) WBINVD/INVD acts on lower caches = true inclusive to lower caches = false complex cache indexing = false number of sets (s) = 64 (size synth) = 32768 (32 KB) --- cache 2 --- cache type = unified cache (3) cache level = 0x2 (2) self-initializing cache level = true fully associative cache = false maximum IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x10 (16) number of sets = 0x1000 (4096) WBINVD/INVD acts on lower caches = true inclusive to lower caches = false complex cache indexing = false number of sets (s) = 4096 (size synth) = 4194304 (4 MB) MONITOR/MWAIT (5): smallest monitor-line size (bytes) = 0x0 (0) largest monitor-line size (bytes) = 0x0 (0) enum of Monitor-MWAIT exts supported = true supports intrs as break-event for MWAIT = true number of C0 sub C-states using MWAIT = 0x0 (0) number of C1 sub C-states using MWAIT = 0x0 (0) number of C2 sub C-states using MWAIT = 0x0 (0) number of C3 sub C-states using MWAIT = 0x0 (0) number of C4 sub C-states using MWAIT = 0x0 (0) number of C5 sub C-states using MWAIT = 0x0 (0) number of C6 sub C-states using MWAIT = 0x0 (0) number of C7 sub C-states using MWAIT = 0x0 (0) Thermal and Power Management Features (6): digital thermometer = false Intel Turbo Boost Technology = false ARAT always running APIC timer = false PLN power limit notification = false ECMD extended clock modulation duty = false PTM package thermal management = false HWP base registers = false HWP notification = false HWP activity window = false HWP energy performance preference = false HWP package level request = false HDC base registers = false Intel Turbo Boost Max Technology 3.0 = false HWP capabilities = false HWP 
PECI override = false flexible HWP = false IA32_HWP_REQUEST MSR fast access mode = false HW_FEEDBACK = false ignoring idle logical processor HWP req = false digital thermometer thresholds = 0x0 (0) hardware coordination feedback = false ACNT2 available = false performance-energy bias capability = false performance capability reporting = false energy efficiency capability reporting = false size of feedback struct (4KB pages) = 0x0 (0) index of CPU's row in feedback struct = 0x0 (0) extended feature flags (7): FSGSBASE instructions = true IA32_TSC_ADJUST MSR supported = false SGX: Software Guard Extensions supported = false BMI1 instructions = true HLE hardware lock elision = true AVX2: advanced vector extensions 2 = true FDP_EXCPTN_ONLY = false SMEP supervisor mode exec protection = true BMI2 instructions = true enhanced REP MOVSB/STOSB = true INVPCID instruction = true RTM: restricted transactional memory = true RDT-CMT/PQoS cache monitoring = false deprecated FPU CS/DS = false MPX: intel memory protection extensions = false RDT-CAT/PQE cache allocation = false AVX512F: AVX-512 foundation instructions = false AVX512DQ: double & quadword instructions = false RDSEED instruction = true ADX instructions = true SMAP: supervisor mode access prevention = true AVX512IFMA: fused multiply add = false PCOMMIT instruction = false CLFLUSHOPT instruction = false CLWB instruction = false Intel processor trace = false AVX512PF: prefetch instructions = false AVX512ER: exponent & reciprocal instrs = false AVX512CD: conflict detection instrs = false SHA instructions = false AVX512BW: byte & word instructions = false AVX512VL: vector length = false PREFETCHWT1 = false AVX512VBMI: vector byte manipulation = false UMIP: user-mode instruction prevention = false PKU protection keys for user-mode = false OSPKE CR4.PKE and RDPKRU/WRPKRU = false WAITPKG instructions = false AVX512_VBMI2: byte VPCOMPRESS, VPEXPAND = false CET_SS: CET shadow stack = false GFNI: Galois Field New Instructions = 
false VAES instructions = false VPCLMULQDQ instruction = false AVX512_VNNI: neural network instructions = false AVX512_BITALG: bit count/shiffle = false TME: Total Memory Encryption = false AVX512: VPOPCNTDQ instruction = false 5-level paging = false BNDLDX/BNDSTX MAWAU value in 64-bit mode = 0x0 (0) RDPID: read processor D supported = false CLDEMOTE supports cache line demote = false MOVDIRI instruction = false MOVDIR64B instruction = false ENQCMD instruction = false SGX_LC: SGX launch config supported = false AVX512_4VNNIW: neural network instrs = false AVX512_4FMAPS: multiply acc single prec = false fast short REP MOV = false AVX512_VP2INTERSECT: intersect mask regs = false VERW md-clear microcode support = false SERIALIZE = false hybrid part = false TSXLDTRK: TSX suspend load addr tracking = false PCONFIG instruction = false CET_IBT: CET indirect branch tracking = false IBRS/IBPB: indirect branch restrictions = false STIBP: 1 thr indirect branch predictor = false L1D_FLUSH: IA32_FLUSH_CMD MSR = false IA32_ARCH_CAPABILITIES MSR = false IA32_CORE_CAPABILITIES MSR = false SSBD: speculative store bypass disable = false Direct Cache Access Parameters (9): PLATFORM_DCA_CAP MSR bits = 0 Architecture Performance Monitoring Features (0xa/eax): version ID = 0x0 (0) number of counters per logical processor = 0x0 (0) bit width of counter = 0x0 (0) length of EBX bit vector = 0x0 (0) Architecture Performance Monitoring Features (0xa/ebx): core cycle event not available = false instruction retired event not available = false reference cycles event not available = false last-level cache ref event not available = false last-level cache miss event not avail = false branch inst retired event not available = false branch mispred retired event not avail = false Architecture Performance Monitoring Features (0xa/edx): number of fixed counters = 0x0 (0) bit width of fixed counters = 0x0 (0) anythread deprecation = false XSAVE features (0xd/0): XCR0 lower 32 bits valid bit field mask = 
0x00000007 XCR0 upper 32 bits valid bit field mask = 0x00000000 XCR0 supported: x87 state = true XCR0 supported: SSE state = true XCR0 supported: AVX state = true XCR0 supported: MPX BNDREGS = false XCR0 supported: MPX BNDCSR = false XCR0 supported: AVX-512 opmask = false XCR0 supported: AVX-512 ZMM_Hi256 = false XCR0 supported: AVX-512 Hi16_ZMM = false IA32_XSS supported: PT state = false XCR0 supported: PKRU state = false XCR0 supported: CET_U state = false XCR0 supported: CET_S state = false IA32_XSS supported: HDC state = false bytes required by fields in XCR0 = 0x00000340 (832) bytes required by XSAVE/XRSTOR area = 0x00000340 (832) XSAVE features (0xd/1): XSAVEOPT instruction = true XSAVEC instruction = false XGETBV instruction = false XSAVES/XRSTORS instructions = false SAVE area size in bytes = 0x00000000 (0) IA32_XSS lower 32 bits valid bit field mask = 0x00000000 IA32_XSS upper 32 bits valid bit field mask = 0x00000000 AVX/YMM features (0xd/2): AVX/YMM save state byte size = 0x00000100 (256) AVX/YMM save state byte offset = 0x00000240 (576) supported in IA32_XSS or XCR0 = XCR0 (user state) 64-byte alignment in compacted XSAVE = false hypervisor_id = "KVMKVMKVM " hypervisor features (0x40000001/eax): kvmclock available at MSR 0x11 = true delays unnecessary for PIO ops = true mmu_op = false kvmclock available at MSR 0x4b564d00 = true async pf enable available by MSR = true steal clock supported = true guest EOI optimization enabled = true guest spinlock optimization enabled = true guest TLB flush optimization enabled = false async PF VM exit enable available by MSR = false guest send IPI optimization enabled = false host HLT poll disable at MSR 0x4b564d05 = false guest sched yield optimization enabled = false stable: no guest per-cpu warps expected = true hypervisor features (0x40000001/edx): realtime hint: no unbound preemption = true extended feature flags (0x80000001/edx): SYSCALL and SYSRET instructions = true execution disable = true 1-GB large page 
support = true RDTSCP = true 64-bit extensions technology available = true Intel feature flags (0x80000001/ecx): LAHF/SAHF supported in 64-bit mode = true LZCNT advanced bit manipulation = true 3DNow! PREFETCH/PREFETCHW instructions = true brand = "Intel Core Processor (Broadwell)" L1 TLB/cache information: 2M/4M pages & L1 TLB (0x80000005/eax): instruction # entries = 0xff (255) instruction associativity = 0x1 (1) data # entries = 0xff (255) data associativity = 0x1 (1) L1 TLB/cache information: 4K pages & L1 TLB (0x80000005/ebx): instruction # entries = 0xff (255) instruction associativity = 0x1 (1) data # entries = 0xff (255) data associativity = 0x1 (1) L1 data cache information (0x80000005/ecx): line size (bytes) = 0x40 (64) lines per tag = 0x1 (1) associativity = 0x2 (2) size (KB) = 0x40 (64) L1 instruction cache information (0x80000005/edx): line size (bytes) = 0x40 (64) lines per tag = 0x1 (1) associativity = 0x2 (2) size (KB) = 0x40 (64) L2 TLB/cache information: 2M/4M pages & L2 TLB (0x80000006/eax): instruction # entries = 0x0 (0) instruction associativity = L2 off (0) data # entries = 0x0 (0) data associativity = L2 off (0) L2 TLB/cache information: 4K pages & L2 TLB (0x80000006/ebx): instruction # entries = 0x200 (512) instruction associativity = 4-way (4) data # entries = 0x200 (512) data associativity = 4-way (4) L2 unified cache information (0x80000006/ecx): line size (bytes) = 0x40 (64) lines per tag = 0x1 (1) associativity = 16-way (8) size (KB) = 0x200 (512) L3 cache information (0x80000006/edx): line size (bytes) = 0x0 (0) lines per tag = 0x0 (0) associativity = L2 off (0) size (in 512KB units) = 0x0 (0) RAS Capability (0x80000007/ebx): MCA overflow recovery support = false SUCCOR support = false HWA: hardware assert support = false scalable MCA support = false Advanced Power Management Features (0x80000007/ecx): CmpUnitPwrSampleTimeRatio = 0x0 (0) Advanced Power Management Features (0x80000007/edx): TS: temperature sensing diode = false FID: 
frequency ID control = false VID: voltage ID control = false TTP: thermal trip = false TM: thermal monitor = false STC: software thermal control = false 100 MHz multiplier control = false hardware P-State control = false TscInvariant = false CPB: core performance boost = false read-only effective frequency interface = false processor feedback interface = false APM power reporting = false connected standby = false RAPL: running average power limit = false Physical Address and Linear Address Size (0x80000008/eax): maximum physical address bits = 0x2e (46) maximum linear (virtual) address bits = 0x30 (48) maximum guest physical address bits = 0x0 (0) Extended Feature Extensions ID (0x80000008/ebx): CLZERO instruction = false instructions retired count support = false always save/restore error pointers = false RDPRU instruction = false memory bandwidth enforcement = false WBNOINVD instruction = false IBPB: indirect branch prediction barrier = false IBRS: indirect branch restr speculation = false STIBP: 1 thr indirect branch predictor = false STIBP always on preferred mode = false ppin processor id number supported = false SSBD: speculative store bypass disable = false virtualized SSBD = false SSBD fixed in hardware = false Size Identifiers (0x80000008/ecx): number of CPU cores = 0x1 (1) ApicIdCoreIdSize = 0x0 (0) performance time-stamp counter size = 0x0 (0) Feature Extended Size (0x80000008/edx): RDPRU instruction max input support = 0x0 (0) SVM Secure Virtual Machine (0x8000000a/eax): SvmRev: SVM revision = 0x0 (0) SVM Secure Virtual Machine (0x8000000a/edx): nested paging = false LBR virtualization = false SVM lock = false NRIP save = false MSR based TSC rate control = false VMCB clean bits support = false flush by ASID = false decode assists = false SSSE3/SSE5 opcode set disable = false pause intercept filter = false pause filter threshold = false AVIC: AMD virtual interrupt controller = false virtualized VMLOAD/VMSAVE = false virtualized global interrupt flag 
(GIF) = false GMET: guest mode execute trap = false guest Spec_ctl support = false NASID: number of address space identifiers = 0x0 (0): (multi-processing synth) = none (multi-processing method) = Intel leaf 1/4 (APIC widths synth): CORE_width=0 SMT_width=0 (APIC synth): PKG_ID=1 CORE_ID=0 SMT_ID=0 (uarch synth) = Intel Broadwell {Haswell}, 14nm (synth) = Intel Core (unknown type) (Broadwell-U/Y C0) {Haswell}, 14nm CPU 2: vendor_id = "GenuineIntel" version information (1/eax): processor type = primary processor (0) family = 0x6 (6) model = 0xd (13) stepping id = 0x2 (2) extended family = 0x0 (0) extended model = 0x3 (3) (family synth) = 0x6 (6) (model synth) = 0x3d (61) (simple synth) = Intel Core (unknown type) (Broadwell-U/Y C0) {Haswell}, 14nm miscellaneous (1/ebx): process local APIC physical ID = 0x2 (2) maximum IDs for CPUs in pkg = 0x0 (0) CLFLUSH line size = 0x8 (8) brand index = 0x0 (0) brand id = 0x00 (0): unknown feature information (1/edx): x87 FPU on chip = true VME: virtual-8086 mode enhancement = true DE: debugging extensions = true PSE: page size extensions = true TSC: time stamp counter = true RDMSR and WRMSR support = true PAE: physical address extensions = true MCE: machine check exception = true CMPXCHG8B inst. = true APIC on chip = true SYSENTER and SYSEXIT = true MTRR: memory type range registers = true PTE global bit = true MCA: machine check architecture = true CMOV: conditional move/compare instr = true PAT: page attribute table = true PSE-36: page size extension = true PSN: processor serial number = false CLFLUSH instruction = true DS: debug store = false ACPI: thermal monitor and clock ctrl = false MMX Technology = true FXSAVE/FXRSTOR = true SSE extensions = true SSE2 extensions = true SS: self snoop = true hyper-threading / multi-core supported = false TM: therm. 
monitor = false IA64 = false PBE: pending break event = false feature information (1/ecx): PNI/SSE3: Prescott New Instructions = true PCLMULDQ instruction = true DTES64: 64-bit debug store = false MONITOR/MWAIT = false CPL-qualified debug store = false VMX: virtual machine extensions = false SMX: safer mode extensions = false Enhanced Intel SpeedStep Technology = false TM2: thermal monitor 2 = false SSSE3 extensions = true context ID: adaptive or shared L1 data = false SDBG: IA32_DEBUG_INTERFACE = false FMA instruction = true CMPXCHG16B instruction = true xTPR disable = false PDCM: perfmon and debug = false PCID: process context identifiers = true DCA: direct cache access = false SSE4.1 extensions = true SSE4.2 extensions = true x2APIC: extended xAPIC support = true MOVBE instruction = true POPCNT instruction = true time stamp counter deadline = true AES instruction = true XSAVE/XSTOR states = true OS-enabled XSAVE/XSTOR = true AVX: advanced vector extensions = true F16C half-precision convert instruction = true RDRAND instruction = true hypervisor guest status = true cache and TLB information (2): 0x7d: L2 cache: 2M, 8-way, 64 byte lines 0x30: L1 cache: 32K, 8-way, 64 byte lines 0x2c: L1 data cache: 32K, 8-way, 64 byte lines processor serial number = 0003-06D2-0000-0000-0000-0000 deterministic cache parameters (4): --- cache 0 --- cache type = data cache (1) cache level = 0x1 (1) self-initializing cache level = true fully associative cache = false maximum IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x8 (8) number of sets = 0x40 (64) WBINVD/INVD acts on lower caches = true inclusive to lower caches = false complex cache indexing = false number of sets (s) = 64 (size synth) = 32768 (32 KB) --- cache 1 --- cache type = instruction cache (2) cache level = 0x1 (1) self-initializing cache level = true fully associative cache = false maximum 
IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x8 (8) number of sets = 0x40 (64) WBINVD/INVD acts on lower caches = true inclusive to lower caches = false complex cache indexing = false number of sets (s) = 64 (size synth) = 32768 (32 KB) --- cache 2 --- cache type = unified cache (3) cache level = 0x2 (2) self-initializing cache level = true fully associative cache = false maximum IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x10 (16) number of sets = 0x1000 (4096) WBINVD/INVD acts on lower caches = true inclusive to lower caches = false complex cache indexing = false number of sets (s) = 4096 (size synth) = 4194304 (4 MB) MONITOR/MWAIT (5): smallest monitor-line size (bytes) = 0x0 (0) largest monitor-line size (bytes) = 0x0 (0) enum of Monitor-MWAIT exts supported = true supports intrs as break-event for MWAIT = true number of C0 sub C-states using MWAIT = 0x0 (0) number of C1 sub C-states using MWAIT = 0x0 (0) number of C2 sub C-states using MWAIT = 0x0 (0) number of C3 sub C-states using MWAIT = 0x0 (0) number of C4 sub C-states using MWAIT = 0x0 (0) number of C5 sub C-states using MWAIT = 0x0 (0) number of C6 sub C-states using MWAIT = 0x0 (0) number of C7 sub C-states using MWAIT = 0x0 (0) Thermal and Power Management Features (6): digital thermometer = false Intel Turbo Boost Technology = false ARAT always running APIC timer = false PLN power limit notification = false ECMD extended clock modulation duty = false PTM package thermal management = false HWP base registers = false HWP notification = false HWP activity window = false HWP energy performance preference = false HWP package level request = false HDC base registers = false Intel Turbo Boost Max Technology 3.0 = false HWP capabilities = false HWP 
PECI override = false flexible HWP = false IA32_HWP_REQUEST MSR fast access mode = false HW_FEEDBACK = false ignoring idle logical processor HWP req = false digital thermometer thresholds = 0x0 (0) hardware coordination feedback = false ACNT2 available = false performance-energy bias capability = false performance capability reporting = false energy efficiency capability reporting = false size of feedback struct (4KB pages) = 0x0 (0) index of CPU's row in feedback struct = 0x0 (0) extended feature flags (7): FSGSBASE instructions = true IA32_TSC_ADJUST MSR supported = false SGX: Software Guard Extensions supported = false BMI1 instructions = true HLE hardware lock elision = true AVX2: advanced vector extensions 2 = true FDP_EXCPTN_ONLY = false SMEP supervisor mode exec protection = true BMI2 instructions = true enhanced REP MOVSB/STOSB = true INVPCID instruction = true RTM: restricted transactional memory = true RDT-CMT/PQoS cache monitoring = false deprecated FPU CS/DS = false MPX: intel memory protection extensions = false RDT-CAT/PQE cache allocation = false AVX512F: AVX-512 foundation instructions = false AVX512DQ: double & quadword instructions = false RDSEED instruction = true ADX instructions = true SMAP: supervisor mode access prevention = true AVX512IFMA: fused multiply add = false PCOMMIT instruction = false CLFLUSHOPT instruction = false CLWB instruction = false Intel processor trace = false AVX512PF: prefetch instructions = false AVX512ER: exponent & reciprocal instrs = false AVX512CD: conflict detection instrs = false SHA instructions = false AVX512BW: byte & word instructions = false AVX512VL: vector length = false PREFETCHWT1 = false AVX512VBMI: vector byte manipulation = false UMIP: user-mode instruction prevention = false PKU protection keys for user-mode = false OSPKE CR4.PKE and RDPKRU/WRPKRU = false WAITPKG instructions = false AVX512_VBMI2: byte VPCOMPRESS, VPEXPAND = false CET_SS: CET shadow stack = false GFNI: Galois Field New Instructions = 
false VAES instructions = false VPCLMULQDQ instruction = false AVX512_VNNI: neural network instructions = false AVX512_BITALG: bit count/shiffle = false TME: Total Memory Encryption = false AVX512: VPOPCNTDQ instruction = false 5-level paging = false BNDLDX/BNDSTX MAWAU value in 64-bit mode = 0x0 (0) RDPID: read processor D supported = false CLDEMOTE supports cache line demote = false MOVDIRI instruction = false MOVDIR64B instruction = false ENQCMD instruction = false SGX_LC: SGX launch config supported = false AVX512_4VNNIW: neural network instrs = false AVX512_4FMAPS: multiply acc single prec = false fast short REP MOV = false AVX512_VP2INTERSECT: intersect mask regs = false VERW md-clear microcode support = false SERIALIZE = false hybrid part = false TSXLDTRK: TSX suspend load addr tracking = false PCONFIG instruction = false CET_IBT: CET indirect branch tracking = false IBRS/IBPB: indirect branch restrictions = false STIBP: 1 thr indirect branch predictor = false L1D_FLUSH: IA32_FLUSH_CMD MSR = false IA32_ARCH_CAPABILITIES MSR = false IA32_CORE_CAPABILITIES MSR = false SSBD: speculative store bypass disable = false Direct Cache Access Parameters (9): PLATFORM_DCA_CAP MSR bits = 0 Architecture Performance Monitoring Features (0xa/eax): version ID = 0x0 (0) number of counters per logical processor = 0x0 (0) bit width of counter = 0x0 (0) length of EBX bit vector = 0x0 (0) Architecture Performance Monitoring Features (0xa/ebx): core cycle event not available = false instruction retired event not available = false reference cycles event not available = false last-level cache ref event not available = false last-level cache miss event not avail = false branch inst retired event not available = false branch mispred retired event not avail = false Architecture Performance Monitoring Features (0xa/edx): number of fixed counters = 0x0 (0) bit width of fixed counters = 0x0 (0) anythread deprecation = false XSAVE features (0xd/0): XCR0 lower 32 bits valid bit field mask = 
0x00000007 XCR0 upper 32 bits valid bit field mask = 0x00000000 XCR0 supported: x87 state = true XCR0 supported: SSE state = true XCR0 supported: AVX state = true XCR0 supported: MPX BNDREGS = false XCR0 supported: MPX BNDCSR = false XCR0 supported: AVX-512 opmask = false XCR0 supported: AVX-512 ZMM_Hi256 = false XCR0 supported: AVX-512 Hi16_ZMM = false IA32_XSS supported: PT state = false XCR0 supported: PKRU state = false XCR0 supported: CET_U state = false XCR0 supported: CET_S state = false IA32_XSS supported: HDC state = false bytes required by fields in XCR0 = 0x00000340 (832) bytes required by XSAVE/XRSTOR area = 0x00000340 (832) XSAVE features (0xd/1): XSAVEOPT instruction = true XSAVEC instruction = false XGETBV instruction = false XSAVES/XRSTORS instructions = false SAVE area size in bytes = 0x00000000 (0) IA32_XSS lower 32 bits valid bit field mask = 0x00000000 IA32_XSS upper 32 bits valid bit field mask = 0x00000000 AVX/YMM features (0xd/2): AVX/YMM save state byte size = 0x00000100 (256) AVX/YMM save state byte offset = 0x00000240 (576) supported in IA32_XSS or XCR0 = XCR0 (user state) 64-byte alignment in compacted XSAVE = false hypervisor_id = "KVMKVMKVM " hypervisor features (0x40000001/eax): kvmclock available at MSR 0x11 = true delays unnecessary for PIO ops = true mmu_op = false kvmclock available at MSR 0x4b564d00 = true async pf enable available by MSR = true steal clock supported = true guest EOI optimization enabled = true guest spinlock optimization enabled = true guest TLB flush optimization enabled = false async PF VM exit enable available by MSR = false guest send IPI optimization enabled = false host HLT poll disable at MSR 0x4b564d05 = false guest sched yield optimization enabled = false stable: no guest per-cpu warps expected = true hypervisor features (0x40000001/edx): realtime hint: no unbound preemption = true extended feature flags (0x80000001/edx): SYSCALL and SYSRET instructions = true execution disable = true 1-GB large page 
support = true RDTSCP = true 64-bit extensions technology available = true Intel feature flags (0x80000001/ecx): LAHF/SAHF supported in 64-bit mode = true LZCNT advanced bit manipulation = true 3DNow! PREFETCH/PREFETCHW instructions = true brand = "Intel Core Processor (Broadwell)" L1 TLB/cache information: 2M/4M pages & L1 TLB (0x80000005/eax): instruction # entries = 0xff (255) instruction associativity = 0x1 (1) data # entries = 0xff (255) data associativity = 0x1 (1) L1 TLB/cache information: 4K pages & L1 TLB (0x80000005/ebx): instruction # entries = 0xff (255) instruction associativity = 0x1 (1) data # entries = 0xff (255) data associativity = 0x1 (1) L1 data cache information (0x80000005/ecx): line size (bytes) = 0x40 (64) lines per tag = 0x1 (1) associativity = 0x2 (2) size (KB) = 0x40 (64) L1 instruction cache information (0x80000005/edx): line size (bytes) = 0x40 (64) lines per tag = 0x1 (1) associativity = 0x2 (2) size (KB) = 0x40 (64) L2 TLB/cache information: 2M/4M pages & L2 TLB (0x80000006/eax): instruction # entries = 0x0 (0) instruction associativity = L2 off (0) data # entries = 0x0 (0) data associativity = L2 off (0) L2 TLB/cache information: 4K pages & L2 TLB (0x80000006/ebx): instruction # entries = 0x200 (512) instruction associativity = 4-way (4) data # entries = 0x200 (512) data associativity = 4-way (4) L2 unified cache information (0x80000006/ecx): line size (bytes) = 0x40 (64) lines per tag = 0x1 (1) associativity = 16-way (8) size (KB) = 0x200 (512) L3 cache information (0x80000006/edx): line size (bytes) = 0x0 (0) lines per tag = 0x0 (0) associativity = L2 off (0) size (in 512KB units) = 0x0 (0) RAS Capability (0x80000007/ebx): MCA overflow recovery support = false SUCCOR support = false HWA: hardware assert support = false scalable MCA support = false Advanced Power Management Features (0x80000007/ecx): CmpUnitPwrSampleTimeRatio = 0x0 (0) Advanced Power Management Features (0x80000007/edx): TS: temperature sensing diode = false FID: 
frequency ID control = false VID: voltage ID control = false TTP: thermal trip = false TM: thermal monitor = false STC: software thermal control = false 100 MHz multiplier control = false hardware P-State control = false TscInvariant = false CPB: core performance boost = false read-only effective frequency interface = false processor feedback interface = false APM power reporting = false connected standby = false RAPL: running average power limit = false Physical Address and Linear Address Size (0x80000008/eax): maximum physical address bits = 0x2e (46) maximum linear (virtual) address bits = 0x30 (48) maximum guest physical address bits = 0x0 (0) Extended Feature Extensions ID (0x80000008/ebx): CLZERO instruction = false instructions retired count support = false always save/restore error pointers = false RDPRU instruction = false memory bandwidth enforcement = false WBNOINVD instruction = false IBPB: indirect branch prediction barrier = false IBRS: indirect branch restr speculation = false STIBP: 1 thr indirect branch predictor = false STIBP always on preferred mode = false ppin processor id number supported = false SSBD: speculative store bypass disable = false virtualized SSBD = false SSBD fixed in hardware = false Size Identifiers (0x80000008/ecx): number of CPU cores = 0x1 (1) ApicIdCoreIdSize = 0x0 (0) performance time-stamp counter size = 0x0 (0) Feature Extended Size (0x80000008/edx): RDPRU instruction max input support = 0x0 (0) SVM Secure Virtual Machine (0x8000000a/eax): SvmRev: SVM revision = 0x0 (0) SVM Secure Virtual Machine (0x8000000a/edx): nested paging = false LBR virtualization = false SVM lock = false NRIP save = false MSR based TSC rate control = false VMCB clean bits support = false flush by ASID = false decode assists = false SSSE3/SSE5 opcode set disable = false pause intercept filter = false pause filter threshold = false AVIC: AMD virtual interrupt controller = false virtualized VMLOAD/VMSAVE = false virtualized global interrupt flag 
(GIF) = false GMET: guest mode execute trap = false guest Spec_ctl support = false NASID: number of address space identifiers = 0x0 (0): (multi-processing synth) = none (multi-processing method) = Intel leaf 1/4 (APIC widths synth): CORE_width=0 SMT_width=0 (APIC synth): PKG_ID=2 CORE_ID=0 SMT_ID=0 (uarch synth) = Intel Broadwell {Haswell}, 14nm (synth) = Intel Core (unknown type) (Broadwell-U/Y C0) {Haswell}, 14nm CPU 3: vendor_id = "GenuineIntel" version information (1/eax): processor type = primary processor (0) family = 0x6 (6) model = 0xd (13) stepping id = 0x2 (2) extended family = 0x0 (0) extended model = 0x3 (3) (family synth) = 0x6 (6) (model synth) = 0x3d (61) (simple synth) = Intel Core (unknown type) (Broadwell-U/Y C0) {Haswell}, 14nm miscellaneous (1/ebx): process local APIC physical ID = 0x3 (3) maximum IDs for CPUs in pkg = 0x0 (0) CLFLUSH line size = 0x8 (8) brand index = 0x0 (0) brand id = 0x00 (0): unknown feature information (1/edx): x87 FPU on chip = true VME: virtual-8086 mode enhancement = true DE: debugging extensions = true PSE: page size extensions = true TSC: time stamp counter = true RDMSR and WRMSR support = true PAE: physical address extensions = true MCE: machine check exception = true CMPXCHG8B inst. = true APIC on chip = true SYSENTER and SYSEXIT = true MTRR: memory type range registers = true PTE global bit = true MCA: machine check architecture = true CMOV: conditional move/compare instr = true PAT: page attribute table = true PSE-36: page size extension = true PSN: processor serial number = false CLFLUSH instruction = true DS: debug store = false ACPI: thermal monitor and clock ctrl = false MMX Technology = true FXSAVE/FXRSTOR = true SSE extensions = true SSE2 extensions = true SS: self snoop = true hyper-threading / multi-core supported = false TM: therm. 
monitor = false IA64 = false PBE: pending break event = false feature information (1/ecx): PNI/SSE3: Prescott New Instructions = true PCLMULDQ instruction = true DTES64: 64-bit debug store = false MONITOR/MWAIT = false CPL-qualified debug store = false VMX: virtual machine extensions = false SMX: safer mode extensions = false Enhanced Intel SpeedStep Technology = false TM2: thermal monitor 2 = false SSSE3 extensions = true context ID: adaptive or shared L1 data = false SDBG: IA32_DEBUG_INTERFACE = false FMA instruction = true CMPXCHG16B instruction = true xTPR disable = false PDCM: perfmon and debug = false PCID: process context identifiers = true DCA: direct cache access = false SSE4.1 extensions = true SSE4.2 extensions = true x2APIC: extended xAPIC support = true MOVBE instruction = true POPCNT instruction = true time stamp counter deadline = true AES instruction = true XSAVE/XSTOR states = true OS-enabled XSAVE/XSTOR = true AVX: advanced vector extensions = true F16C half-precision convert instruction = true RDRAND instruction = true hypervisor guest status = true cache and TLB information (2): 0x7d: L2 cache: 2M, 8-way, 64 byte lines 0x30: L1 cache: 32K, 8-way, 64 byte lines 0x2c: L1 data cache: 32K, 8-way, 64 byte lines processor serial number = 0003-06D2-0000-0000-0000-0000 deterministic cache parameters (4): --- cache 0 --- cache type = data cache (1) cache level = 0x1 (1) self-initializing cache level = true fully associative cache = false maximum IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x8 (8) number of sets = 0x40 (64) WBINVD/INVD acts on lower caches = true inclusive to lower caches = false complex cache indexing = false number of sets (s) = 64 (size synth) = 32768 (32 KB) --- cache 1 --- cache type = instruction cache (2) cache level = 0x1 (1) self-initializing cache level = true fully associative cache = false maximum 
IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x8 (8) number of sets = 0x40 (64) WBINVD/INVD acts on lower caches = true inclusive to lower caches = false complex cache indexing = false number of sets (s) = 64 (size synth) = 32768 (32 KB) --- cache 2 --- cache type = unified cache (3) cache level = 0x2 (2) self-initializing cache level = true fully associative cache = false maximum IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x10 (16) number of sets = 0x1000 (4096) WBINVD/INVD acts on lower caches = true inclusive to lower caches = false complex cache indexing = false number of sets (s) = 4096 (size synth) = 4194304 (4 MB) MONITOR/MWAIT (5): smallest monitor-line size (bytes) = 0x0 (0) largest monitor-line size (bytes) = 0x0 (0) enum of Monitor-MWAIT exts supported = true supports intrs as break-event for MWAIT = true number of C0 sub C-states using MWAIT = 0x0 (0) number of C1 sub C-states using MWAIT = 0x0 (0) number of C2 sub C-states using MWAIT = 0x0 (0) number of C3 sub C-states using MWAIT = 0x0 (0) number of C4 sub C-states using MWAIT = 0x0 (0) number of C5 sub C-states using MWAIT = 0x0 (0) number of C6 sub C-states using MWAIT = 0x0 (0) number of C7 sub C-states using MWAIT = 0x0 (0) Thermal and Power Management Features (6): digital thermometer = false Intel Turbo Boost Technology = false ARAT always running APIC timer = false PLN power limit notification = false ECMD extended clock modulation duty = false PTM package thermal management = false HWP base registers = false HWP notification = false HWP activity window = false HWP energy performance preference = false HWP package level request = false HDC base registers = false Intel Turbo Boost Max Technology 3.0 = false HWP capabilities = false HWP 
PECI override = false flexible HWP = false IA32_HWP_REQUEST MSR fast access mode = false HW_FEEDBACK = false ignoring idle logical processor HWP req = false digital thermometer thresholds = 0x0 (0) hardware coordination feedback = false ACNT2 available = false performance-energy bias capability = false performance capability reporting = false energy efficiency capability reporting = false size of feedback struct (4KB pages) = 0x0 (0) index of CPU's row in feedback struct = 0x0 (0) extended feature flags (7): FSGSBASE instructions = true IA32_TSC_ADJUST MSR supported = false SGX: Software Guard Extensions supported = false BMI1 instructions = true HLE hardware lock elision = true AVX2: advanced vector extensions 2 = true FDP_EXCPTN_ONLY = false SMEP supervisor mode exec protection = true BMI2 instructions = true enhanced REP MOVSB/STOSB = true INVPCID instruction = true RTM: restricted transactional memory = true RDT-CMT/PQoS cache monitoring = false deprecated FPU CS/DS = false MPX: intel memory protection extensions = false RDT-CAT/PQE cache allocation = false AVX512F: AVX-512 foundation instructions = false AVX512DQ: double & quadword instructions = false RDSEED instruction = true ADX instructions = true SMAP: supervisor mode access prevention = true AVX512IFMA: fused multiply add = false PCOMMIT instruction = false CLFLUSHOPT instruction = false CLWB instruction = false Intel processor trace = false AVX512PF: prefetch instructions = false AVX512ER: exponent & reciprocal instrs = false AVX512CD: conflict detection instrs = false SHA instructions = false AVX512BW: byte & word instructions = false AVX512VL: vector length = false PREFETCHWT1 = false AVX512VBMI: vector byte manipulation = false UMIP: user-mode instruction prevention = false PKU protection keys for user-mode = false OSPKE CR4.PKE and RDPKRU/WRPKRU = false WAITPKG instructions = false AVX512_VBMI2: byte VPCOMPRESS, VPEXPAND = false CET_SS: CET shadow stack = false GFNI: Galois Field New Instructions = 
false VAES instructions = false VPCLMULQDQ instruction = false AVX512_VNNI: neural network instructions = false AVX512_BITALG: bit count/shiffle = false TME: Total Memory Encryption = false AVX512: VPOPCNTDQ instruction = false 5-level paging = false BNDLDX/BNDSTX MAWAU value in 64-bit mode = 0x0 (0) RDPID: read processor D supported = false CLDEMOTE supports cache line demote = false MOVDIRI instruction = false MOVDIR64B instruction = false ENQCMD instruction = false SGX_LC: SGX launch config supported = false AVX512_4VNNIW: neural network instrs = false AVX512_4FMAPS: multiply acc single prec = false fast short REP MOV = false AVX512_VP2INTERSECT: intersect mask regs = false VERW md-clear microcode support = false SERIALIZE = false hybrid part = false TSXLDTRK: TSX suspend load addr tracking = false PCONFIG instruction = false CET_IBT: CET indirect branch tracking = false IBRS/IBPB: indirect branch restrictions = false STIBP: 1 thr indirect branch predictor = false L1D_FLUSH: IA32_FLUSH_CMD MSR = false IA32_ARCH_CAPABILITIES MSR = false IA32_CORE_CAPABILITIES MSR = false SSBD: speculative store bypass disable = false Direct Cache Access Parameters (9): PLATFORM_DCA_CAP MSR bits = 0 Architecture Performance Monitoring Features (0xa/eax): version ID = 0x0 (0) number of counters per logical processor = 0x0 (0) bit width of counter = 0x0 (0) length of EBX bit vector = 0x0 (0) Architecture Performance Monitoring Features (0xa/ebx): core cycle event not available = false instruction retired event not available = false reference cycles event not available = false last-level cache ref event not available = false last-level cache miss event not avail = false branch inst retired event not available = false branch mispred retired event not avail = false Architecture Performance Monitoring Features (0xa/edx): number of fixed counters = 0x0 (0) bit width of fixed counters = 0x0 (0) anythread deprecation = false XSAVE features (0xd/0): XCR0 lower 32 bits valid bit field mask = 
0x00000007 XCR0 upper 32 bits valid bit field mask = 0x00000000 XCR0 supported: x87 state = true XCR0 supported: SSE state = true XCR0 supported: AVX state = true XCR0 supported: MPX BNDREGS = false XCR0 supported: MPX BNDCSR = false XCR0 supported: AVX-512 opmask = false XCR0 supported: AVX-512 ZMM_Hi256 = false XCR0 supported: AVX-512 Hi16_ZMM = false IA32_XSS supported: PT state = false XCR0 supported: PKRU state = false XCR0 supported: CET_U state = false XCR0 supported: CET_S state = false IA32_XSS supported: HDC state = false bytes required by fields in XCR0 = 0x00000340 (832) bytes required by XSAVE/XRSTOR area = 0x00000340 (832) XSAVE features (0xd/1): XSAVEOPT instruction = true XSAVEC instruction = false XGETBV instruction = false XSAVES/XRSTORS instructions = false SAVE area size in bytes = 0x00000000 (0) IA32_XSS lower 32 bits valid bit field mask = 0x00000000 IA32_XSS upper 32 bits valid bit field mask = 0x00000000 AVX/YMM features (0xd/2): AVX/YMM save state byte size = 0x00000100 (256) AVX/YMM save state byte offset = 0x00000240 (576) supported in IA32_XSS or XCR0 = XCR0 (user state) 64-byte alignment in compacted XSAVE = false hypervisor_id = "KVMKVMKVM " hypervisor features (0x40000001/eax): kvmclock available at MSR 0x11 = true delays unnecessary for PIO ops = true mmu_op = false kvmclock available at MSR 0x4b564d00 = true async pf enable available by MSR = true steal clock supported = true guest EOI optimization enabled = true guest spinlock optimization enabled = true guest TLB flush optimization enabled = false async PF VM exit enable available by MSR = false guest send IPI optimization enabled = false host HLT poll disable at MSR 0x4b564d05 = false guest sched yield optimization enabled = false stable: no guest per-cpu warps expected = true hypervisor features (0x40000001/edx): realtime hint: no unbound preemption = true extended feature flags (0x80000001/edx): SYSCALL and SYSRET instructions = true execution disable = true 1-GB large page 
support = true RDTSCP = true 64-bit extensions technology available = true Intel feature flags (0x80000001/ecx): LAHF/SAHF supported in 64-bit mode = true LZCNT advanced bit manipulation = true 3DNow! PREFETCH/PREFETCHW instructions = true brand = "Intel Core Processor (Broadwell)" L1 TLB/cache information: 2M/4M pages & L1 TLB (0x80000005/eax): instruction # entries = 0xff (255) instruction associativity = 0x1 (1) data # entries = 0xff (255) data associativity = 0x1 (1) L1 TLB/cache information: 4K pages & L1 TLB (0x80000005/ebx): instruction # entries = 0xff (255) instruction associativity = 0x1 (1) data # entries = 0xff (255) data associativity = 0x1 (1) L1 data cache information (0x80000005/ecx): line size (bytes) = 0x40 (64) lines per tag = 0x1 (1) associativity = 0x2 (2) size (KB) = 0x40 (64) L1 instruction cache information (0x80000005/edx): line size (bytes) = 0x40 (64) lines per tag = 0x1 (1) associativity = 0x2 (2) size (KB) = 0x40 (64) L2 TLB/cache information: 2M/4M pages & L2 TLB (0x80000006/eax): instruction # entries = 0x0 (0) instruction associativity = L2 off (0) data # entries = 0x0 (0) data associativity = L2 off (0) L2 TLB/cache information: 4K pages & L2 TLB (0x80000006/ebx): instruction # entries = 0x200 (512) instruction associativity = 4-way (4) data # entries = 0x200 (512) data associativity = 4-way (4) L2 unified cache information (0x80000006/ecx): line size (bytes) = 0x40 (64) lines per tag = 0x1 (1) associativity = 16-way (8) size (KB) = 0x200 (512) L3 cache information (0x80000006/edx): line size (bytes) = 0x0 (0) lines per tag = 0x0 (0) associativity = L2 off (0) size (in 512KB units) = 0x0 (0) RAS Capability (0x80000007/ebx): MCA overflow recovery support = false SUCCOR support = false HWA: hardware assert support = false scalable MCA support = false Advanced Power Management Features (0x80000007/ecx): CmpUnitPwrSampleTimeRatio = 0x0 (0) Advanced Power Management Features (0x80000007/edx): TS: temperature sensing diode = false FID: 
frequency ID control = false VID: voltage ID control = false TTP: thermal trip = false TM: thermal monitor = false STC: software thermal control = false 100 MHz multiplier control = false hardware P-State control = false TscInvariant = false CPB: core performance boost = false read-only effective frequency interface = false processor feedback interface = false APM power reporting = false connected standby = false RAPL: running average power limit = false Physical Address and Linear Address Size (0x80000008/eax): maximum physical address bits = 0x2e (46) maximum linear (virtual) address bits = 0x30 (48) maximum guest physical address bits = 0x0 (0) Extended Feature Extensions ID (0x80000008/ebx): CLZERO instruction = false instructions retired count support = false always save/restore error pointers = false RDPRU instruction = false memory bandwidth enforcement = false WBNOINVD instruction = false IBPB: indirect branch prediction barrier = false IBRS: indirect branch restr speculation = false STIBP: 1 thr indirect branch predictor = false STIBP always on preferred mode = false ppin processor id number supported = false SSBD: speculative store bypass disable = false virtualized SSBD = false SSBD fixed in hardware = false Size Identifiers (0x80000008/ecx): number of CPU cores = 0x1 (1) ApicIdCoreIdSize = 0x0 (0) performance time-stamp counter size = 0x0 (0) Feature Extended Size (0x80000008/edx): RDPRU instruction max input support = 0x0 (0) SVM Secure Virtual Machine (0x8000000a/eax): SvmRev: SVM revision = 0x0 (0) SVM Secure Virtual Machine (0x8000000a/edx): nested paging = false LBR virtualization = false SVM lock = false NRIP save = false MSR based TSC rate control = false VMCB clean bits support = false flush by ASID = false decode assists = false SSSE3/SSE5 opcode set disable = false pause intercept filter = false pause filter threshold = false AVIC: AMD virtual interrupt controller = false virtualized VMLOAD/VMSAVE = false virtualized global interrupt flag 
(GIF) = false GMET: guest mode execute trap = false guest Spec_ctl support = false NASID: number of address space identifiers = 0x0 (0): (multi-processing synth) = none (multi-processing method) = Intel leaf 1/4 (APIC widths synth): CORE_width=0 SMT_width=0 (APIC synth): PKG_ID=3 CORE_ID=0 SMT_ID=0 (uarch synth) = Intel Broadwell {Haswell}, 14nm (synth) = Intel Core (unknown type) (Broadwell-U/Y C0) {Haswell}, 14nm
Thanks @tmds, could you also provide the output of the following program? It should allow a basic sanity check against what the underlying hardware reported above:
using System;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;
using ArmAes = System.Runtime.Intrinsics.Arm.Aes;
using X86Aes = System.Runtime.Intrinsics.X86.Aes;
public class Program
{
    public static int Main()
    {
        Console.WriteLine("Supported x86 ISAs:");
        Console.WriteLine($" AES: {X86Aes.IsSupported}");
        Console.WriteLine($" AVX: {Avx.IsSupported}");
        Console.WriteLine($" AVX2: {Avx2.IsSupported}");
        Console.WriteLine($" BMI1: {Bmi1.IsSupported}");
        Console.WriteLine($" BMI2: {Bmi2.IsSupported}");
        Console.WriteLine($" FMA: {Fma.IsSupported}");
        Console.WriteLine($" LZCNT: {Lzcnt.IsSupported}");
        Console.WriteLine($" PCLMULQDQ: {Pclmulqdq.IsSupported}");
        Console.WriteLine($" POPCNT: {Popcnt.IsSupported}");
        Console.WriteLine($" SSE: {Sse.IsSupported}");
        Console.WriteLine($" SSE2: {Sse2.IsSupported}");
        Console.WriteLine($" SSE3: {Sse3.IsSupported}");
        Console.WriteLine($" SSE4.1: {Sse41.IsSupported}");
        Console.WriteLine($" SSE4.2: {Sse42.IsSupported}");
        Console.WriteLine($" SSSE3: {Ssse3.IsSupported}");
        Console.WriteLine($" X86Base: {X86Base.IsSupported}");
        Console.WriteLine("Supported x64 ISAs:");
        Console.WriteLine($" AES.X64: {X86Aes.X64.IsSupported}");
        Console.WriteLine($" AVX.X64: {Avx.X64.IsSupported}");
        Console.WriteLine($" AVX2.X64: {Avx2.X64.IsSupported}");
        Console.WriteLine($" BMI1.X64: {Bmi1.X64.IsSupported}");
        Console.WriteLine($" BMI2.X64: {Bmi2.X64.IsSupported}");
        Console.WriteLine($" FMA.X64: {Fma.X64.IsSupported}");
        Console.WriteLine($" LZCNT.X64: {Lzcnt.X64.IsSupported}");
        Console.WriteLine($" PCLMULQDQ.X64: {Pclmulqdq.X64.IsSupported}");
        Console.WriteLine($" POPCNT.X64: {Popcnt.X64.IsSupported}");
        Console.WriteLine($" SSE.X64: {Sse.X64.IsSupported}");
        Console.WriteLine($" SSE2.X64: {Sse2.X64.IsSupported}");
        Console.WriteLine($" SSE3.X64: {Sse3.X64.IsSupported}");
        Console.WriteLine($" SSE4.1.X64: {Sse41.X64.IsSupported}");
        Console.WriteLine($" SSE4.2.X64: {Sse42.X64.IsSupported}");
        Console.WriteLine($" SSSE3.X64: {Ssse3.X64.IsSupported}");
        Console.WriteLine($" X86Base.X64: {X86Base.X64.IsSupported}");
        Console.WriteLine("Supported Arm ISAs:");
        Console.WriteLine($" AdvSimd: {AdvSimd.IsSupported}");
        Console.WriteLine($" Aes: {ArmAes.IsSupported}");
        Console.WriteLine($" ArmBase: {ArmBase.IsSupported}");
        Console.WriteLine($" Crc32: {Crc32.IsSupported}");
        Console.WriteLine($" Dp: {Dp.IsSupported}");
        Console.WriteLine($" Rdm: {Rdm.IsSupported}");
        Console.WriteLine($" Sha1: {Sha1.IsSupported}");
        Console.WriteLine($" Sha256: {Sha256.IsSupported}");
        Console.WriteLine("Supported Arm64 ISAs:");
        Console.WriteLine($" AdvSimd.Arm64: {AdvSimd.Arm64.IsSupported}");
        Console.WriteLine($" Aes.Arm64: {ArmAes.Arm64.IsSupported}");
        Console.WriteLine($" ArmBase.Arm64: {ArmBase.Arm64.IsSupported}");
        Console.WriteLine($" Crc32.Arm64: {Crc32.Arm64.IsSupported}");
        Console.WriteLine($" Dp.Arm64: {Dp.Arm64.IsSupported}");
        Console.WriteLine($" Rdm.Arm64: {Rdm.Arm64.IsSupported}");
        Console.WriteLine($" Sha1.Arm64: {Sha1.Arm64.IsSupported}");
        Console.WriteLine($" Sha256.Arm64: {Sha256.Arm64.IsSupported}");
        // Main is declared to return int, so it needs an explicit exit code.
        return 0;
    }
}
JIT\Regression\JITBlue\Runtime_34587 should also provide the same output, and JIT\HardwareIntrinsics\X86\X86Base\CpuId should be validating the correctness of IsSupported checks as compared to uncached CPUID checks (accounting for environment variables that can force an IsSupported check to return false).
The CPU looks to support everything and so I would expect everything to report true, same as on our other machines.
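As a rough illustration of the kind of comparison described above (cached IsSupported results against raw CPUID bits), here is a minimal sketch. It is not the actual runtime test: the class name CpuIdCheck and the helper IsBitSet are made up for this example, and it only looks at AVX and AVX2. It assumes a runtime new enough to expose X86Base.CpuId. Bit positions follow the standard CPUID layout (leaf 1 ECX bit 27 = OSXSAVE, leaf 1 ECX bit 28 = AVX, leaf 7 sub-leaf 0 EBX bit 5 = AVX2).

```csharp
using System;
using System.Runtime.Intrinsics.X86;

public static class CpuIdCheck
{
    // Hypothetical helper: true when the given bit (0-31) is set in a CPUID output register.
    public static bool IsBitSet(int register, int bit) => (register & (1 << bit)) != 0;

    public static int Main()
    {
        if (!X86Base.IsSupported)
        {
            Console.WriteLine("X86Base is not supported here; nothing to compare.");
            return 0;
        }

        // Leaf 0 reports the highest basic leaf in EAX.
        (int maxLeaf, _, _, _) = X86Base.CpuId(0, 0);

        // Leaf 1: ECX bit 27 = OSXSAVE, ECX bit 28 = AVX.
        (_, _, int ecx1, _) = X86Base.CpuId(1, 0);
        Console.WriteLine($"CPUID OSXSAVE: {IsBitSet(ecx1, 27)}");
        Console.WriteLine($"CPUID AVX: {IsBitSet(ecx1, 28)}, Avx.IsSupported: {Avx.IsSupported}");

        if (maxLeaf >= 7)
        {
            // Leaf 7, sub-leaf 0: EBX bit 5 = AVX2.
            (_, int ebx7, _, _) = X86Base.CpuId(7, 0);
            Console.WriteLine($"CPUID AVX2: {IsBitSet(ebx7, 5)}, Avx2.IsSupported: {Avx2.IsSupported}");
        }
        else
        {
            Console.WriteLine("CPUID leaf 7 is not available; AVX2 cannot be reported.");
        }

        return 0;
    }
}
```

A CPUID bit being set while the matching IsSupported is false is not necessarily a bug: the OS may not have enabled the extended register state, or an environment variable may have disabled the ISA. The reverse (IsSupported true with the CPUID bit clear) would be the suspicious case.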
/cc @dseefeld @crummel @dagood
@tannergooding
could you also provide the output of the following program? It should allow a basic sanity check against what the underlying hardware reported above:
I had to comment out a few things in order for it to compile using preview8 sdk:
using System;
// using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;
// using ArmAes = System.Runtime.Intrinsics.Arm.Aes;
using X86Aes = System.Runtime.Intrinsics.X86.Aes;
public class Program
{
    public static int Main()
    {
        Console.WriteLine("Supported x86 ISAs:");
        Console.WriteLine($" AES: {X86Aes.IsSupported}");
        Console.WriteLine($" AVX: {Avx.IsSupported}");
        Console.WriteLine($" AVX2: {Avx2.IsSupported}");
        Console.WriteLine($" BMI1: {Bmi1.IsSupported}");
        Console.WriteLine($" BMI2: {Bmi2.IsSupported}");
        Console.WriteLine($" FMA: {Fma.IsSupported}");
        Console.WriteLine($" LZCNT: {Lzcnt.IsSupported}");
        Console.WriteLine($" PCLMULQDQ: {Pclmulqdq.IsSupported}");
        Console.WriteLine($" POPCNT: {Popcnt.IsSupported}");
        Console.WriteLine($" SSE: {Sse.IsSupported}");
        Console.WriteLine($" SSE2: {Sse2.IsSupported}");
        Console.WriteLine($" SSE3: {Sse3.IsSupported}");
        Console.WriteLine($" SSE4.1: {Sse41.IsSupported}");
        Console.WriteLine($" SSE4.2: {Sse42.IsSupported}");
        Console.WriteLine($" SSSE3: {Ssse3.IsSupported}");
        // Console.WriteLine($" X86Base: {X86Base.IsSupported}");
        Console.WriteLine("Supported x64 ISAs:");
        Console.WriteLine($" AES.X64: {X86Aes.X64.IsSupported}");
        Console.WriteLine($" AVX.X64: {Avx.X64.IsSupported}");
        Console.WriteLine($" AVX2.X64: {Avx2.X64.IsSupported}");
        Console.WriteLine($" BMI1.X64: {Bmi1.X64.IsSupported}");
        Console.WriteLine($" BMI2.X64: {Bmi2.X64.IsSupported}");
        Console.WriteLine($" FMA.X64: {Fma.X64.IsSupported}");
        Console.WriteLine($" LZCNT.X64: {Lzcnt.X64.IsSupported}");
        Console.WriteLine($" PCLMULQDQ.X64: {Pclmulqdq.X64.IsSupported}");
        Console.WriteLine($" POPCNT.X64: {Popcnt.X64.IsSupported}");
        Console.WriteLine($" SSE.X64: {Sse.X64.IsSupported}");
        Console.WriteLine($" SSE2.X64: {Sse2.X64.IsSupported}");
        Console.WriteLine($" SSE3.X64: {Sse3.X64.IsSupported}");
        Console.WriteLine($" SSE4.1.X64: {Sse41.X64.IsSupported}");
        Console.WriteLine($" SSE4.2.X64: {Sse42.X64.IsSupported}");
        Console.WriteLine($" SSSE3.X64: {Ssse3.X64.IsSupported}");
        // Console.WriteLine($" X86Base.X64: {X86Base.X64.IsSupported}");
        // Console.WriteLine("Supported Arm ISAs:");
        // Console.WriteLine($" AdvSimd: {AdvSimd.IsSupported}");
        // Console.WriteLine($" Aes: {ArmAes.IsSupported}");
        // Console.WriteLine($" ArmBase: {ArmBase.IsSupported}");
        // Console.WriteLine($" Crc32: {Crc32.IsSupported}");
        // Console.WriteLine($" Dp: {Dp.IsSupported}");
        // Console.WriteLine($" Rdm: {Rdm.IsSupported}");
        // Console.WriteLine($" Sha1: {Sha1.IsSupported}");
        // Console.WriteLine($" Sha256: {Sha256.IsSupported}");
        // Console.WriteLine("Supported Arm64 ISAs:");
        // Console.WriteLine($" AdvSimd.Arm64: {AdvSimd.Arm64.IsSupported}");
        // Console.WriteLine($" Aes.Arm64: {ArmAes.Arm64.IsSupported}");
        // Console.WriteLine($" ArmBase.Arm64: {ArmBase.Arm64.IsSupported}");
        // Console.WriteLine($" Crc32.Arm64: {Crc32.Arm64.IsSupported}");
        // Console.WriteLine($" Dp.Arm64: {Dp.Arm64.IsSupported}");
        // Console.WriteLine($" Rdm.Arm64: {Rdm.Arm64.IsSupported}");
        // Console.WriteLine($" Sha1.Arm64: {Sha1.Arm64.IsSupported}");
        // Console.WriteLine($" Sha256.Arm64: {Sha256.Arm64.IsSupported}");
        return 0;
    }
}
On the CI VM this outputs:
Supported x86 ISAs:
AES: True
AVX: True
AVX2: True
BMI1: True
BMI2: True
FMA: True
LZCNT: True
PCLMULQDQ: True
POPCNT: True
SSE: True
SSE2: True
SSE3: True
SSE4.1: True
SSE4.2: True
SSSE3: True
Supported x64 ISAs:
AES.X64: True
AVX.X64: True
AVX2.X64: True
BMI1.X64: True
BMI2.X64: True
FMA.X64: True
LZCNT.X64: True
PCLMULQDQ.X64: True
POPCNT.X64: True
SSE.X64: True
SSE2.X64: True
SSE3.X64: True
SSE4.1.X64: True
SSE4.2.X64: True
SSSE3.X64: True
All True, as you expected.
@tannergooding I think this may have regressed in #40167.
It looks like this is the case: 5c29e1483e0ca803c8be3ea6a0a8cfe899b2a813 passes on the CI machine, and 96f178d32b7ba62485917ac46ef1edcfd3c2d10d fails.
@tannergooding I'm not debugging this further atm. If you want me to run some code on our CI machine, you can put it in a git repo and compile it up-front, like I did here https://github.com/tmds/debug_app/.
I had to comment out a few things in order for it to compile using preview8 sdk
Was it compiled with preview 8 but run with a build that contains 96f178d? The change in 96f178d introduces the X86Base class and tweaks the CPUID logic in the VM layer to use the new shared function.
It looks like this is the case: 5c29e14 passes on the CI machine, and 96f178d fails.
There was a separate fix that is also needed on top of 96f178d: https://github.com/dotnet/runtime/pull/40615
If you want me to run some code on our CI machine
I'm going to try and set up a Fedora32 machine locally tomorrow and see if I can get a repro. Are there any special things the CI machines have that might impact my ability to repro?
Was it compiled with preview 8 but run with a build that contains 96f178d?
Yes.
I'm going to try and set up a Fedora32 machine locally tomorrow and see if I can get a repro. Are there any special things the CI machines have that might impact my ability to repro?
I think you won't have much luck. It doesn't repro on my Fedora 32 development machine. It repros every time on our CI server, which builds dotnet/runtime in a VM. It fails the same way whether the distro is Fedora 32, RHEL7 or RHEL8.
Is there anything unique about the CI machines that might cause a failure on them but not on an "equivalent" local machine?
CC. @CarolEidt, @echesakovMSFT, what do you think the best move forward here would be? Should we try to get a JitDump?
Is there anything unique about the CI machines that might cause a failure on them but not on an "equivalent" local machine?
We use vagrant+libvirt. I'm going to try if changing the cpumodel to host-passthrough makes a difference.
Maybe we can narrow down the root cause looking at what changed in https://github.com/dotnet/runtime/pull/40167?
Maybe we can narrow down the root cause looking at what changed in #40167?
I've looked through the code a few times and nothing obvious pops out. The PR does the following:
System.Runtime.Intrinsics.X86.X86Base class
X86Base
__cpuidex function, so it can also be used by X86Base.CpuId
There were no changes to the JIT itself to support this, just the VM. All of the relevant VM code currently runs as part of startup. The worst case scenario should be that one of the CPUID checks was no longer correct, of which CI already caught and fixed one in #40615. We now have a regression test for future similar scenarios. I've also triple checked that the CPUID checks are querying the correct function, register, and bit (although someone else could likewise do the same to ensure I'm just not overlooking something).
So, as best I can tell, there were no changes which should be causing the GenericVectorTests to fail comparisons or to cause a segmentation fault by accessing invalid memory. Anything causing that should be caused by a separate PR/issue (possibly pre-existing).
what do you think the best move forward here would be? Should we try to get a JitDump?
I'm not sure a JitDump would help without a better idea of which method(s) is/are causing the failures. It seems like it will be quite tricky to track down without the ability to reproduce and debug it, though given that it eventually fails with a SIGSEGV, perhaps a dump would help identify some possible candidates.
We use vagrant+libvirt. I'm going to try if changing the cpumodel to host-passthrough makes a difference.
No difference.
given that it eventually fails with a SIGSEGV perhaps a dump would help identify some possible candidates.
I looked at two coredumps. In both the segmentation fault occurs when GC gets triggered by an allocation:
first dump:
(lldb) clrstack -f
OS Thread Id: 0x17b61 (1)
Child SP IP Call Site
00007FB608D426C0 00007FB6B3E4CA6B libcoreclr.so!WKS::gc_heap::plan_phase(int) + 4635 at /home/tester/runtime/src/coreclr/src/vm/methodtable.h:1664
00007FB608D426C0 00007FB6B3E4CA6B libcoreclr.so!WKS::gc_heap::plan_phase(int) + 4635 at /home/tester/runtime/src/coreclr/src/vm/methodtable.h:1664
00007FB608D426C0 00007FB6B3E4CA6B libcoreclr.so!WKS::gc_heap::plan_phase(int) + 4635 at /home/tester/runtime/src/coreclr/src/vm/methodtable.h:1664
00007FB608D429E0 00007FB6B3E48732 libcoreclr.so!WKS::gc_heap::gc1() + 914 at /home/tester/runtime/src/coreclr/src/gc/gc.cpp:16698
00007FB608D42A70 00007FB6B3E526FE libcoreclr.so!WKS::gc_heap::garbage_collect(int) + 2190 at /home/tester/runtime/src/coreclr/src/gc/gc.cpp:18282
00007FB608D42AE0 00007FB6B3E43E2A libcoreclr.so!WKS::GCHeap::GarbageCollectGeneration(unsigned int, gc_reason) + 1018 at /home/tester/runtime/src/coreclr/src/gc/gc.cpp:37755
00007FB608D42B30 00007FB6B3E45FC7 libcoreclr.so!WKS::gc_heap::try_allocate_more_space(alloc_context*, unsigned long, unsigned int, int) + 743 at /home/tester/runtime/src/coreclr/src/gc/gc.cpp:13993
00007FB608D42B80 00007FB6B3E6A940 libcoreclr.so!WKS::GCHeap::Alloc(gc_alloc_context*, unsigned long, unsigned int) + 160 at /home/tester/runtime/src/coreclr/src/gc/gc.cpp:14490
00007FB608D42B80 00007FB6B3E6A926 libcoreclr.so!WKS::GCHeap::Alloc(gc_alloc_context*, unsigned long, unsigned int) + 134 at /home/tester/runtime/src/coreclr/src/gc/gc.cpp:14513
00007FB608D42B80 00007FB6B3E6A908 libcoreclr.so!WKS::GCHeap::Alloc(gc_alloc_context*, unsigned long, unsigned int) + 104 at /home/tester/runtime/src/coreclr/src/gc/gcpriv.h:868
00007FB608D42BC0 00007FB6B3D13EDF libcoreclr.so!AllocateSzArray(MethodTable*, int, GC_ALLOC_FLAGS) + 319 at /home/tester/runtime/src/coreclr/src/vm/gchelpers.cpp:239
00007FB608D42BC0 00007FB6B3D13E75 libcoreclr.so!AllocateSzArray(MethodTable*, int, GC_ALLOC_FLAGS) + 213 at /home/tester/runtime/src/coreclr/src/vm/eeconfig.h:404
00007FB608D42C20 00007FB6B3D31D7F libcoreclr.so!JIT_NewArr1(CORINFO_CLASS_STRUCT_*, long) + 175 at /home/tester/runtime/src/coreclr/src/vm/jithelpers.cpp:0
00007FB608D42C48 [HelperMethodFrame: 00007fb608d42c48]
00007FB608D42D90 00007FB63A0C0526 System.Private.CoreLib.dll!System.Text.StringBuilder..ctor() + 54 [/home/tester/runtime/src/libraries/System.Private.CoreLib/src/System/Text/StringBuilder.cs @ 77]
00007FB608D42DB0 00007FB63AA9CC5F System.Private.CoreLib.dll!System.Numerics.Vector3.ToString(System.String, System.IFormatProvider) + 79 [/home/tester/runtime/src/libraries/System.Private.CoreLib/src/System/Numerics/Vector3.cs @ 103]
00007FB608D42E10 00007FB63A269AD8 System.Private.CoreLib.dll!System.Text.ValueStringBuilder.AppendFormatHelper(System.IFormatProvider, System.String, System.ParamsArray) + 1400 [/home/tester/runtime/src/libraries/System.Private.CoreLib/src/System/Text/ValueStringBuilder.AppendFormat.cs @ 322]
00007FB608D42EB0 00007FB63A269476 System.Private.CoreLib.dll!System.String.FormatHelper(System.IFormatProvider, System.String, System.ParamsArray) + 390 [/home/tester/runtime/src/libraries/System.Private.CoreLib/src/System/String.Manipulation.cs @ 528]
00007FB608D43160 00007FB63A2691A5 System.Private.CoreLib.dll!System.String.Format(System.String, System.Object, System.Object) + 133
00007FB608D431D0 00007FB63AAA80CB System.Numerics.Vectors.Tests.dll!System.Numerics.Tests.Matrix4x4Tests.DecomposeTest(Single, Single, Single, System.Numerics.Vector3, System.Numerics.Vector3) + 1307
00007FB608D43530 00007FB63AAA680C System.Numerics.Vectors.Tests.dll!System.Numerics.Tests.Matrix4x4Tests.Matrix4x4DecomposeTest01() + 316
...
second dump:
(lldb) clrstack -f
OS Thread Id: 0x17863 (1)
Child SP IP Call Site
00007F8C827F7D50 00007F8D42D4AA6B libcoreclr.so!WKS::gc_heap::plan_phase(int) + 4635 at /home/tester/runtime/src/coreclr/src/vm/methodtable.h:1664
00007F8C827F7D50 00007F8D42D4AA6B libcoreclr.so!WKS::gc_heap::plan_phase(int) + 4635 at /home/tester/runtime/src/coreclr/src/vm/methodtable.h:1664
00007F8C827F7D50 00007F8D42D4AA6B libcoreclr.so!WKS::gc_heap::plan_phase(int) + 4635 at /home/tester/runtime/src/coreclr/src/vm/methodtable.h:1664
00007F8C827F8070 00007F8D42D46732 libcoreclr.so!WKS::gc_heap::gc1() + 914 at /home/tester/runtime/src/coreclr/src/gc/gc.cpp:16698
00007F8C827F8100 00007F8D42D506FE libcoreclr.so!WKS::gc_heap::garbage_collect(int) + 2190 at /home/tester/runtime/src/coreclr/src/gc/gc.cpp:18282
00007F8C827F8170 00007F8D42D41E2A libcoreclr.so!WKS::GCHeap::GarbageCollectGeneration(unsigned int, gc_reason) + 1018 at /home/tester/runtime/src/coreclr/src/gc/gc.cpp:37755
00007F8C827F81C0 00007F8D42D43FC7 libcoreclr.so!WKS::gc_heap::try_allocate_more_space(alloc_context*, unsigned long, unsigned int, int) + 743 at /home/tester/runtime/src/coreclr/src/gc/gc.cpp:13993
00007F8C827F8210 00007F8D42D68940 libcoreclr.so!WKS::GCHeap::Alloc(gc_alloc_context*, unsigned long, unsigned int) + 160 at /home/tester/runtime/src/coreclr/src/gc/gc.cpp:14490
00007F8C827F8210 00007F8D42D68926 libcoreclr.so!WKS::GCHeap::Alloc(gc_alloc_context*, unsigned long, unsigned int) + 134 at /home/tester/runtime/src/coreclr/src/gc/gc.cpp:14513
00007F8C827F8210 00007F8D42D68908 libcoreclr.so!WKS::GCHeap::Alloc(gc_alloc_context*, unsigned long, unsigned int) + 104 at /home/tester/runtime/src/coreclr/src/gc/gcpriv.h:868
00007F8C827F8250 00007F8D42C11EDF libcoreclr.so!AllocateSzArray(MethodTable*, int, GC_ALLOC_FLAGS) + 319 at /home/tester/runtime/src/coreclr/src/vm/gchelpers.cpp:239
00007F8C827F8250 00007F8D42C11E75 libcoreclr.so!AllocateSzArray(MethodTable*, int, GC_ALLOC_FLAGS) + 213 at /home/tester/runtime/src/coreclr/src/vm/eeconfig.h:404
00007F8C827F82B0 00007F8D42C2FD7F libcoreclr.so!JIT_NewArr1(CORINFO_CLASS_STRUCT_*, long) + 175 at /home/tester/runtime/src/coreclr/src/vm/jithelpers.cpp:0
00007F8C827F82D8 [HelperMethodFrame: 00007f8c827f82d8]
00007F8C827F8420 00007F8CC9AA6AAF System.Numerics.Vectors.Tests.dll!System.Numerics.Tests.QuaternionTests.QuaternionCreateFromYawPitchRollTest2() + 703
...
Just guessing: maybe some registers used by vectorized operations cause issues with GC, or vice versa?
These are registers for the second stack trace:
(lldb) register read -a
General Purpose Registers:
rax = 0x00007f8c9c333058
rbx = 0x0000000000000001
rcx = 0xc5fb88e8f1e002ec
rdx = 0x0000000089717434
rdi = 0x0000000090000001
rsi = 0x0000000000000001
rbp = 0x00007f8c827f8060
rsp = 0x00007f8c827f7d50
r8 = 0x00007f8d4001b098
r9 = 0x00007f8d4001b010
r10 = 0x00007f8c9c403230
r11 = 0x0000000000000000
r12 = 0x00007f8c9bffe000
r13 = 0x00007f8c9c333058
r14 = 0x00007f8c9c333058
r15 = 0x00007f8d430d3a10
rip = 0x00007f8d42d4aa6b libcoreclr.so`WKS::gc_heap::plan_phase(int) + 4635 [inlined] MethodTable::GetBaseSize() at gc.cpp:9544
libcoreclr.so`WKS::gc_heap::plan_phase(int) + 4635 [inlined] WKS::my_get_size(Object*) at gc.cpp:23002
libcoreclr.so`WKS::gc_heap::plan_phase(int) + 4635 at gc.cpp:23002
rflags = 0x0000000000010282
cs = 0x0000000000000033
fs = 0x0000000000000000
gs = 0x0000000000000000
ss = 0x000000000000002b
ds = 0x0000000000000000
es = 0x0000000000000000
eax = 0x9c333058
ebx = 0x00000001
ecx = 0xf1e002ec
edx = 0x89717434
edi = 0x90000001
esi = 0x00000001
ebp = 0x827f8060
esp = 0x827f7d50
r8d = 0x4001b098
r9d = 0x4001b010
r10d = 0x9c403230
r11d = 0x00000000
r12d = 0x9bffe000
r13d = 0x9c333058
r14d = 0x9c333058
r15d = 0x430d3a10
ax = 0x3058
bx = 0x0001
cx = 0x02ec
dx = 0x7434
di = 0x0001
si = 0x0001
bp = 0x8060
sp = 0x7d50
r8w = 0xb098
r9w = 0xb010
r10w = 0x3230
r11w = 0x0000
r12w = 0xe000
r13w = 0x3058
r14w = 0x3058
r15w = 0x3a10
ah = 0x30
bh = 0x00
ch = 0x02
dh = 0x74
al = 0x58
bl = 0x01
cl = 0xec
dl = 0x34
dil = 0x01
sil = 0x01
bpl = 0x60
spl = 0x50
r8l = 0x98
r9l = 0x10
r10l = 0x30
r11l = 0x00
r12l = 0x00
r13l = 0x58
r14l = 0x58
r15l = 0x10
Floating Point Registers:
fctrl = 0x037f
fstat = 0x0000
ftag = 0x0000
fop = 0x0000
fiseg = 0x00000000
fioff = 0x439c4043
foseg = 0x00000000
fooff = 0x85594b20
mxcsr = 0x00001fa7
mxcsrmask = 0x0000ffff
st0 = {0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
st1 = {0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
st2 = {0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
st3 = {0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
st4 = {0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
st5 = {0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xb0 0x02 0x40}
st6 = {0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xd0 0x02 0x40}
st7 = {0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xd0 0x02 0x40}
mm0 = 0x0000000000000000
mm1 = 0x0000000000000000
mm2 = 0x0000000000000000
mm3 = 0x0000000000000000
mm4 = 0x0000000000000000
mm5 = 0xb000000000000000
mm6 = 0xd000000000000000
mm7 = 0xd000000000000000
xmm0 = {0x18 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
xmm1 = {0x00 0x00 0x00 0x00 0x00 0x00 0x30 0x43 0x00 0x00 0x00 0x00 0x00 0x00 0x30 0x45}
xmm2 = {0xb8 0x1e 0x85 0xeb 0x51 0xb8 0xae 0x3f 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
xmm3 = {0x00 0x00 0x00 0x00 0x00 0x00 0xe0 0x43 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
xmm4 = {0xc3 0xff 0xff 0xff 0xff 0xff 0xdf 0xc3 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
xmm5 = {0xb8 0x1e 0x85 0xeb 0x51 0xb8 0xee 0x40 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
xmm6 = {0x48 0x7a 0x7f 0x82 0x8c 0x7f 0x00 0x00 0xc0 0xb8 0x7f 0x82 0x8c 0x7f 0x00 0x00}
xmm7 = {0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
xmm8 = {0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
xmm9 = {0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
xmm10 = {0xfc 0xfc 0xfc 0xfc 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
xmm11 = {0x03 0x03 0x03 0x03 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
xmm12 = {0x07 0x07 0x07 0x07 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
xmm13 = {0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x00}
xmm14 = {0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff}
xmm15 = {0x24 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x24 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
In both the segmentation fault occurs when GC gets triggered by an allocation:
Cc @Maoni0 (although this may well not be a GC issue)
@tmds is it possible to share several dumps with us privately if you didn’t already?
Cc @jeffhandley I have marked this blocking-release, as we can't crash on RedHat CI. I think we need to understand this, and whether it could be widespread, in order to ship.
You can find a coredump here.
To make it work with lldb I need to extract the testhost.tar.gz content to the exact same location it lives on the CI machine: in /home/tester/runtime.
Given that there are tests that report failure prior to the crash in the GC, it seems likely to be a data corruption issue. While the GC is in the stack trace, the crash is happening in MethodTable::GetBaseSize(), which would also seem to point to data corruption.
@CarolEidt agreed. Do you have cycles to look at the dump, or @tannergooding will you?
@danmosemsft - I've taken a look at the changes in #40167, and found what seems to be a bug. @tannergooding has a fix for that in #42089 and we can see if that fixes it. It might explain an issue in MethodTable::GetBaseSize(), as the issue is in one of the paths for determining AVX support, which impacts the size of the Vector<T> type.
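To make the link between AVX detection and the size of Vector<T> concrete, here is a small sketch that prints the actual Vector<T> width next to the width one would expect from Avx2.IsSupported. The class VectorWidthCheck and the helper ExpectedVectorBytes are hypothetical names for this example, and the 32-byte-with-AVX2, 16-byte-otherwise rule is an assumption about the runtime's behavior around the time of this thread; later runtimes may apply different rules or configuration knobs.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics.X86;

public static class VectorWidthCheck
{
    // Hypothetical helper, assuming Vector<T> is 32 bytes when AVX2 is usable and 16 bytes otherwise.
    public static int ExpectedVectorBytes(bool avx2Supported) => avx2Supported ? 32 : 16;

    public static int Main()
    {
        // Vector<byte>.Count equals the size of Vector<T> in bytes.
        int actual = Vector<byte>.Count;
        int expected = ExpectedVectorBytes(Avx2.IsSupported);

        Console.WriteLine($"Vector.IsHardwareAccelerated: {Vector.IsHardwareAccelerated}");
        Console.WriteLine($"Avx2.IsSupported: {Avx2.IsSupported}");
        Console.WriteLine($"Vector<T> size in bytes: {actual} (expected under the assumption above: {expected})");

        if (actual != expected)
        {
            Console.WriteLine("Mismatch: the runtime's idea of the Vector<T> size differs from the AVX2 check.");
        }

        return 0;
    }
}
```

If different parts of the runtime disagreed on this size, objects containing Vector<T> fields could be laid out with one size and read with another, which would be consistent with the data corruption suspected above.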
Oh - good! Perhaps @tmds can try out https://github.com/dotnet/runtime/pull/42089 for us.
Thanks! Our CI machine is happy now! :tada: :tada:
Thanks @CarolEidt
Yes, Thanks @CarolEidt!
I completely missed that one of the cpuInfo was not fixed in https://github.com/dotnet/runtime/pull/40615, a second pair of eyes always helps 🎉
Closing since everything's merged. @CarolEidt please reopen if more work is actually required.
We do a daily build+test run of dotnet/runtime on Fedora 32 and RHEL8. Since Aug 6th, System.Numerics.Vectors.Tests is reporting test failures and crashes with SIGSEGV.
CI output:
When I run on my development machine, the tests pass without crashing.
cc @omajid @RheaAyase