TritonDataCenter / smartos-live

For more information, please see http://smartos.org/ For any questions that aren't answered there, please join the SmartOS discussion list: https://smartos.topicbox.com/groups/smartos-discuss

Hyperthreading smt_enabled not working. XEON E5-2680v2. #849

Open StevenWilliams opened 5 years ago

StevenWilliams commented 5 years ago

So I have a Dell PowerEdge C6220 II with 2 x Xeon E5-2680v2 (20 cores total, 40 threads). I have enabled Hyperthreading in the BIOS, and it works in Linux. However, it does not seem to be working in SmartOS. smt_enabled=true is set in the config, but that option seems to be ignored. I was told on IRC that you might have disabled hyperthreading by default. Is there a way to enable it again?

Boot:
Loading unix...
Loading /platform/i86pc/amd64/boot_archive..., /platform/i86pc/amd64/boot_archive.hash...
Booting...
Sun OS Release 5.11 Version joyent_20190912T024836Z 64-Bit.
Notice: System detected 40 cpus but only 20 cpu(s) were enabled during boot.
Notice: Use "boot-ncpus" parameter to enable more CPU(s).
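The boot notice above carries the two numbers that matter here: how many CPUs were detected and how many were enabled. As a minimal sketch, the function below pulls both out of any text containing that notice. The name `parse_boot_notice` is hypothetical, and the pattern assumes the exact wording quoted in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of SmartOS): read boot messages on stdin and
# print the detected/enabled CPU counts from the notice quoted above.
# Assumes the wording "System detected N cpus but only M cpu(s) were enabled".
parse_boot_notice() {
    sed -n 's/.*System detected \([0-9][0-9]*\) cpus but only \([0-9][0-9]*\) cpu(s) were enabled.*/detected=\1 enabled=\2/p'
}
```

Fed the notice from this report, it should print the 40 detected and 20 enabled CPUs as `detected=40 enabled=20`; lines without the notice produce no output.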

Running psradm -naF had no effect.

[root@dime-dallas1 /var]# prtpicl / (picl, 122100000001) platform (upa, 122100000005) scsi_vhci (devctl, 122100000070) disk (fabric, 12210000008c) multipath (multipath, 1221000000c2) disk (fabric, 1221000000d1) multipath (multipath, 122100000107) pci (pciex, 122100000120) pci1028,518 (obp-device, 122100000134) pci8086,e02 (pciex, 122100000156) pci1028,518 (obp-device, 12210000017e) pci1028,518 (obp-device, 1221000001b0) pci8086,e04 (pciex, 1221000001e2) pci1000,3040 (obp-device, 12210000020b) iport (devctl, 122100000250) smp (smp, 12210000026c) enclosure (scsi, 12210000027d) iport (devctl, 122100000297) pci1028,518 (obp-device, 1221000002ad) pci1028,518 (obp-device, 1221000002d1) pci1028,518 (obp-device, 1221000002f5) pci1028,518 (obp-device, 122100000319) pci1028,518 (obp-device, 12210000033d) pci1028,518 (obp-device, 122100000361) pci1028,518 (obp-device, 122100000385) pci1028,518 (obp-device, 1221000003a9) pci1028,518 (obp-device, 1221000003cd) pci1028,518 (obp-device, 1221000003ef) pci1028,518 (obp-device, 122100000411) pci1028,518 (obp-device, 122100000433) pci8086,1d3e (pciex, 122100000456) pci1028,518 (obp-device, 122100000476) pci1028,518 (obp-device, 122100000496) pci1028,518 (obp-device, 1221000004b6) hub (obp-device, 1221000004e1) hub (obp-device, 12210000050d) device (obp-device, 12210000053c) keyboard (obp-device, 12210000055e) mouse (obp-device, 12210000057b) mouse (obp-device, 122100000598) pci8086,1d10 (pciex, 1221000005b5) pci1a03,1150 (pci, 1221000005dd) display (display, 122100000605) pci1028,518 (obp-device, 12210000062c) hub (obp-device, 122100000657) storage (obp-device, 122100000683) disk (block, 1221000006b1) pci8086,244e (pci, 1221000006e4) isa (isa, 122100000703) asy (serial, 122100000727) i8042 (obp-device, 122100000735) pci1028,518 (obp-device, 12210000073f) pci1028,518 (obp-device, 122100000771) pci (pciex, 1221000007aa) pci1028,518 (obp-device, 1221000007bc) pci1028,518 (obp-device, 1221000007da) pci1028,518 (obp-device, 1221000007f8) 
pci1028,518 (obp-device, 122100000816) pci1028,518 (obp-device, 122100000834) pci1028,518 (obp-device, 122100000852) pci1028,518 (obp-device, 122100000870) pci1028,518 (obp-device, 122100000892) pci1028,518 (obp-device, 1221000008b4) pci1028,518 (obp-device, 1221000008d2) pci1028,518 (obp-device, 1221000008f0) pci1028,518 (obp-device, 12210000090e) pci1028,518 (obp-device, 12210000092c) pci1028,518 (obp-device, 12210000094a) pci1028,518 (obp-device, 122100000968) pci1028,518 (obp-device, 122100000986) pci1028,518 (obp-device, 1221000009a4) pci1028,518 (obp-device, 1221000009c2) pci1028,518 (obp-device, 1221000009e0) pci1028,518 (obp-device, 122100000a02) pci1028,518 (obp-device, 122100000a20) pci1028,518 (obp-device, 122100000a46) pci1028,518 (obp-device, 122100000a6c) pci1028,518 (obp-device, 122100000a92) pci1028,518 (obp-device, 122100000ab8) pci1028,518 (obp-device, 122100000ade) pci1028,518 (obp-device, 122100000b04) pci1028,518 (obp-device, 122100000b26) pci1028,518 (obp-device, 122100000b48) pci1028,518 (obp-device, 122100000b6a) pci1028,518 (obp-device, 122100000b8c) pci1028,518 (obp-device, 122100000bae) pci1028,518 (obp-device, 122100000bd0) pci1028,518 (obp-device, 122100000bf2) pci1028,518 (obp-device, 122100000c14) pci1028,518 (obp-device, 122100000c32) pci1028,518 (obp-device, 122100000c50) pci1028,518 (obp-device, 122100000c6e) pci1028,518 (obp-device, 122100000c8c) pci1028,518 (obp-device, 122100000cae) pci1028,518 (obp-device, 122100000cd0) pci (pciex, 122100000cf2) pci8086,e04 (pciex, 122100000d06) pci8086,e06 (pciex, 122100000d27) pci1170,4c (obp-device, 122100000d50) pci1170,4c (obp-device, 122100000d80) pci1028,518 (obp-device, 122100000db0) pci1028,518 (obp-device, 122100000dd4) pci1028,518 (obp-device, 122100000df8) pci1028,518 (obp-device, 122100000e1c) pci1028,518 (obp-device, 122100000e40) pci1028,518 (obp-device, 122100000e64) pci1028,518 (obp-device, 122100000e88) pci1028,518 (obp-device, 122100000eac) pci1028,518 (obp-device, 
122100000ed0) pci1028,518 (obp-device, 122100000ef2) pci1028,518 (obp-device, 122100000f14) pci1028,518 (obp-device, 122100000f36) pci (pciex, 122100000f59) pci1028,518 (obp-device, 122100000f6b) pci1028,518 (obp-device, 122100000f89) pci1028,518 (obp-device, 122100000fa7) pci1028,518 (obp-device, 122100000fc5) pci1028,518 (obp-device, 122100000fe3) pci1028,518 (obp-device, 122100001001) pci1028,518 (obp-device, 12210000101f) pci1028,518 (obp-device, 122100001041) pci1028,518 (obp-device, 122100001063) pci1028,518 (obp-device, 122100001081) pci1028,518 (obp-device, 12210000109f) pci1028,518 (obp-device, 1221000010bd) pci1028,518 (obp-device, 1221000010db) pci1028,518 (obp-device, 1221000010f9) pci1028,518 (obp-device, 122100001117) pci1028,518 (obp-device, 122100001135) pci1028,518 (obp-device, 122100001153) pci1028,518 (obp-device, 122100001171) pci1028,518 (obp-device, 12210000118f) pci1028,518 (obp-device, 1221000011b1) pci1028,518 (obp-device, 1221000011cf) pci1028,518 (obp-device, 1221000011f5) pci1028,518 (obp-device, 12210000121b) pci1028,518 (obp-device, 122100001241) pci1028,518 (obp-device, 122100001267) pci1028,518 (obp-device, 12210000128d) pci1028,518 (obp-device, 1221000012b3) pci1028,518 (obp-device, 1221000012d5) pci1028,518 (obp-device, 1221000012f7) pci1028,518 (obp-device, 122100001319) pci1028,518 (obp-device, 12210000133b) pci1028,518 (obp-device, 12210000135d) pci1028,518 (obp-device, 12210000137f) pci1028,518 (obp-device, 1221000013a1) pci1028,518 (obp-device, 1221000013c3) pci1028,518 (obp-device, 1221000013e1) pci1028,518 (obp-device, 1221000013ff) pci1028,518 (obp-device, 12210000141d) pci1028,518 (obp-device, 12210000143b) pci1028,518 (obp-device, 12210000145d) pci1028,518 (obp-device, 12210000147f) pseudo (devctl, 1221000014cd) zconsnex (devctl, 1221000014d7) zfdnex (devctl, 1221000014e1) power (power_button, 1221000014eb) ppm (ppm, 1221000014f5) stmf (admin, 122100001585) stmf_sbd (admin, 12210000158d) coretemp (chip0.core0, 
122100001596) imc (mc-imc-0, 12210000159e) xsvc (obp-device, 1221000015a6) cpus (cpus, 1221000015b1) cpu (cpu, 1221000015b9) cpu (cpu, 1221000015d8) cpu (cpu, 1221000015f7) cpu (cpu, 122100001616) cpu (cpu, 122100001635) cpu (cpu, 122100001654) cpu (cpu, 122100001673) cpu (cpu, 122100001692) cpu (cpu, 1221000016b1) cpu (cpu, 1221000016d0) cpu (cpu, 1221000016ef) cpu (cpu, 12210000170e) cpu (cpu, 12210000172d) cpu (cpu, 12210000174c) cpu (cpu, 12210000176b) cpu (cpu, 12210000178a) cpu (cpu, 1221000017a9) cpu (cpu, 1221000017c8) cpu (cpu, 1221000017e7) cpu (cpu, 122100001806) obp (picl, 12210000006d) ramdisk (ramdisk, 122100000116) ioapics (ioapics, 122100000792) ioapic (ioapic, 122100000798) ioapic (ioapic, 1221000007a1) used-resources (used-resources, 1221000014a1) iscsi (iscsi, 1221000014aa) options (options, 1221000014c3)
[root@dime-dallas1 /var]# mdb -ke 'boot_ncpus/D; ncpus/D'
boot_ncpus:
boot_ncpus: 40
ncpus:
ncpus: 20
[root@dime-dallas1 /var]# mdb -ke 'boot_max_ncpus/D'
boot_max_ncpus:
boot_max_ncpus: 40
[root@dime-dallas1 /var]# mdb -ke smt_boot_disable/D
smt_boot_disable:
smt_boot_disable: 0
[root@dime-dallas1 /var]# mdb -ke smt_enabled/D
smt_enabled:
smt_enabled: 0
[root@dime-dallas1 /var]# psrinfo
0       on-line   since 09/24/2019 23:59:02
1       on-line   since 09/24/2019 23:59:04
2       on-line   since 09/24/2019 23:59:04
3       on-line   since 09/24/2019 23:59:04
4       on-line   since 09/24/2019 23:59:04
5       on-line   since 09/24/2019 23:59:04
6       on-line   since 09/24/2019 23:59:04
7       on-line   since 09/24/2019 23:59:04
8       on-line   since 09/24/2019 23:59:04
9       on-line   since 09/24/2019 23:59:04
10      on-line   since 09/24/2019 23:59:04
11      on-line   since 09/24/2019 23:59:04
12      on-line   since 09/24/2019 23:59:05
13      on-line   since 09/24/2019 23:59:05
14      on-line   since 09/24/2019 23:59:05
15      on-line   since 09/24/2019 23:59:05
16      on-line   since 09/24/2019 23:59:05
17      on-line   since 09/24/2019 23:59:05
18      on-line   since 09/24/2019 23:59:05
19      on-line   since 09/24/2019 23:59:05
[root@dime-dallas1 /var]# psrinfo -v
Status of virtual processor 0 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:02.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 1 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:04.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 2 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:04.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 3 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:04.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 4 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:04.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 5 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:04.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 6 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:04.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 7 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:04.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 8 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:04.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 9 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:04.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 10 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:04.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 11 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:04.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 12 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:05.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 13 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:05.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 14 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:05.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 15 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:05.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 16 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:05.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 17 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:05.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 18 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:05.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
Status of virtual processor 19 as of: 09/25/2019 00:15:07
  on-line since 09/24/2019 23:59:05.
  The i386 processor operates at 2800 MHz, and has an i387 compatible floating point processor.
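
Long psrinfo listings like the one above are easier to compare once they are reduced to counts. The sketch below tallies processor states from psrinfo output piped on stdin. The function name `summarize_psrinfo` is hypothetical, and it assumes the plain `<id> <state> since <date> <time>` layout shown in this thread, not the `-v` form.

```shell
#!/bin/sh
# Hypothetical helper (not part of SmartOS): tally CPU states from plain
# `psrinfo` output read on stdin.
# Assumes each processor line looks like: "<id> <state> since <date> <time>".
summarize_psrinfo() {
    awk '
        # Count only lines that start with a numeric CPU id and have
        # "since" as the third field; prompts and other text are skipped.
        $1 ~ /^[0-9]+$/ && $3 == "since" { count[$2]++; total++ }
        END {
            printf "total=%d on-line=%d disabled=%d\n",
                total, count["on-line"], count["disabled"]
        }
    '
}

# Usage on a live system (assumption: psrinfo is on PATH):
#   psrinfo | summarize_psrinfo
```

On this machine the listing shows 20 processors, all on-line and none disabled, while the boot notice reports 40 detected, so the sibling threads are absent from the list altogether.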

timfoster commented 5 years ago

I'm sorry, I don't know the answer to your question; someone else might chime in.

In the meantime, a useful data point might be that even when we have smt_enabled=false in the config, the system should still see that there are hyper-threaded processors available; they'll just appear as 'disabled':

[root@kura ~]# grep smt /usbkey/config
smt_enabled=false
[root@kura ~]# psrinfo
0       on-line   since 09/25/2019 11:32:01
1       on-line   since 09/25/2019 11:32:03
2       on-line   since 09/25/2019 11:32:03
3       on-line   since 09/25/2019 11:32:03
4       disabled  since 09/25/2019 11:32:15
5       disabled  since 09/25/2019 11:32:15
6       disabled  since 09/25/2019 11:32:15
7       disabled  since 09/25/2019 11:32:15
[root@kura ~]#
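
The grep above checks what the config file asks for, which can then be compared against what psrinfo reports. A minimal sketch of that lookup as a reusable function follows. The name `get_smt_setting` is hypothetical, the file is assumed to use simple key=value lines as in the output above, and taking the last match when the key repeats is an assumption rather than documented SmartOS behavior.

```shell
#!/bin/sh
# Hypothetical helper (not part of SmartOS): print the smt_enabled value
# from a key=value config file such as /usbkey/config.
# Prints "unset" when the key is missing.
get_smt_setting() {
    config_file=$1
    # Assumption: if the key appears more than once, use the last occurrence.
    value=$(sed -n 's/^smt_enabled=//p' "$config_file" | tail -n 1)
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf 'unset\n'
    fi
}

# Usage (path taken from the transcript above):
#   get_smt_setting /usbkey/config
```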

Enabling the disabled processors:

[root@kura ~]# psradm -naF
[root@kura ~]# psrinfo
0       on-line   since 09/25/2019 11:32:01
1       on-line   since 09/25/2019 11:32:03
2       on-line   since 09/25/2019 11:32:03
3       on-line   since 09/25/2019 11:32:03
4       on-line   since 09/25/2019 11:33:01
5       on-line   since 09/25/2019 11:33:01
6       on-line   since 09/25/2019 11:33:01
7       on-line   since 09/25/2019 11:33:01
[root@kura ~]#

Attempting to disable them again:

[root@kura ~]# psradm -aS
Failed to disable simultaneous multi-threading: Device busy
[root@kura ~]#

(I wait a bit, then try again)

[root@kura ~]# psradm -aS
[root@kura ~]# psrinfo
0       on-line   since 09/25/2019 11:32:01
1       on-line   since 09/25/2019 11:32:03
2       on-line   since 09/25/2019 11:32:03
3       on-line   since 09/25/2019 11:32:03
4       disabled  since 09/25/2019 11:33:17
5       disabled  since 09/25/2019 11:33:19
6       disabled  since 09/25/2019 11:33:19
7       disabled  since 09/25/2019 11:33:19
[root@kura ~]#

StevenWilliams commented 5 years ago

Doesn't seem to be detected by SmartOS.

[root@dime-dallas1 /var/log]# psrinfo -r smt_enabled
smt_enabled=false

StevenWilliams commented 5 years ago

Looks like I got this fixed (either by enabling UEFI or C-states, going to debug this further to find the cause)...

jlevon commented 5 years ago

On Thu, Sep 26, 2019 at 08:49:16AM -0700, Steven Williams wrote:

> Looks like I got this fixed (either by enabling UEFI or C-states, going to debug this further to find the cause)...

That's interesting, would be good to know what the critical difference is.

thanks
john

StevenWilliams commented 5 years ago

Ok, looks like keeping UEFI but turning off C-states works. However, the video output with UEFI is all scrambled. https://i.postimg.cc/brxnNCjm/Screenshot-from-2019-09-26-14-03-16.png