Closed mesmriti closed 7 years ago
Tested the scenario mentioned and concluded that this is a very exceptional case. From the backend point of view we also check the threads-per-core value when determining the SMT status, so this particular scenario throws an exception: threads per core reports 1 while the output of /proc/cmdline contains smt=2.
As part of the fix we can remove the threads-per-core check when assigning the SMT status and take the status from the /proc/cmdline output alone.
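A minimal sketch of what that fix could look like, assuming the backend reads the kernel command line directly (the function name `get_smt_status` is hypothetical, not the actual Ginger/HVM API):

```python
import re


def get_smt_status(cmdline_path="/proc/cmdline"):
    """Return the smt= value from the kernel command line, or None if absent.

    Illustrative helper for the proposed fix: the SMT status is taken
    from /proc/cmdline alone, without cross-checking the (possibly
    transiently wrong) 'Thread(s) per core' value from lscpu.
    """
    with open(cmdline_path) as f:
        cmdline = f.read()
    match = re.search(r"\bsmt=(\d+)\b", cmdline)
    return int(match.group(1)) if match else None
```

With this approach, disabling a single vCPU no longer changes the reported SMT status, since the command line still says `smt=2`.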
=== Problem Description ===================================
===========================================================
I disabled a vCPU on S231KP11, which was in SMT2 mode before. When I refresh the HVM browser, the error 'GINSMT0010E: Error occurred in fetching smt status' appears. When I open the SMT edit panel, 'Current SMT Settings' is unable to fetch the data and 'Persisted SMT Settings' has changed to SMT1 automatically. So I see two problems here:
Problem 1:
Before I disable 1 vCPU of the system.
```
[root@s231kp11 ~]# lscpu
Architecture:          s390x
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Big Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s) per book:    3
Book(s):               2
NUMA node(s):          1
Vendor ID:             IBM/S390
BogoMIPS:              17006.00
Hypervisor:            PR/SM
Hypervisor vendor:     IBM
Virtualization type:   full
Dispatching mode:      horizontal
L1d cache:             128K
L1i cache:             96K
L2d cache:             2048K
L2i cache:             2048K
NUMA node0 CPU(s):     0-39
```
disable 1 vCPU:
```
[root@s231kp11 ~]# chcpu -d 7
CPU 7 disabled
```
```
[root@s231kp11 ~]# lscpu
Architecture:          s390x
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Big Endian
CPU(s):                8
On-line CPU(s) list:   0-6
Off-line CPU(s) list:  7
Thread(s) per core:    1
Core(s) per socket:    8
Socket(s) per book:    3
Book(s):               2
NUMA node(s):          1
Vendor ID:             IBM/S390
BogoMIPS:              17006.00
Hypervisor:            PR/SM
Hypervisor vendor:     IBM
Virtualization type:   full
Dispatching mode:      horizontal
L1d cache:             128K
L1i cache:             96K
L2d cache:             2048K
L2i cache:             2048K
NUMA node0 CPU(s):     0-39
```
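The flip from 2 to 1 is exactly the value a backend would see if it parsed `lscpu` output. A small illustrative parser (the function name `threads_per_core` is hypothetical) shows how the mismatch arises:

```python
def threads_per_core(lscpu_output):
    """Parse the 'Thread(s) per core' value from `lscpu` output.

    Illustrative only: before `chcpu -d 7` this returns 2; afterwards it
    returns 1, even though the kernel command line still says smt=2.
    That disagreement is what the backend trips over.
    """
    for line in lscpu_output.splitlines():
        if line.startswith("Thread(s) per core:"):
            return int(line.split(":", 1)[1].strip())
    return None  # field not present in the output
```

Comparing this value against the `smt=` boot parameter is the check that, per the comment above, raises GINSMT0010E in this scenario.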
You can see 'Thread(s) per core' changes from 2 to 1 inside the OS. A defect for this problem has already been opened against the Zeus BaseOS image. Here is the link:
https://bugzilla.linux.ibm.com/show_bug.cgi?id=146672
I raise this question here for two purposes: a. Verify it once the BaseOS image problem is fixed. b. What should HVM do while BaseOS has not been fixed yet? Leave the field blank as HVM currently does, or change it to SMT1 as the OS inside does?
Problem 2:
```
[root@s231kp11 ~]# cat /etc/zipl.conf
[defaultboot]
default=4.4.0-45.66.el7_2.kvmibm1_1_3.1.s390x
target=/boot
[4.4.0-45.66.el7_2.kvmibm1_1_3.1.s390x]
image=/boot/vmlinuz-4.4.0-45.66.el7_2.kvmibm1_1_3.1.s390x
parameters="rd.zfcp=0.0.9000,0x5001738030bb0151,0x0001000000000000 rd.lvm.lv=s231kp11-zkvmvg/root root=/dev/mapper/s231kp11--zkvmvg-root vconsole.keymap=us elevator=deadline zfcp.no_auto_port_rescan=0 pci=on zfcp.allow_lun_scan=1 LANG=en_US.utf8 rd.zfcp=0.0.9000,0x5001738030bb0141,0x0001000000000000 vconsole.font=latarcyrheb-sun16 crashkernel=512M smt=2"
ramdisk=/boot/initramfs-4.4.0-45.66.el7_2.kvmibm1_1_3.1.s390x.img
[4.4.0-40.60.el7_2.kvmibm1_1_3.2.s390x]
image=/boot/vmlinuz-4.4.0-40.60.el7_2.kvmibm1_1_3.2.s390x
parameters="rd.zfcp=0.0.9000,0x5001738030bb0151,0x0001000000000000 rd.lvm.lv=s231kp11-zkvmvg/root root=/dev/mapper/s231kp11--zkvmvg-root vconsole.keymap=us elevator=deadline zfcp.no_auto_port_rescan=0 pci=on zfcp.allow_lun_scan=1 LANG=en_US.utf8 rd.zfcp=0.0.9000,0x5001738030bb0141,0x0001000000000000 vconsole.font=latarcyrheb-sun16 crashkernel=512M smt=2"
ramdisk=/boot/initramfs-4.4.0-40.60.el7_2.kvmibm1_1_3.2.s390x.img
[4.4.0-40.60.el7_2.kvmibm1_1_3.1.s390x]
image=/boot/vmlinuz-4.4.0-40.60.el7_2.kvmibm1_1_3.1.s390x
parameters="rd.zfcp=0.0.9000,0x5001738030bb0151,0x0001000000000000 rd.lvm.lv=s231kp11-zkvmvg/root root=/dev/mapper/s231kp11--zkvmvg-root vconsole.keymap=us elevator=deadline zfcp.no_auto_port_rescan=0 pci=on zfcp.allow_lun_scan=1 LANG=en_US.utf8 rd.zfcp=0.0.9000,0x5001738030bb0141,0x0001000000000000 vconsole.font=latarcyrheb-sun16 crashkernel=512M smt=2"
ramdisk=/boot/initramfs-4.4.0-40.60.el7_2.kvmibm1_1_3.1.s390x.img
[3.10.0-229.7.2.el7_1.kvmibm1_1_1.20.s390x]
image=/boot/vmlinuz-3.10.0-229.7.2.el7_1.kvmibm1_1_1.20.s390x
parameters="rd.zfcp=0.0.9000,0x5001738030bb0151,0x0001000000000000 rd.lvm.lv=s231kp11-zkvmvg/root root=/dev/mapper/s231kp11--zkvmvg-root vconsole.keymap=us elevator=deadline zfcp.no_auto_port_rescan=0 pci=on zfcp.allow_lun_scan=1 LANG=en_US.utf8 rd.zfcp=0.0.9000,0x5001738030bb0141,0x0001000000000000 vconsole.font=latarcyrheb-sun16 crashkernel=512M smt=2"
ramdisk=/boot/initramfs-3.10.0-229.7.2.el7_1.kvmibm1_1_1.20.s390x.img
[3.10.0-229.7.2.el7_1.kvmibm1_1_1.16.s390x]
image=/boot/vmlinuz-3.10.0-229.7.2.el7_1.kvmibm1_1_1.16.s390x
parameters="rd.zfcp=0.0.9000,0x5001738030bb0151,0x0001000000000000 rd.lvm.lv=s231kp11-zkvmvg/root root=/dev/mapper/s231kp11--zkvmvg-root vconsole.keymap=us elevator=deadline zfcp.no_auto_port_rescan=0 pci=on zfcp.allow_lun_scan=1 LANG=en_US.utf8 rd.zfcp=0.0.9000,0x5001738030bb0141,0x0001000000000000 vconsole.font=latarcyrheb-sun16 crashkernel=512M smt=2"
ramdisk=/boot/initramfs-3.10.0-229.7.2.el7_1.kvmibm1_1_1.16.s390x.img
[root@s231kp11 ~]#
```
You can see 'smt=2' in every entry, yet the HVM 'Persisted SMT Settings' has already changed to SMT1. But when I restart S231KP11, the HVM persisted setting changes back to SMT2, not the SMT1 it showed before the restart.
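Since every `parameters=` line in zipl.conf still carries `smt=2`, a persisted setting derived from the boot configuration should remain SMT2. A small illustrative parser (the function name `persisted_smt` is hypothetical, not the actual HVM code) makes this concrete:

```python
import re


def persisted_smt(zipl_conf_text):
    """Return the smt= value from the first parameters= line of a
    zipl.conf, or None if no entry carries one.

    Illustrative sketch: for the configuration shown above, every boot
    entry still contains smt=2, so the persisted setting should read
    SMT2 regardless of the current online-CPU topology.
    """
    for line in zipl_conf_text.splitlines():
        if "parameters=" in line:
            match = re.search(r"\bsmt=(\d+)\b", line)
            if match:
                return int(match.group(1))
    return None
```

This is consistent with the behavior after reboot (the setting returns to SMT2), which suggests the pre-reboot SMT1 display came from the live topology rather than from the persisted boot parameters.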