open-power-host-os / qemu

OpenPOWER Host OS qemu repository

qemu crashes when hotplug/unplug is done continuously, with error spapr_drc_detach: assertion failed: (drc->dev) #9

Closed nasastry closed 6 years ago

nasastry commented 7 years ago

QEMU crashes with the error "hw/ppc/spapr_drc.c:417:spapr_drc_detach: assertion failed: (drc->dev)" when memory hotplug and hotunplug are performed continuously.

Steps to reproduce:

  1. Bring up a ppc64le guest with memory hotplug capabilities (I used libvirt XML to do this).

  2. Repeatedly hotplug and unplug memory using the following memory XML (mem_hp_8g.xml):

    <memory model='dimm'>
      <target>
        <size unit='KiB'>8388608</size>
        <node>1</node>
      </target>
    </memory>

  3. Run the following loop, where <domain> is the guest name:

    for i in $(seq 1 100); do virsh attach-device <domain> mem_hp_8g.xml --live; virsh detach-device <domain> mem_hp_8g.xml --live; done

  4. The guest crashes.

  5. The following is from the QEMU log:

    2017-10-09 06:10:38.514+0000: starting up libvirt version: 3.6.0, package: 3.rel.gitdd9401b.el7.centos (Unknown, 2017-09-22-23:37:19, host-os-jenkins-slave02.aus.stglabs.ibm.com), qemu version: 2.10.0, hostname: zzfp365-lp1.aus.stglabs.ibm.com
    LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -name guest=virt-tests-vm1-nrs,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-2-virt-tests-vm1-nrs/master-key.aes -machine pseries-2.10,accel=kvm,usb=off,dump-guest-core=off -m size=8388608k,slots=32,maxmem=138412032k -realtime mlock=off -smp 8,sockets=8,cores=1,threads=1 -numa node,nodeid=0,cpus=0-3,mem=4096 -numa node,nodeid=1,cpus=4-7,mem=4096 -object memory-backend-ram,id=memdimm0,size=4294967296 -device pc-dimm,node=0,memdev=memdimm0,id=dimm0,slot=0,addr=8589934592 -uuid 7c37594a-8052-4499-912a-7555033435cf -display none -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-2-virt-tests-vm1-nrs/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device pci-ohci,id=usb,bus=pci.0,addr=0x2 -device spapr-vscsi,id=scsi0,reg=0x2000 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive file=/home/nasastry/hostos-3.0-ppc64le.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0 -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:4a:4b:4c,bus=pci.0,addr=0x1 -chardev pty,id=charserial0 -device spapr-vty,chardev=charserial0,reg=0x30000000 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-2-virt-tests-vm1-nrs/vioser-00-00-01.sock,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0 -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channel/target/domain-2-virt-tests-vm1-nrs/vioser-00-00-02.sock,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -msg timestamp=on
    2017-10-09T06:10:38.617661Z qemu-system-ppc64: -chardev pty,id=charserial0: char device redirected to /dev/pts/4 (label charserial0)
    **
    ERROR:/builddir/build/BUILD/qemu/hw/ppc/spapr_drc.c:417:spapr_drc_detach: assertion failed: (drc->dev)
    2017-10-09 06:16:39.979+0000: shutting down, reason=crashed

Mirrored with LTC bug #159863

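
The one-line loop in step 3 keeps going even after a command fails, which makes it hard to see on which iteration things went wrong. A minimal sketch of a stricter variant follows. The function name `hotplug_loop` and its optional iteration count and settle delay are hypothetical helpers for illustration, not part of the original report; the only external commands assumed are `virsh attach-device` and `virsh detach-device` with a domain argument.

```shell
# Hypothetical helper: cycle a device XML through attach/detach and stop
# at the first virsh command that fails.
# Usage: hotplug_loop <domain> <device.xml> [iterations] [settle_seconds]
hotplug_loop() {
    local domain="$1" xml="$2" count="${3:-100}" settle="${4:-0}"
    local i=1
    while [ "$i" -le "$count" ]; do
        if ! virsh attach-device "$domain" "$xml" --live; then
            echo "attach failed on iteration $i" >&2
            return 1
        fi
        sleep "$settle"
        if ! virsh detach-device "$domain" "$xml" --live; then
            echo "detach failed on iteration $i" >&2
            return 1
        fi
        sleep "$settle"
        i=$((i + 1))
    done
    echo "completed $count attach/detach cycles"
}
```

For the guest in the log above this could be invoked as `hotplug_loop virt-tests-vm1-nrs mem_hp_8g.xml 100`. A non-zero return only says that a virsh command failed; the QEMU log still has to be checked to confirm that the failure was the `spapr_drc_detach` assertion.
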
nasastry commented 7 years ago

guest_xml_qemu_9.txt

nasastry commented 7 years ago

Link to the fix by Daniel: https://lists.gnu.org/archive/html/qemu-ppc/2017-10/msg00221.html

cdeadmin commented 7 years ago

------- Comment From danielhb@br.ibm.com 2017-10-10 09:18:31 EDT-------

FYI: this bug was reported by Nageswara here:

https://bugs.launchpad.net/qemu/+bug/1718118

He sent me the link privately on Slack, asking me to have a look at it. I've fixed it with

The patch was accepted and is now waiting to be pushed upstream.

Thanks,

Daniel

cdeadmin commented 6 years ago

------- Comment From jamesspo@us.ibm.com 2017-11-15 11:15:55 EDT-------

This is queued up for qemu 2.11, so moving to sprint5 where that is expected to be merged in.

cdeadmin commented 6 years ago

------- Comment From bssrikanth@in.ibm.com 2017-12-20 04:38:41 EDT-------

I did some tests around this one; findings below:

  1. Brought up a guest with 8GB mem, 32GB MaxMem

  2. Hotplugged 8GB -> PASS

    [root@ltczzj3 srikanth]# virsh attach-device srikanth-vm1 mem_hp.xml --live
    Device attached successfully

  3. Hotunplugged 8GB -> FAIL

    [root@ltczzj3 srikanth]# virsh detach-device srikanth-vm1 mem_hp.xml --live
    Device detached successfully

In guest:

    [root@localhost ~]# [ 76.251188] pseries-hotplug-mem: Attempting to hot-remove 32 LMB(s) at 80000020
    [ 76.252805] pseries-hotplug-mem: Memory indexed-count-remove failed, adding any removed LMBs
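
The "32 LMB(s)" in the guest message lines up with the 8GB DIMM if each logical memory block is 256 MiB. The sketch below works through that arithmetic. The 256 MiB block size is an assumption stated here rather than something the log prints, and `lmb_count` is a hypothetical helper name.

```shell
# Assumption: one LMB is 256 MiB (the pseries memory block size).
LMB_SIZE_KIB=$((256 * 1024))

# Print how many LMBs a DIMM of the given size (in KiB) spans.
# Fails if the size is not a whole number of LMBs.
lmb_count() {
    local dimm_kib="$1"
    if [ $((dimm_kib % LMB_SIZE_KIB)) -ne 0 ]; then
        echo "size $dimm_kib KiB is not a multiple of the LMB size" >&2
        return 1
    fi
    echo $((dimm_kib / LMB_SIZE_KIB))
}

lmb_count 8388608   # 8 GiB DIMM, prints 32
```

So the guest was asked to release every LMB of the hotplugged DIMM in a single request, and the "indexed-count-remove failed" line indicates that it could not free all of them and put back the ones it had already removed.
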

Later, on the host, running the hot unplug again gives the error below:

    [root@ltczzj3 srikanth]# virsh detach-device srikanth-vm1 mem_hp.xml --live
    error: Failed to detach device from mem_hp.xml
    error: internal error: unable to execute QEMU command 'device_del': Memory unplug already in progress for device dimm0
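
In the runs above, `virsh detach-device` printed "Device detached successfully" even though the guest refused the removal, so the command's output alone does not appear to show whether the DIMM is really gone. One possible check, sketched below, is to poll the live domain XML until the number of DIMM devices drops. This assumes libvirt removes the `<memory model='dimm'>` element from `virsh dumpxml` output only once the unplug has completed, which the thread does not confirm; `count_dimms` and `wait_for_unplug` are hypothetical helper names.

```shell
# Count DIMM <memory> devices in a domain XML read from stdin.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_dimms() {
    grep -c "<memory model='dimm'" || true
}

# Poll the live XML once per second until the DIMM count drops below the
# count seen before the detach, or give up after the timeout.
# Usage: wait_for_unplug <domain> <count_before_detach> [timeout_seconds]
wait_for_unplug() {
    local domain="$1" before="$2" timeout="${3:-60}" waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if [ "$(virsh dumpxml "$domain" | count_dimms)" -lt "$before" ]; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

A test script could record `virsh dumpxml srikanth-vm1 | count_dimms` before the detach, issue the detach, and then treat a non-zero return from `wait_for_unplug` as a failed hotunplug instead of retrying the detach straight away.
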

Is this bug fixed in QEMU 2.11, or should we postpone verification of this bug to the sprint where we actually claim support for memory hotunplug?