Closed liujing2 closed 5 years ago
Hi @grahamwhaley @jcvenegas @devimc, sorry for disturbing you. I'm not sure who the relevant maintainers are, so I'm pinging you for some help. :) Alibaba reported this issue to us and I tried to fix it. Thanks! Jing
No problem @liujing2 - that's what we are here for :-) I think @devimc will have the most input on PCI bus issues. In the meantime, you will need to do a little rework to meet our PR format requirements. You need to open an Issue and put details there about the problem, and then in your commit you will need a:
Fixes: nnn
line referencing the Issue number. See https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md#patch-format for more details.
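For illustration only (the subsystem prefix, subject line, and issue number below are all hypothetical, not taken from this PR), a commit message in that format might look like:

```
pci: fix hotplug on pci bridge without firmware

Hotplug onto the PCI bridge fails when the guest is booted with nofw.
Explain here what the patch changes and why.

Fixes: #123

Signed-off-by: Your Name <you@example.com>
```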
@grahamwhaley Thanks for your suggestion! I will improve the commit message after @devimc reviews. So do I still need a new issue then?
@liujing2 - notionally, yes, you should have an Issue open in this repo you can tie the reference in the PR to - but - it doesn't look like we have Issues enabled in this repo maybe - I cannot see an Issues tab or how to make one. I think that is because maybe we have not landed any PRs in this repo since we enabled full static CI checking.... @jodh-intel @chavafg wdyt? I suspect we need to enable Issues for this repo....?
I've just enabled issues.
Note that since this repo is just a clone of the qemu one, it doesn't have any of the normal Kata infrastructure+process around it (in fact, even the Travis config seems to be from upstream).
lgtm, but @liujing2 please fix the CI
@devimc I looked into the CI results. It reports a "Build Failed" error which says:
hw/pci/pci.c:2215:pci_add_option_rom: Object 0x561a37d1d270 is not an instance of type generic-pc-machine
Broken pipe
Locally I tested both pc and q35, and they work well. I dug into the code and found that the failure is probably not caused by this patch. The reported code is as follows.
static void pci_add_option_rom(PCIDevice *pdev, bool is_default_rom,
                               Error **errp)
{
    PCMachineState *pcms = PC_MACHINE(qdev_get_machine()); // hw/pci/pci.c:2215
PC_MACHINE() reports the issue, but this is strange, because this line of code is used in many places and the parameter simply comes from qdev_get_machine(). So could I suggest running the CI without my patch to first double-check whether the original code also has this issue? Then we can look into whether the issue is caused by this line of code or by the CI itself.
I'm rather confused by the Travis config for this repo...
Clicking the "Details" link above takes you to a github page summarising the Travis build. At the bottom of that page is a link to the build at https://travis-ci.com/kata-containers/qemu/builds/92956237. Note that the URL contains travis-ci.com (paid for version), not travis-ci.org (free version).
I wonder if this is an accident based on us using the vanilla upstream .travis.yml file? Anyone know anything about this?
The summary is that I don't have permissions to run the CI for the kata-containers:qemu-lite-2.11.0 branch on Travis.
afaict, what is happening is that the 'kata containers' project (top level) has Travis enabled as a 'github app', and this, it appears, ties into the qemu repo's 'integrations and services'. That makes it automatically pick up and process the .travis.yml file we inherited from the qemu fork. Currently Travis for kata is configured to access 'all repositories'. I think we can probably fix this by ticking the 'only selected repositories' tickbox under the kata Travis setup, and then only selecting the ones we want to process :-) I can give that a go if we agree that is probably the right course of action...
Well, we do want qemu to be tested with Travis I think, but I don't believe we should be using travis-ci.com.
OK, this is indeed somewhat odd. Having a further prod around, it looks like we can also en/disable github repo processing on a per-repo basis on the Travis end as well. afaict, the qemu repo is not enabled for kata on travis-ci.org - which makes me suspect maybe somebody has enabled it on travis-ci.com using their Travis account? I don't have a travis-ci.com account - if anybody here does in the kata group, can they go and check if qemu has been enabled?? Most odd - afaik we don't have a chargeable account for kata with travis-ci - so, who just paid for that build? :-)
@jodh-intel - I would suggest we disable our qemu repo from travis builds at the github end right now until we sort out the .travis.yml and set up which branches get tested. The travis-ci.com access may have been set up for one of the other architectures possibly?
/cc @Pennyzct and @nitkon, @sameo, @sboeuf, @bergwolf.
Ugh - we seem to have hit a github/travis bug, or possibly we just need someone with a travis-ci.com account - I've changed the kata project settings to only use Travis for a specified list of kata repos - basically all repos with a .travis.yml, excluding qemu. However, although those settings appear to be "sticking" in github, if I click "save", I get redirected... to travis-ci.com, not travis-ci.org.
Could you try saving @grahamwhaley and see if you see the same behaviour?
@jodh-intel - well, I went to follow the link path ... I think you have made a change - now when I go to the 'integrations and services' on our kata/qemu on github, travis is no longer listed. So, I think it has been dropped on the github side at least... We could now try maybe enabling at https://travis-ci.org/kata-containers and see what happens??
Yes, I noticed that. You should be able to see what I'm talking about if you click "Settings" for https://github.com/kata-containers/
Right, that 'save' takes me to .com as well. Maybe you can do it from the 'other end' though - if you go to https://travis-ci.org/kata-containers then you can select which repos are processed. QEMU is not ticked there - but maybe we can tick it there and enable travis-ci.org tracking.
I don't seem to be able to do that - I can't see any branches apart from master (which we don't actually want).
Maybe once we've identified who has the travis-ci.com a/c, they can help resolve this.
I've got a vague feeling that Travis may auto-detect branches when the server detects a change, so we could try tweaking the .travis.yml maybe (to specify the branches we do or don't want to build).
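If we go that route, a minimal sketch of the relevant .travis.yml stanza (the branch name is taken from this thread; whether we want a safelist or blocklist is still open) would be:

```
branches:
  only:
    - qemu-lite-2.11.0
```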
Thanks @sboeuf for reviewing and approving. @devimc could I ask what more I need to do for this PR, since the CI doesn't seem ready yet?
thanks @liujing2
Thanks @devimc
@liujing2 I found a strange issue here, maybe it's related to this nofw feature?
I need to hotplug a qcow2 disk on pci-bridge-0. The qmp command returns nothing (which is ok), but there's no response in the guest OS (no related log in dmesg, no block device generated). I tried booting the VM without nofw, and the hotplug works as expected.
Below are the QMP commands:
(QEMU) blockdev-add cache={"direct":true,"no-flush":false} driver=qcow2 file={"driver":"file","filename":"/tmp/testhp.img"} node-name=drive-testhp
{"return": {}}
(QEMU) device_add bus=pci-bridge-0 drive=drive-testhp driver=virtio-blk-pci id=virtio-drive-testhp share-rw=on
{"return": {}}
(QEMU) query-block
{"return": [{"locked": false, "type": "unknown", "qdev": "/machine/peripheral/virtio-drive-testhp/virtio-backend", "removable": false, "device": "", "inserted": {"bps_rd": 0, "ro": false, "backing_file_depth": 0, "encrypted": false, "image": {"cluster-size": 65536, "format": "qcow2", "filename": "/tmp/testhp.img", "virtual-size": 104857600, "dirty-flag": false, "format-specific": {"data": {"compat": "1.1", "refcount-bits": 16, "corrupt": false, "lazy-refcounts": false}, "type": "qcow2"}, "actual-size": 200704}, "cache": {"no-flush": false, "writeback": true, "direct": true}, "bps_wr": 0, "drv": "qcow2", "node-name": "drive-testhp", "bps": 0, "iops": 0, "write_threshold": 0, "file": "/tmp/testhp.img", "iops_rd": 0, "encryption_key_missing": false, "detect_zeroes": "off", "iops_wr": 0}}]}
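For what it's worth, the same sequence can be scripted with only the Python standard library. This is a sketch, not tested against a live VM; it assumes the VM exposes a QMP unix socket at /tmp/qemu (adjust to your -qmp option), and note that in JSON QMP share-rw is a boolean rather than the "on" used in the shorthand syntax above.

```python
# Sketch: replay the hotplug sequence over QEMU's QMP socket (stdlib only).
# QMP is newline-delimited JSON; the server sends a greeting on connect and
# requires qmp_capabilities negotiation before any other command.
import json
import socket

def qmp_cmd(name, arguments=None):
    """Encode one QMP command as a JSON line."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    return (json.dumps(msg) + "\n").encode()

def hotplug_qcow2(sock_path="/tmp/qemu"):
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(sock_path)
    f = s.makefile("rwb")
    f.readline()                                   # greeting: {"QMP": {...}}
    for name, args in [
        ("qmp_capabilities", None),
        ("blockdev-add", {"driver": "qcow2", "node-name": "drive-testhp",
                          "cache": {"direct": True, "no-flush": False},
                          "file": {"driver": "file",
                                   "filename": "/tmp/testhp.img"}}),
        ("device_add", {"driver": "virtio-blk-pci", "bus": "pci-bridge-0",
                        "drive": "drive-testhp", "id": "virtio-drive-testhp",
                        "share-rw": True}),
    ]:
        f.write(qmp_cmd(name, args))
        f.flush()
        print(name, "->", f.readline().decode().strip())
```

Call hotplug_qcow2() with the VM running to issue the three commands and print each reply.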
I don't know if it is appropriate to comment here, but any advice and suggestions will be greatly appreciated.
Thanks
other folks who might have thoughts: /cc @devimc @markdryan @rbradford
@YvesChan in your guest kernel you have to enable
- CONFIG_HOTPLUG_PCI_SHPC
- CONFIG_HOTPLUG_PCI_ACPI
in q35 you have to re-scan the pci bus
echo 1 > /sys/bus/pci/rescan
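The checks above can be scripted from inside the guest; this is a sketch assuming the guest kernel was built with CONFIG_IKCONFIG_PROC so that /proc/config.gz exists (otherwise point it at a saved config file):

```python
# Sketch: verify the PCI hotplug options are built in, then offer the
# rescan workaround. Paths are the standard Linux ones.
import gzip
import os

def cfg_enabled(config_text, option):
    """True if `option` is built in (=y) in the given kernel config text."""
    return any(line.strip() == option + "=y"
               for line in config_text.splitlines())

def rescan_pci():
    """Force the kernel to re-probe the PCI buses (requires root)."""
    with open("/sys/bus/pci/rescan", "w") as f:
        f.write("1")

if __name__ == "__main__":
    if os.path.exists("/proc/config.gz"):
        text = gzip.open("/proc/config.gz", "rt").read()
        for opt in ("CONFIG_HOTPLUG_PCI_ACPI", "CONFIG_HOTPLUG_PCI_SHPC"):
            print(opt, "enabled" if cfg_enabled(text, opt) else "NOT enabled")
```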
@devimc Thanks for your reply!
I have checked the config file of the guest kernel and made sure CONFIG_HOTPLUG_PCI_ACPI is set to "y", but CONFIG_HOTPLUG_PCI_SHPC is not set. Is that the reason why the guest kernel didn't bring the device up automatically? So in this nofw case, which hotplug device model is being used? ACPI-based or SHPC-based?
Though I'm using the pc machine type, I tried running echo 1 > /sys/bus/pci/rescan after device_add, and the block device just came up! dmesg shows this info:
[ 64.556966] clr: pci_setup_device start
[ 64.557098] pci 0000:01:00.0: [1af4:1001] type 00 class 0x010000
[ 64.557238] clr: pci_read_bases start
[ 64.557284] clr: Starting probe for 0000:01:00.0
[ 64.557374] pci 0000:01:00.0: reg 0x10: [io 0x0000-0x003f]
[ 64.557431] clr: Starting probe for 0000:01:00.0
[ 64.557527] pci 0000:01:00.0: reg 0x14: [mem 0x00000000-0x00000fff]
[ 64.557599] clr: pci_read_bases end
[ 64.557662] clr: pci_setup_device end
[ 64.559606] pci 0000:00:02.0: BAR 8: assigned [mem 0x80100000-0x801fffff]
[ 64.559688] pci 0000:00:02.0: BAR 7: assigned [io 0x2000-0x2fff]
[ 64.559759] pci 0000:01:00.0: BAR 1: assigned [mem 0x80100000-0x80100fff]
[ 64.559857] pci 0000:01:00.0: BAR 0: assigned [io 0x2000-0x203f]
[ 64.559937] pci 0000:00:02.0: PCI bridge to [bus 01]
[ 64.559998] pci 0000:00:02.0: bridge window [io 0x2000-0x2fff]
[ 64.560811] pci 0000:00:02.0: bridge window [mem 0x80100000-0x801fffff]
[ 64.562356] pci 0000:00:02.0: enabling device (0000 -> 0003)
[ 64.587635] ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
[ 64.588205] virtio-pci 0000:01:00.0: enabling device (0000 -> 0003)
[ 64.613224] virtio-pci 0000:01:00.0: virtio_pci: leaving for legacy driver
@YvesChan
Is it the reason why guest kernel didn't bring the device up automatically?
you are using nofw, so it's a bug, I guess related to qemu ACPI tables
So in this nofw case, which hotplug device model is being used? ACPI-based or SHPC-based?
it depends on the machine type and the bridge: q35/pc + pci-bridge = ACPI; q35 + pcie-pci-bridge = SHPC
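For reference, the two setups map to command-line fragments along these lines (illustrative; the device names are real QEMU devices, but the ids are examples, and pcie-pci-bridge requires a reasonably recent QEMU):

```
# ACPI-based hotplug: conventional PCI bridge on pc or q35
-device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1

# SHPC-based hotplug: PCIe-to-PCI bridge, q35 only
-device pcie-pci-bridge,id=pcie-bridge-0
```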
@devimc
so it's a bug, I guess related to qemu ACPI tables
Got it. Hope this bug will be fixed soon. :)
Thanks
@YvesChan Got your request and I will take time to reproduce it. Could you give your qemu command (for launching the guest) instead of the kata commands, so I can quickly reproduce it? Thanks!
@liujing2
I'm using a script to boot the VM; it's pretty easy to reproduce this issue.
$QEMU_BIN -machine pc,accel=kvm,kernel_irqchip,nvdimm,nofw \
-cpu host -m 2G,slots=2,maxmem=100G -smp 2,sockets=1,cores=2,threads=1 -numa node,nodeid=0,cpus=0-1 \
-device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on \
-kernel $KERNEL -append "$KERNEL_CMD" \
-no-user-config -nodefaults \
-object memory-backend-file,id=mem0,share,mem-path=$IMAGE,size=$IMAGE_SIZE -device nvdimm,memdev=mem0,id=nv0 \
-device virtio-serial-pci,id=virtio-serial0 \
-chardev stdio,id=charconsole0 -device virtconsole,chardev=charconsole0,id=console0 \
-nographic -qmp unix:/tmp/qemu,server,nowait \
-device virtserialport,chardev=metricagent,id=channel0,name=agent.channel.0 \
-chardev socket,id=metricagent,path=/tmp/metric.agent.channel.sock,nowait,server
root=/dev/pmem0p1 rw tsc=reliable no_timer_check noreplace-smp init=/usr/lib/systemd/systemd rootflags=dax rw rootfstype=ext4 rcupdate.rcu_expedited=1 console=ttyS0 console=hvc0 console=hvc1 ignore_loglevel
After VM boot up, use QMP to send commands:
(QEMU) blockdev-add cache={"direct":true,"no-flush":false} driver=qcow2 file={"driver":"file","filename":"/tmp/testhp.img"} node-name=drive-testhp
{"return": {}}
(QEMU) device_add bus=pci-bridge-0 drive=drive-testhp driver=virtio-blk-pci id=virtio-drive-testhp share-rw=on
{"return": {}}
(QEMU) query-block
{"return": [{"locked": false, "type": "unknown", "qdev": "/machine/peripheral/virtio-drive-testhp/virtio-backend", "removable": false, "device": "", "inserted": {"bps_rd": 0, "ro": false, "backing_file_depth": 0, "encrypted": false, "image": {"cluster-size": 65536, "format": "qcow2", "filename": "/tmp/testhp.img", "virtual-size": 104857600, "dirty-flag": false, "format-specific": {"data": {"compat": "1.1", "refcount-bits": 16, "corrupt": false, "lazy-refcounts": false}, "type": "qcow2"}, "actual-size": 200704}, "cache": {"no-flush": false, "writeback": true, "direct": true}, "bps_wr": 0, "drv": "qcow2", "node-name": "drive-testhp", "bps": 0, "iops": 0, "write_threshold": 0, "file": "/tmp/testhp.img", "iops_rd": 0, "encryption_key_missing": false, "detect_zeroes": "off", "iops_wr": 0}}]}
Thanks!
@YvesChan can you try disabling shpc?
@YvesChan can you try disabling shpc?
@devimc how do I disable shpc? Did you mean the kernel config CONFIG_HOTPLUG_PCI_SHPC or the qemu boot args?
I tried all the methods using the "nofw" feature described in your documentation (https://gist.github.com/devimc/e9fd533e52b08387f1df65df8b19e038), but none of them seems to work without a pci/pcie rescan.
Just curious about the rescan operation: is it an expected behavior? There should be a hardware interrupt while hotplugging a device, so the guest OS can handle it automatically.
how to disable shpc?
-device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=off
is it an expected behavior?
no
There should be a hardware interrupt while hot plugging device, so the guest OS can handle it automatically
you're right
@devimc I have tried the shpc=off option and it doesn't work :(
rescan is still the only way to enable the qcow2 device
@YvesChan Did you try normal (with fw) hotplug on pci-bridge-0? It can work well?
@YvesChan Did you try normal (with fw) hotplug on pci-bridge-0? It can work well?
Yes, vmlinuz without "nofw" works well. I think the key part is how to deliver the hardware interrupt to the guest kernel while hotplugging the virtio-blk device.
@YvesChan Actually my colleague and I found that we still need a rescan even with fw. This holds for all pc-platform hotplug, not only virtio-blk: virtio-net-pci needs a rescan too, even on pci.0, and with upstream qemu v3.1.0. So I am curious whether I missed something...
@liujing2 That's odd. Like @devimc said, rescan should NOT be a necessary operation.
@YvesChan @devimc I found that kata-containers/vmlinuz-4.14.22.1-128.1.container is a kernel that can hotplug without a rescan. Right now I only have limited time to test w/ firmware. After CHY, I'll continue. BTW, @YvesChan could you send me your kernel config file via gist or to my email jing2.liu@intel.com? That kernel cannot do the nofw test and we need to know the key difference. Thanks and happy new year!
@liujing2 That would be great. I'll send you the kernel config file to your email as soon as possible. Happy new year!
@YvesChan I didn't get your email. Have you already sent it? I'm just afraid the address was wrong.
@liujing2 Sorry for the delay. I have just sent you the config file. If you find any clues, please let me know. Thanks!
@YvesChan Hi, sorry for the late reply this week; I spent some time debugging this issue. I finally figured out why the guest cannot receive the interrupt in the nofw case: pci-bridge-0 doesn't get any window space allocated (nofw):
00:06.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge (prog-if 00 [Normal decode])
I/O behind bridge: None
Memory behind bridge: None
Prefetchable memory behind bridge: None
So we need to fix this issue first, or else we cannot use the hotplugged device. I will be a little busy these days and would be grateful if you have fixes for this. I think this is probably because the pci-bridge window space is allocated by the BIOS. I will also try to squeeze out time for this issue myself.
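A quick way to check this condition from a script is to parse the `lspci -vv` output for the bridge (e.g. `lspci -vv -s 00:06.0` in the guest, the bridge address from the paste above). This is a sketch; the function only inspects the "behind bridge" lines, where "None" means devices behind the bridge are unreachable:

```python
# Sketch: report whether a bridge's I/O / memory windows were assigned,
# given the text of `lspci -vv` for that bridge.
def bridge_windows(lspci_text):
    """Map window label -> True if assigned, False if 'None'."""
    result = {}
    for line in lspci_text.splitlines():
        line = line.strip()
        if "behind bridge:" in line:
            label, _, value = line.partition(" behind bridge:")
            result[label] = value.strip() != "None"
    return result
```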
@liujing2 Thanks for your reply! I don't have much experience on BIOS/PCI but I'll try it out.
In the no-firmware case, hotplug onto pci.0 fails with "Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel'" on the pc platform. This patch fixes the issue to enable hotplug.