open-power-host-os / qemu

OpenPOWER Host OS qemu repository

qemu does not default to host capabilities for cap-{sbbc,cfpc,ibs} and needs clarity on use-case scenarios of these machine properties #35

Closed sathnaga closed 6 years ago

sathnaga commented 6 years ago
Mirrored with LTC bug https://bugzilla.linux.ibm.com/show_bug.cgi?id=164306

Currently the machine properties cap-{sbbc,cfpc,ibs} default to `broken`, regardless of whether the host has the workaround or is broken.

Host:
```
Model: 2.1 (pvr 004b 0201)
Model name: POWER8E (raw), altivec supported
FW: fips861/b0103b_1801.861( Has workaround for side channel)
Kernel: 4.15.0-3.dev.gitd34a158.el7.centos.ppc64le
qemu: qemu-2.11.50-1.dev.gita815ffa.el7.centos.ppc64le

Machine details from dmesg:
dmesg | grep -e 'pSeries machine' -e 'OPAL detected' -e rfi-fixups -e rfi-flush
[ 0.000000] opal: OPAL detected !
[ 0.000000] rfi-flush: Using ori type flush
[ 0.000000] rfi-flush: patched 9 locations
```

Guest: `Kernel: 4.15.0-3.dev.gitd34a158.el7.centos.ppc64le`

Just boot a default guest:
```
/usr/bin/qemu-kvm -M pseries -enable-kvm -nographic -serial /dev/pts/4 -monitor stdio /home/sath/hostos-ppc64le.qcow2
QEMU 2.11.50 monitor - type 'help' for more information
(qemu)

Guest serial:
[root@localhost ~]# dmesg|grep "\-flush"
[ 0.000000] rfi-flush: Using fallback displacement flush
[ 0.000000] rfi-flush: patched 9 locations
```

whereas booting explicitly with those side-channel capabilities gives:
```
# /usr/bin/qemu-kvm -M pseries,cap-sbbc="workaround",cap-cfpc="workaround" -enable-kvm -nographic -serial /dev/pts/4 -monitor stdio /home/sath/hostos-ppc64le.qcow2
QEMU 2.11.50 monitor - type 'help' for more information
(qemu)

Guest serial:
# dmesg|grep "\-flush"
[ 0.000000] rfi-flush: Using ori type flush
[ 0.000000] rfi-flush: Using mttrig type flush
[ 0.000000] rfi-flush: patched 9 locations
```

--------------------------

2) Booting with Indirect branch serialisation apart from `broken` fails to start guest?
```
# /usr/bin/qemu-kvm -M pseries,cap-ibs="workaround" -enable-kvm -nographic -serial /dev/pts/4 -monitor stdio
QEMU 2.11.50 monitor - type 'help' for more information
(qemu) qemu-system-ppc64: Requested safe indirect branch capability level not supported by kvm, try a different value for cap-ibs

# /usr/bin/qemu-kvm -M pseries,cap-ibs="fixed" -enable-kvm -nographic -serial /dev/pts/4 -monitor stdio
QEMU 2.11.50 monitor - type 'help' for more information
(qemu) qemu-system-ppc64: Requested safe indirect branch capability level not supported by kvm, try a different value for cap-ibs
```

Without the FW fix, qemu-kvm does not allow the user to boot with `workaround` for cfpc and sbbc either. This was tested with older host FW that does not have the side-channel fix:
```
/usr/bin/qemu-kvm -M pseries,cap-sbbc="workaround" -enable-kvm -nographic -serial /dev/pts/1 -monitor stdio
QEMU 2.11.50 monitor - type 'help' for more information
(qemu) qemu-system-ppc64: Requested safe bounds check capability level not supported by kvm, try a different value for cap-sbbc

/usr/bin/qemu-kvm -M pseries,cap-cfpc="workaround" -enable-kvm -nographic -serial /dev/pts/1 -monitor stdio
QEMU 2.11.50 monitor - type 'help' for more information
(qemu) qemu-system-ppc64: Requested safe cache capability level not supported by kvm, try a different value for cap-cfpc
```

-------------

i) From this it looks like qemu-kvm, while starting the guest, validates whether or not the host has these capabilities to emulate to the guest. If so, then why can't qemu emulates the default host capabilities available rather defaults to `broken` always?

ii) If qemu-kvm has the control on these capabilities and aswell user can not override them what would be the usecase of exporting these machine properties to user and who will be using them? This assumes, I guess, that we want to keep libvirt out of the picture for these capabilities.

cdeadmin commented 6 years ago

------- Comment From surajjs@au1.ibm.com 2018-02-07 17:57:33 EDT-------

A quick overview of spapr-caps:

Previously qemu would query kvm capabilities from the hypervisor and would default to what the hypervisor was capable of. This was problematic as it meant that guests started with the exact same command line could be presented with a different environment depending on the host they were started on.

The entire idea of spapr-caps is to NOT infer any capabilities from the host and instead require that they be explicitly stated on the command line; otherwise the default will be given. The values of these caps are then checked against the host capabilities, and qemu will fail to start if the requested values are unavailable.
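The behaviour described above can be modelled in a few lines of Python. This is only an illustrative sketch and not QEMU's actual code: the function name `resolve_caps`, the `host_max` input, and the weakest-to-strongest ordering of the levels are assumptions made for the example, as is the rule that a requested level is acceptable whenever it does not exceed what the host offers.

```python
# Illustrative model only; NOT QEMU's implementation.
# Assumption: levels are ordered weakest to strongest.
LEVELS = ("broken", "workaround", "fixed")

# Per the thread, every cap defaults to "broken" regardless of the host.
DEFAULTS = {"cap-cfpc": "broken", "cap-sbbc": "broken", "cap-ibs": "broken"}


def resolve_caps(requested, host_max):
    """Return the caps the guest would see, or raise if the host cannot honour them.

    requested: caps given explicitly on the command line (may be empty).
    host_max:  hypothetical map of the highest level the host supports per cap.
    """
    for cap, level in requested.items():
        if cap not in DEFAULTS:
            raise ValueError(f"unknown capability: {cap}")
        if level not in LEVELS:
            raise ValueError(f"unknown level for {cap}: {level}")

    # Nothing is inferred from the host: start from fixed defaults,
    # then apply only what was explicitly requested.
    resolved = {**DEFAULTS, **requested}

    # The host is consulted only to validate the result.
    for cap, level in resolved.items():
        supported = host_max.get(cap, "broken")
        if LEVELS.index(level) > LEVELS.index(supported):
            raise RuntimeError(
                f"{cap}={level} not supported by host, try a different value"
            )
    return resolved
```

With this model, an empty `requested` always yields all-`broken` caps even on a patched host, which is the behaviour reported at the top of this issue.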

Answering your questions:

"Booting with Indirect branch serialisation apart from broken fails to start guest?"

A: Booting with any option the host can't support will fail. If this is the case then the host cannot support anything other than broken for that option. Note that this specific option depends on "fw-bcctrl-serialized" being set to enabled in the device tree, so you can check that on the host if you are unsure.
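A small helper along these lines could perform that device-tree check on the host. It is a sketch under assumptions: the directory is the OPAL `fw-features` path reported later in this thread, the state is assumed to be exposed as a property file named `enabled` or `disabled` inside the feature node, and the function name `fw_feature_state` is made up for the example.

```python
from pathlib import Path

# Path as reported in this thread for an OPAL (PowerNV) host.
FW_FEATURES = Path("/sys/firmware/devicetree/base/ibm,opal/fw-features")


def fw_feature_state(name, base=FW_FEATURES):
    """Return 'enabled', 'disabled', 'unknown' or 'absent' for a firmware feature.

    Assumption: the feature node is a directory that contains a property
    file called either 'enabled' or 'disabled'.
    """
    node = Path(base) / name
    if not node.is_dir():
        return "absent"
    if (node / "enabled").exists():
        return "enabled"
    if (node / "disabled").exists():
        return "disabled"
    return "unknown"


if __name__ == "__main__":
    # On a host without this device-tree node this simply prints 'absent'.
    print(fw_feature_state("fw-bcctrl-serialized"))
```

The `base` parameter exists only so the function can be pointed at a different directory, for instance a copy of the device tree taken from another machine.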

"why can't qemu emulates the default host capabilities available rather defaults to broken always"

A: Because the whole idea of spapr-caps is to give a consistent guest environment for a given command line, not to default to whatever the host can do. The default is broken and so that will be used unless set otherwise.

"If qemu-kvm has the control on these capabilities and aswell user can not override them what would be the usecase of exporting these machine properties to user and who will be using them"

A: That is incorrect, the user CAN override them. If they want anything other than the default then they are required to. The idea is to provide spapr-caps to some higher level management layer which can use them to ensure consistency across a machine pool to provide migration compatibility.
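As a rough illustration of what such a management layer might do, the sketch below picks, for each cap, the strongest level that every host in a pool can offer and turns the result into a `-M` argument. The function names, the shape of the `pool` input, and the weakest-to-strongest ordering of the levels are assumptions for the example; this is not how libvirt or any existing tool is known to work.

```python
# Assumption: levels are ordered weakest to strongest.
LEVELS = ("broken", "workaround", "fixed")
CAPS = ("cap-cfpc", "cap-sbbc", "cap-ibs")


def common_caps(pool):
    """Return the strongest level per cap that every host in the pool supports.

    pool: hypothetical map of host name -> {cap: highest supported level}.
    A cap missing from a host's entry is treated as 'broken'.
    """
    return {
        cap: min(
            (host_caps.get(cap, "broken") for host_caps in pool.values()),
            key=LEVELS.index,
            default="broken",  # empty pool: fall back to the default level
        )
        for cap in CAPS
    }


def machine_option(caps):
    """Build the value passed to qemu's -M option for a pseries guest."""
    return ",".join(["pseries"] + [f"{cap}={caps[cap]}" for cap in CAPS])


if __name__ == "__main__":
    pool = {
        "host-a": {"cap-cfpc": "workaround", "cap-sbbc": "workaround"},
        "host-b": {"cap-cfpc": "fixed", "cap-sbbc": "workaround"},
    }
    print("-M", machine_option(common_caps(pool)))
```

Starting every guest in the pool with the same explicit option is what keeps the guest environment identical on each host, so a guest can be migrated between them.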

As far as I can see everything raised here is the system working as expected. Unless there is actually a bug I will be closing this as notabug.

cdeadmin commented 6 years ago

------- Comment From satheera@in.ibm.com 2018-02-08 01:00:03 EDT-------

(In reply to comment #2)
> A quick overview of spapr-caps:
>
> Previously qemu would query kvm capabilities from the hypervisor and would default to what the hypervisor was capable of. This was problematic as it meant that guests started with the exact same command line could be presented a different environment depending on the host they are started on.
>
> The entire idea of spapr-caps is to NOT infer any capabilities from the host and instead require they be explicitly stated on the command line, otherwise the default will be given. The values of these caps is then checked against the host capabilities and qemu will fail to start if the requested values are unavailable.
>
> Answering your questions:
>
> "Booting with Indirect branch serialisation apart from broken fails to start guest?"
>
> A: Booting with any option the host can't support will fail. If this is the case then the host cannot support anything other than broken for that option. Note that specific option depends on "fw-bcctrl-serialized" being set to enabled in the device tree so you can check that on the host if you are unsure.

Looks like as you mentioned FW does not enable it, might be a bug for FW.

ls -l /sys/firmware/devicetree/base/ibm,opal/fw-features/fw-bcctrl-serialized/
disabled name phandle

> "why can't qemu emulates the default host capabilities available rather defaults to broken always"
>
> A: Because the whole idea of spapr-caps is to give a consistent guest environment for a given command line, not to default to whatever the host can do. The default is broken and so that will be used unless set otherwise.
>
> "If qemu-kvm has the control on these capabilities and aswell user can not override them what would be the usecase of exporting these machine properties to user and who will be using them"
>
> A: That is incorrect, the user CAN override them. If they want anything other than the default then they are required to. The idea is to provide spapr-caps to some higher level management layer which can use them to ensure consistency across a machine pool to provide migration compatibility.

Sure, migration of a guest from an unpatched host to a patched host will work, but it will leave the guest in the "broken" state; the guest will need a command-line change and a restart for "workaround" to take effect.

In other words, the side-channel fixes cannot be picked up through live migration, and the guest kernel needs a reboot anyway, so it makes sense not to break the migration workflow.

> As far as I can see everything raised here is the system working as expected. Unless there is actually a bug I will be closing this as notabug.

Thanks for the explanation :-). So libvirt and other management layers need to know about these machine capabilities. I am not sure whether libvirt already supports them; I will check and raise a BZ to get that in.

P.S. I would prefer the bug to be closed as Documented, so that it can serve as documentation of the use cases for these capabilities.

Regards, -Satheesh.

cdeadmin commented 6 years ago

------- Comment From surajjs@au1.ibm.com 2018-02-08 19:10:13 EDT-------

(In reply to comment #3)
> (In reply to comment #2)
[snip]
> > "If qemu-kvm has the control on these capabilities and aswell user can not override them what would be the usecase of exporting these machine properties to user and who will be using them"
> >
> > A: That is incorrect, the user CAN override them. If they want anything other than the default then they are required to. The idea is to provide spapr-caps to some higher level management layer which can use them to ensure consistency across a machine pool to provide migration compatibility.
>
> Sure, migration of guest from unpatched host --> patched host, will work but will leave the guest in "broken" state and guest will need a change of command line and restart to get the "workaround" effective.
>
> In other words live migration is not supported for side-channel fixes and anyways guest kernel needs a reboot, so it makes sense not to break migration work flow.

Yes, the guest would need to be rebooted and the command line changed on the new host to get the fixes. This is because, as with most features, there is no way to tell the guest after it has booted that its environment has changed.

> [snip]

cdeadmin commented 6 years ago

------- Comment From lagarcia@br.ibm.com 2018-02-28 10:22:04 EDT-------

IIUC, what we have to do with this bug is move it to the Documentation component, as it hasn't been documented anywhere yet.

cdeadmin commented 6 years ago

------- Comment From seg@us.ibm.com 2018-09-14 12:52:10 EDT-------

I don't really see any good fit for documenting this. Let's just close it.