Assess the need for porting qemu-lite patches

sboeuf commented 5 years ago

Based on the recent decision to start moving to latest QEMU instead of QEMU-lite, we opened a PR to try it out here: https://github.com/kata-containers/tests/pull/1341 The results are pretty encouraging since the CI is passing.

The rationale behind the decision is that QEMU-lite is based on QEMU 2.11, which is fairly old now that QEMU is about to release 4.0. And by old, we mean potential unresolved CVEs.

The question that remains is, should we port some of the patches from QEMU-lite to the latest QEMU version that we would use.

The first reaction to this question would be to say no, since it would add some maintenance burden to port those patches on any future version, and it would potentially prevent us from testing on QEMU upstream branches if we wanted to. Basically, this would hurt the project velocity to test on latest versions of QEMU.

Nevertheless, we cannot make such decision from the top of our heads, and I would suggest that we go through a thorough review of the current QEMU-lite patches that we are carrying.

From first feedback/discussions we had, the main focus is to assess if from a metrics perspective, the removal of those patches would have any kind of impact. The second feedback was that, if there are some 9p patches, it might be okay to drop them since we will have soon a replacement for it with virtio-fs

Input needed! /cc @egernst @bergwolf @gnawux @sameo @WeiZhang555 @jodh-intel @grahamwhaley @Pennyzct @amshinde @mcastelino @jcvenegas @chavafg @devimc @lifupan

grahamwhaley commented 5 years ago

For reference, metrics results for 3.1.0 vs 2.11 are over at https://github.com/kata-containers/runtime/issues/1253#issuecomment-465685511 I'll re-post the short summary here:

A quick summary then:
 - PSS footprint the same, which was a nice surprise
 - system level scaling, maybe v3.1 is slightly better - not any worse.
 - boot to workload. oh dear. There is like a 39% bump (in the wrong direction) for v3.1 on my machine.
 - Storage for v3.1 looks worse (which I don't really know why that would be - might have to look at our 30 patches in 'lite' to see if there is something relevant there).
 - cpu cycles on network test are effectively unchanged.

jcvenegas commented 5 years ago

I'd go for qemu vanilla (or just add functional if needed). We already have a NEMU CI nemu is where more of the optimizations party is happening today, so if we want to keep having good metrics lets base them on NEMU and keep security first by tracking an updated qemu.

About 9p issues a few months ago we change the 9pfs model to 'none' and was where most of the posix issue were gone. We should rerun again the test with this new qemu and see how it works.

amshinde commented 5 years ago

I am aware that we did add a patch to bypass shared memory for VM templating. I will let @bergwolf comment if we need to carry it forward.

jodh-intel commented 5 years ago

The table below summarises which hypervisor variants are available for which architecture (please check it y'all! ;)...

architecture	qemu vanilla	qemu-lite	qemu-lite pc-lite	nemu
`aarch64`	yes	yes	no	yes
`ppc64le`	yes	yes	no	no
`s390x`	yes	yes	no	no
`x86_64`	yes	yes	yes	yes

I agree with @jcvenegas that we want to minimise branch maintenance. However, I'm not sure we can simply drop qemu-lite since we have nemu now as that would mean s390x and ppc64le would potentially suffer I think? There were also comments a while back about the boot speed of qemu-lite still (?) being significantly better than nemu but I'm unclear if that is still the case.

@rbradford - could you comment on all this?

Data for all variants on all architectures would obviously be useful to help guide our decision.

/cc @grahamwhaley, @alicefr, @nitkon.

bergwolf commented 5 years ago

I am aware that we did add a patch to bypass shared memory for VM templating. I will let @bergwolf comment if we need to carry it forward.

@amshinde Thanks for bringing it up! We need the qemu migration bypass shared memory patch to provide vm templating capability in kata containers. I'll try to send it upstream. In the meantime, until it is accepted by upstream and available in a release, we need to carry it somewhere (be it qemu-lite, third party patches in packaging or else). Please guys consider where you want to put it when you migrate from qemu-lite to upstream qemu.

sboeuf commented 5 years ago

@bergwolf ok no problem. If you can maintain a version that applies on latest QEMU version, this will be easier to apply for our CI/packaging. Also, as you mentioned, it would be really nice to get this patch merged upstream so that we would not have to port it. Did you get some feedback from the QEMU community about this patch already?

bergwolf commented 5 years ago

@sboeuf Yes, it was sent to the qemu mailing list by @laijs before [1] but he didn't get the cycles to iterate more on it. I'll continue the task this time.

[1] https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00003.html

sboeuf commented 5 years ago

@bergwolf great :+1: But what was the global feedback there, do you think there's a good chance this will make it upstream at some point?

bergwolf commented 5 years ago

@sboeuf Yes, I think so. I feel that the general idea/design is already accepted. There was concern about security but it mostly focused on the kata use case. And I have addressed it on kata side by introducing the redeemRNG GRPC call.

sboeuf commented 5 years ago

Ok that's good to hear. Hopefully you can manage to get this upstreamed for next QEMU release (whatever will come after 4.0). In the meantime, we need the patch for 4.0 to be ready to be able to use it. Thanks for the clarifications @bergwolf

kata-containers / qemu

Assess the need for porting qemu-lite patches #6