hyperhq / runv

Hypervisor-based Runtime for OCI
Apache License 2.0

[RFC] Support for running runv without 9pfs #626

Open harche opened 6 years ago

harche commented 6 years ago

Hi,

We love runv for the level of isolation it offers for critical workloads. Being able to use the docker workflow and combine it with qemu for increased isolation is just brilliant!

As you already know, runv depends on the 9p filesystem for passing the container rootfs to the virtual machine. This works well most of the time. But the reluctance shown by distributions to officially support 9p makes it difficult to get runv accepted by enterprise customers, and hence to run the critical workloads for which runv aims to provide higher isolation. Some of the valid concerns raised about using 9p can be found here: https://access.redhat.com/discussions/1119043.

So as a quick workaround we started thinking of ways to eliminate the use of 9p. It turns out that it's not simple to get rid of 9p if you want to pass a folder to the VM. But what if you bundle the rootfs folder into a qcow2 image and attach it to the VM? Sure, it will add to the boot time of the VM, because now you have to create a new qcow2 image of the rootfs (this can be optimized by caching the qcow2 images).

This PR adds build-time support for making runv work without depending on 9p. The regular flow of runv with 9p is not altered and continues to function the way it does now. But if you configure it with --without-9p and compile it, the resulting runv binary will not use 9p. Overall, the modified build instructions look like this:

$ cd $GOPATH/src/github.com/hyperhq
$ git clone https://github.com/hyperhq/runv/
$ cd runv
$ ./autogen.sh
$ ./configure --without-xen --without-9p
$ make
$ sudo make install

Generating the qcow2 image adds one dependency, libguestfs-tools. On Ubuntu, install it with apt-get install libguestfs-tools.
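To make the image-creation step concrete, here is a minimal sketch in Go of packing a rootfs directory into a qcow2 image. It assumes the virt-make-fs tool from libguestfs-tools is what does the packing (the PR description does not say which tool is used), and the function names, the ext4 filesystem type, and the "reuse the image if the file already exists" caching are all hypothetical illustrations rather than the code in this PR.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// buildImageArgs returns virt-make-fs arguments that pack rootfsDir into a
// qcow2 image at imagePath. The ext4 filesystem type is an assumption.
func buildImageArgs(rootfsDir, imagePath string) []string {
	return []string{"--format=qcow2", "--type=ext4", rootfsDir, imagePath}
}

// createRootfsImage runs virt-make-fs unless an image already exists at
// imagePath. This is a naive cache: it does not notice a changed rootfs.
func createRootfsImage(rootfsDir, imagePath string) error {
	if _, err := os.Stat(imagePath); err == nil {
		return nil // cached image found, skip the expensive rebuild
	}
	cmd := exec.Command("virt-make-fs", buildImageArgs(rootfsDir, imagePath)...)
	if out, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("virt-make-fs failed: %v: %s", err, out)
	}
	return nil
}

func main() {
	// With two arguments, build the image; otherwise only show the command
	// that would be run for example paths.
	if len(os.Args) == 3 {
		if err := createRootfsImage(os.Args[1], os.Args[2]); err != nil {
			fmt.Fprintln(os.Stderr, err)
		}
		return
	}
	args := buildImageArgs("/tmp/rootfs", "/tmp/rootfs.qcow2")
	fmt.Println("virt-make-fs " + strings.Join(args, " "))
}
```

A real cache would need a key that changes when the rootfs changes (for example the image layer IDs), and the image would probably need extra free space for container writes, which virt-make-fs can add with its --size option.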

Of course, if you don't want to use 9p, changes are required in hyperstart too. Right after creating this PR I will raise a PR in the hyperstart project to support working without 9p. If you want to use runv with 9p, as is the case right now, the existing hyperstart will work just fine with this runv patch. The way it's implemented is that runv sends hyperstart a JSON message containing the key sharedDir. This key is not needed when you aren't using 9p, so we set it to an empty string. This lets hyperstart know that, instead of mounting the 9p filesystem, it should mount the SCSI device that holds the rootfs.
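The sharedDir convention described above can be sketched as follows. This is only an illustration in Go (hyperstart itself is not written in Go): the startPod struct is a hypothetical, trimmed-down view of the message, and the only detail taken from this PR is that an empty sharedDir means "mount the SCSI device instead of 9p".

```go
package main

import (
	"encoding/json"
	"fmt"
)

// startPod is a hypothetical subset of the JSON that runv sends to
// hyperstart; only the sharedDir key comes from the PR description.
type startPod struct {
	SharedDir string `json:"sharedDir"`
}

// rootfsSource reports how the guest should obtain the container rootfs:
// "9p" when sharedDir is set, "scsi" when it is empty or missing.
func rootfsSource(msg []byte) (string, error) {
	var pod startPod
	if err := json.Unmarshal(msg, &pod); err != nil {
		return "", err
	}
	if pod.SharedDir == "" {
		return "scsi", nil
	}
	return "9p", nil
}

func main() {
	// The sharedDir value below is an illustrative placeholder.
	for _, msg := range []string{`{"sharedDir":"share_dir"}`, `{"sharedDir":""}`} {
		src, err := rootfsSource([]byte(msg))
		if err != nil {
			fmt.Println("bad message:", err)
			continue
		}
		fmt.Println(msg, "->", src)
	}
}
```

Reusing an existing key keeps the message format unchanged, which is why an unmodified hyperstart keeps working with a 9p-enabled runv.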

Corresponding hyperstart PR, https://github.com/hyperhq/hyperstart/pull/344

Signed-off-by: Harshal Patil <harshal.patil@in.ibm.com>

laijs commented 6 years ago

Any update on the PR? @harche

harche commented 6 years ago

Working on it. I was focusing on https://github.com/hyperhq/hyperstart/pull/345. Now that one is merged, I will spend some time on this one.

laijs commented 6 years ago

@harche I'm struggling to find a proper user interface for the block device support. Do you have any suggestions?

- Command line option like runv create --root-device xxx.
- Annotations like com.github.kata-containers.storage.root.path=xxx, com.github.kata-containers.storage.root.driver=qcow2
- Filetype detection: check the root path to see if it is a block/qcow2/ceph-config-file/...
- Other spec extension ...

harche commented 6 years ago

@laijs How would this work seamlessly with docker or k8s (with CRI-O)?

IMHO, the disk creation should be part of runv (maybe using some config file to determine whether the VM should use 9pfs or not). If you make --root-device a separate option, then it won't work out of the box with docker, because that argument will be absent.

Our aim is to remove the 9pfs dependency from the entire docker workflow. So, as I mentioned earlier, whether to use 9pfs or a root disk is ideally a decision runv should make just before launching the VM, maybe using a config file. If it decides to use a root disk, then it should create the disk (from the rootfs dir) before proceeding.
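As a rough sketch of the config-file idea, runv could read a small file before launching the VM and pick the rootfs transport from it. Everything named here is hypothetical: the file path, the rootfsTransport key, and its values "9p" and "block" are illustrations and do not exist in runv.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// runtimeConfig is a hypothetical runv config file layout.
type runtimeConfig struct {
	RootfsTransport string `json:"rootfsTransport"`
}

// parseTransport returns "9p" or "block". A missing key falls back to "9p"
// so that existing setups keep their current behaviour.
func parseTransport(data []byte) (string, error) {
	var cfg runtimeConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return "", err
	}
	switch cfg.RootfsTransport {
	case "", "9p":
		return "9p", nil
	case "block":
		return "block", nil
	default:
		return "", fmt.Errorf("unknown rootfsTransport %q", cfg.RootfsTransport)
	}
}

func main() {
	transport := "9p" // default when no config file is present
	// Hypothetical location; runv has no such file today.
	if data, err := os.ReadFile("/etc/runv/config.json"); err == nil {
		t, perr := parseTransport(data)
		if perr != nil {
			fmt.Fprintln(os.Stderr, "invalid config:", perr)
			return
		}
		transport = t
	}
	fmt.Println("rootfs transport:", transport)
}
```

Because the decision lives in runv's own configuration, docker or CRI-O would not need to pass any extra argument, which is the point made above.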

You can always have an option such as --root-device, as you mentioned above, for existing root disks, but I am not sure how it would get used when you plug runv into docker or k8s.

harche commented 6 years ago

@laijs It could be added at build time, like the changes in this PR.