Mirantis / virtlet

Kubernetes CRI implementation for running VM workloads
Apache License 2.0
743 stars 128 forks source link

Implement directory volume for virtlet with 9pfs #739

Closed miaoyq closed 6 years ago

miaoyq commented 6 years ago

This PR implement directory volume for virtlet with 9pfs. With this, virtlet can use the normal volume created by k8s. (Like emptyDir, hostPath ……)

Signed-off-by: Yanqiang Miao miao.yanqiang@zte.com.cn


This change is Reviewable

CLAassistant commented 6 years ago

CLA assistant check
All committers have signed the CLA.

ivan4th commented 6 years ago

Thank you for the PR! It's indeed an important contribution. There are a few quirks, though. First of all, I tried to modify ubuntu example to mount some host path into the VM:

--- a/examples/ubuntu-vm.yaml
+++ b/examples/ubuntu-vm.yaml
@@ -27,3 +27,10 @@ spec:
     # tty and stdin required for `kubectl attach -t` to work
     tty: true
     stdin: true
+    volumeMounts:
+    - name: host-logs
+      mountPath: /hostlogs
+  volumes:
+  - name: host-logs
+    hostPath:
+      path: /var/log

After the VM booted, /virtletShared was mounted, but /hostlogs wasn't. In cloud-init userdata, I saw this:

mounts:
- - virtletShared
  - /virtletShared
  - 9p
  - trans=virtio
- - virtletShared
- /hostlogs

which doesn't seem to be correct and indeed produces these fstab entries

virtletShared   /virtletShared  9p      trans=virtio,comment=cloudconfig        0       2
virtletShared   /hostlogs       auto    defaults,nofail,x-systemd.requires=cloud-init.service,comment=cloudconfig       0       2

The second one doesn't do the expected mount.

Another thing, is it important to have just one /virtletShared 9p mount in the VM or can we use multiple mounts? I'd use multiple mounts. Also, IMO we don't need extra FileSystemList object. Instead, VMVolume interface can be changed to have

Setup() (*libvirtxml.DomainDisk, error)

instead of current

Setup() (*libvirtxml.DomainDisk, *libvirtxml.DomainFilesystem, error)

Then I'd add filesystemVolume that implements VMVolume and does proper Setup() and Teardown() for filesystem volumes, and then filesystemSource (based on https://github.com/Mirantis/virtlet/blob/master/pkg/libvirttools/flexvolume_volumesource.go) that scans kubelet mount dirs. filesystemVolume.Setup() will return nil as its first return value and the filesystem entry as its second one. It can be based upon current filesystemItem struct with some parts from filesystemList. After that, it should be possible to simplify cloudinit code that handles filesystem mounts. It would be nice to have test coverage for both filesystemVolume / filesystemVolumeSource, at least as a test case in virtualization_test.go, and for corresponding cloud-init user data generation code. Please correct me if I missed something while making these suggestions. Besides, I'll post some comments for the code itself. Some of them may be irrelevant though if the code is updated according to the above scheme.

Also, we were experiencing some CI problems after recent changes in kubeadm-dind-cluster, please rebase your PR on master to make e2e tests pass.

miaoyq commented 6 years ago

The second one doesn't do the expected mount.

@ivan4th Thanks for test and review. I tested this with VirtletCloudInitUserDataScript: "@virtlet-mount-script@" only. :-P, will fix this. Cloud you have a try with VirtletCloudInitUserDataScript: "@virtlet-mount-script@"?

Another thing, is it important to have just one /virtletShared 9p mount in the VM or can we use multiple mounts? I'd use multiple mounts.

Sorry, I'm not clear what multiple mounts means, multiple volumes, or multiple technologies(e.g. 9pfs or others)? In this PR, virtletShared is just a data transfer channel with 9p,

/var/lib/virtlet/volumes/[POD-ID] ----9p---> /virtletShared

When we want to mount a hostpath to VM, In host side: virtlet will mount the hostpath to /var/lib/virtlet/volumes/[POD-ID]/[VolumeName]

In VM side, /virtletShared/[VolumeName] will be seen. Then, cloud-init will mount /virtletShared/[VolumeName] to containerPath

Also, IMO we don't need extra FileSystemList object. Instead, VMVolume interface can be changed to have

Setup() (*libvirtxml.DomainDisk, error)

instead of current

Setup() (*libvirtxml.DomainDisk,  *libvirtxml.DomainFilesystem, error)

Did you mean Setup() (*libvirtxml.DomainDisk, *libvirtxml.DomainFilesystem, error) instead of Setup() (*libvirtxml.DomainDisk, error)?

miaoyq commented 6 years ago

Then I'd add filesystemVolume that implements VMVolume and does proper Setup() and Teardown() for filesystem volumes, and then filesystemSource (based on https://github.com/Mirantis/virtlet/blob/master/pkg/libvirttools/flexvolume_volumesource.go) that scans kubelet mount dirs.

@ivan4th I will have a try in this way.

ivan4th commented 6 years ago

Sorry, I'm not clear what multiple mounts means, multiple volumes, or multiple technologies(e.g. 9pfs or others)?

I meant multiple volumes. My suggestion is to have multiple filesystems attached via 9p instead of just single virtletShared folder. Or am I missing some problems with such approach?

Did you mean Setup() (libvirtxml.DomainDisk, libvirtxml.DomainFilesystem, error) instead of Setup() (*libvirtxml.DomainDisk, error)?

Yes, that's what I mean, sorry for the typo.

miaoyq commented 6 years ago

I meant multiple volumes. My suggestion is to have multiple filesystems attached via 9p instead of just single virtletShared folder. Or am I missing some problems with such approach?

@ivan4th This pr have supported multiple volumes, see my comment above https://github.com/Mirantis/virtlet/pull/739#issuecomment-414572590 :-)

virtlet will mount all volume that is dinfined by k8s mount to /var/lib/virtlet/volumes/[POD-ID]/[VolumeName] in host.

Because virtlet have mount /var/lib/virtlet/volumes/[POD-ID](host path) to /virtletShared (vm path) via 9pfs before. So Vm can see all contents of /var/lib/virtlet/volumes/[POD-ID]

In VM side, cloud-init will mount /virtletShared/[VolumeName] to containerPath.

ivan4th commented 6 years ago

Well, I do understand that the PR supports multiple volumes as it is. Problem is, bindmounting the volumes under proper dirs using cloud-init doesn't appear work well and is generally tricky (it may depend on the order of mounts in the particular cloud-init impl, whether it supports bindmounting at all, etc.) What I was suggesting is instead of mounting /var/lib/virtlet/volumes/[POD-ID] under /virtletShared using 9pfs we should have multiple 9pfs mounts, mounting each /var/lib/virtlet/volumes/[POD-ID]/[VolumeName] under corresponding directory inside the VM without extra bindmounting steps.

ivan4th commented 6 years ago

In VM side, cloud-init will mount /virtletShared/[VolumeName] to containerPath.

This is what appears to be somewhat tricky and fragile, so I'm suggesting multiple 9pfs filesystems instead (one 9pfs per mount) instead of just one /virtletShared for all the mounts.

miaoyq commented 6 years ago

This is what appears to be somewhat tricky and fragile, so I'm suggesting multiple 9pfs filesystems instead (one 9pfs per mount) instead of just one /virtletShared for all the mounts.

@ivan4th Ooh, Got it. Will this reduce data transmission efficiency?

ivan4th commented 6 years ago

Frankly I doubt it'll have a serious performance impact, although I can't find any direct comparisons of one 9pfs mount vs multiple ones performance-wise. Actually, first we need to make this feature work correctly, and then we can add single-mount mode as an option later if we'll be seeing some noticeable performance degradation when using multiple dirs.

miaoyq commented 6 years ago

@ivan4th I have refactored the codes based on your opinion, could you please take another look?

There are quoting issues in similar code that handles mounts in Virtlet, but I think with this extensive host path mounting it's important for us to do proper quoting here, e.g. using https://github.com/kballard/go-shellquote which is already in glide.lock

I'm not clear about this, clould you please give a example, Thx :-)

jellonek commented 6 years ago

As you are touching this part of code, you can replace all '%s' with %s then enclose all passed strings with shellquote.Join(stringVariable) as in https://github.com/Mirantis/virtlet/blob/master/pkg/config/fields.go#L315 - in both places, moved https://github.com/Mirantis/virtlet/pull/739/files#diff-369a31e4190533405038d63a652ea7a4R527 and newly introduced https://github.com/Mirantis/virtlet/pull/739/files#diff-369a31e4190533405038d63a652ea7a4R534.

miaoyq commented 6 years ago

@ivan4th Thanks for your review again. I have addressed all your comments, PTAL

ivan4th commented 6 years ago

CI fails with something failing in flexvolume code. Trying to debug ...

miaoyq commented 6 years ago

@ivan4th Is the following code useful in integration test? I found it make the integration test faild. https://github.com/Mirantis/virtlet/blob/ae5cdb4f85457681b7eda41843e4ae71378d0615/tests/integration/container_create_test.go#L171-L175

If it is just a filler, that can be defined as follow:

 mounts := []*kubeapi.Mount{}
ivan4th commented 6 years ago

Thanks! I'm inclined to merge this PR as-is now, with further changes done as followups. The current e2e breakage is due to nginx docker image problems, there's PR pending that fixes it and I created a test tmp branch to make sure this PR passes e2e with these fixes applied (it does). So the question is, are you willing to contribute unit and e2e tests for this functionality (as followup PR(s)) or should we do it ourselves? If you want to work on it, I'll give some hints on how to do this in the manner consistent with other unit/e2e tests. Thanks again!

miaoyq commented 6 years ago

So the question is, are you willing to contribute unit and e2e tests for this functionality (as followup PR(s)) or should we do it ourselves? If you want to work on it, I'll give some hints on how to do this in the manner consistent with other unit/e2e tests.

@ivan4th OK, I will work on this later, :-)