intel / vck

Volume Controller for Kubernetes
https://ai.intel.com/kubernetes-volume-controller-kvc-data-management-tailored-for-machine-learning-workloads-in-kubernetes/
Apache License 2.0

Cachefilesd: Add end to end example for Cachefilesd #26

Closed anjalisood closed 6 years ago

anjalisood commented 6 years ago

Experiment with an NFS plus cachefilesd setup. Analyze the performance gain of caching the data locally versus accessing it over the network every single time.
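A minimal sketch of how such a comparison could be timed, assuming the dataset is a single file on the NFS mount. The function names and the temporary stand-in file are illustrative and not part of KVC. Note that repeated reads on the same client are also served by the kernel page cache, so a meaningful measurement of cachefilesd would need the page cache dropped or the share remounted between runs.

```python
import os
import tempfile
import time


def time_read(path, chunk_size=1 << 20):
    """Read the whole file once; return (bytes_read, elapsed_seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def compare_reads(path, repeats=3):
    """Time `repeats` consecutive full reads of the same file."""
    return [time_read(path) for _ in range(repeats)]


if __name__ == "__main__":
    # Stand-in file (assumption); on a real setup, point this at a file
    # that lives on the NFS mount instead.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(4 << 20))
    try:
        for i, (size, secs) in enumerate(compare_reads(tmp.name), start=1):
            print(f"read {i}: {size} bytes in {secs:.4f}s")
    finally:
        os.unlink(tmp.name)
```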

bhack commented 6 years ago

https://blog.riseml.com/accelerating-io-bound-deep-learning-e0e3f095fd0

anjalisood commented 6 years ago

@bhack Yes, I saw that article and am trying to get cachefilesd set up and working through KVC.

dzungductran commented 6 years ago

This is no longer possible with KVC

bhack commented 6 years ago

@dzungductran Why?

anjalisood commented 6 years ago

@bhack: The idea here was to add an end-to-end example of cachefilesd working with KVC. I tried a POC and was not able to get cachefilesd installed and running in a container, because systemctl and systemd are not available within the ubuntu/centos container images.

This is by design. Docker runs a single process in the foreground in the container, spawned as PID 1 within the container's PID namespace. Docker is designed for process isolation, not for OS virtualization, so there are no other OS processes or daemons running inside the container (like systemd, cron, syslog, etc.), only the entrypoint or command that is run. If the images included systemd commands, we would find a lot of things not working, since the entrypoint replaces init. Systemd also makes use of cgroups, which Docker restricts inside containers, since the ability to change cgroups could allow a process to escape the container's isolation. Without systemd running as init inside the container, there's no daemon to process the start and stop commands.

And without systemd and systemctl in the container, it is not possible to have cachefilesd installed and running in the container.
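To see the "entrypoint replaces init" point in practice, a small check like the following could be run inside a container. It is only a sketch: the helper names are made up for illustration, and it assumes a Linux `/proc` filesystem. In a typical container, PID 1 is the entrypoint (for example `bash` or `python`), not `systemd`.

```python
from pathlib import Path


def init_process_name(proc_root="/proc"):
    """Return the command name of PID 1, or None if it cannot be read."""
    try:
        return Path(proc_root, "1", "comm").read_text().strip()
    except OSError:
        return None


def has_systemd_init(proc_root="/proc"):
    """True only when PID 1 in this PID namespace is systemd."""
    return init_process_name(proc_root) == "systemd"


if __name__ == "__main__":
    print("PID 1 is:", init_process_name())
    print("systemd is init:", has_systemd_init())
```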

bhack commented 6 years ago

I've tried to do it at the node level: https://github.com/kubernetes/kops/pull/5072

bhack commented 6 years ago

I didn't find any problem with systemd in the container. The only problem I found is that the pod has to be privileged to access /dev/cachefiles.

bhack commented 6 years ago

I think that to run it in an unprivileged pod, a device plugin that exposes /dev/cachefiles is required.

anjalisood commented 6 years ago

@bhack Yeah, I was able to run it successfully at the node level and then use the NFS client mount in the container, and that works fine. However, the issue is trying to install cachefilesd directly into the container. I tried using a centos image that had systemd, but it still did not work even though I was running with the privileged flag.

anjalisood commented 6 years ago

I tried running cachefilesd on the node and then mounting /dev/cachefiles into the container and running it with the privileged flag, but then that defeats the purpose :(

ashahba commented 6 years ago

Also, we don't provision the nodes manually at the moment. But if, further down the road, people deploy the cluster in a lab and provision it using, for example, Ansible or similar tools, this should work, since that way the admin has full control over what gets installed on the Kubernetes nodes 🙂

bhack commented 6 years ago

@anjalisood I don't think it defeats the purpose, because the daemon could run in a pod, and the daemon's config controls the directory used for storing the cache (so it could be a PVC). The problem is that exposing /dev/cachefiles requires giving privileges to the pod. So a device plugin that exposes it is the right solution, to avoid giving the pod too many privileges.
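As a rough illustration of the point about the daemon config controlling the cache directory, the sketch below renders a cachefilesd.conf whose `dir` points at a mounted volume. The helper name and the `/mnt/fscache-pvc` path are assumptions for this example; the `dir`, `tag`, `brun`, `bcull` and `bstop` directives are the ones found in the stock cachefilesd.conf.

```python
def render_cachefilesd_conf(cache_dir, tag="mycache", brun=10, bcull=7, bstop=3):
    """Render a minimal cachefilesd.conf as a string.

    brun/bcull/bstop are free-block percentages; cachefilesd expects
    bstop < bcull < brun.
    """
    if not (0 <= bstop < bcull < brun < 100):
        raise ValueError("expected 0 <= bstop < bcull < brun < 100")
    lines = [
        f"dir {cache_dir}",   # where the cache is stored, e.g. a PVC mount
        f"tag {tag}",         # identifies this cache to FS-Cache
        f"brun {brun}%",      # culling is turned off above this free space
        f"bcull {bcull}%",    # culling starts below this free space
        f"bstop {bstop}%",    # caching stops below this free space
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical mount path of the PVC inside the pod.
    print(render_cachefilesd_conf("/mnt/fscache-pvc"), end="")
```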

dzungductran commented 6 years ago

@bhack Have you tried a device plugin?

bhack commented 6 years ago

@dzungductran No, but if a device plugin can expose the NVIDIA GPU device to an unprivileged pod, I think /dev/cachefiles should not be a problem.

dzungductran commented 6 years ago

@bhack Yes, definitely it should be doable.

bhack commented 6 years ago

Have you checked whether it is enough to just expose /dev/cachefiles as a raw block volume?

bhack commented 6 years ago

It seems that /dev/cachefiles is a misc character device. So I think the kubelet already recognizes FileTypeCharDev.
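One way to confirm the device type on a node is to inspect the file mode and device numbers. The sketch below uses only the Python standard library; the helper name is made up, and the "major 10 means misc device" check reflects the usual Linux convention. `/dev/null` is used as a stand-in because `/dev/cachefiles` only exists once the cachefiles kernel module is loaded.

```python
import os
import stat

MISC_MAJOR = 10  # major number used by Linux misc character devices


def describe_device(path):
    """Report whether `path` is a character device and its device numbers."""
    st = os.stat(path)
    is_char = stat.S_ISCHR(st.st_mode)
    major = os.major(st.st_rdev)
    return {
        "is_char_device": is_char,
        "major": major,
        "minor": os.minor(st.st_rdev),
        "is_misc_device": is_char and major == MISC_MAJOR,
    }


if __name__ == "__main__":
    # Replace with "/dev/cachefiles" on a node where the module is loaded.
    print(describe_device("/dev/null"))
```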

bhack commented 6 years ago

In the k8s 1.10 changelog I see "Fixes a bug where character devices are not recognized by the kubelet", which points to https://github.com/kubernetes/kubernetes/pull/60440. @andrewsykim Do you think it could work to expose misc character devices to the pod?

andrewsykim commented 6 years ago

I had to find this out the hard way, but unfortunately, the character device feature in Kubernetes only validates that a device is a character device; it does not give the pod the access needed to use it (unlike docker run --device, which does). See this comment.

I've never used device plugins, so I can't speak to how well they work, but that's likely your best bet to get this to work, aside from running privileged pods.
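To make the distinction concrete, here is a sketch of the privileged workaround as a Pod manifest built in Python. The pod name and image are placeholders, not anything from KVC. The `hostPath` volume with `type: CharDevice` only makes the kubelet validate that the path is a character device; it is the `privileged: true` security context that actually lets the container use it.

```python
import json


def cachefiles_pod(name, image, privileged):
    """Build a Pod manifest that mounts the node's /dev/cachefiles."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "cachefilesd",
                    "image": image,
                    # Without this, the device is visible but not usable.
                    "securityContext": {"privileged": privileged},
                    "volumeMounts": [
                        {"name": "cachefiles", "mountPath": "/dev/cachefiles"}
                    ],
                }
            ],
            "volumes": [
                {
                    "name": "cachefiles",
                    # CharDevice is a validation check only, not a grant.
                    "hostPath": {"path": "/dev/cachefiles", "type": "CharDevice"},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder name and image (assumptions for illustration).
    pod = cachefiles_pod("cachefilesd-demo", "example/cachefilesd:latest", True)
    print(json.dumps(pod, indent=2))
```

The JSON output can be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.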

bhack commented 6 years ago

Thanks. Other than implementing the regular device plugin solution, I remember that @arvimal was working on enabling FS-Cache support in FUSE some years ago, but I don't know if it was finished.

dzungductran commented 6 years ago

If I remember correctly, doesn't Pachyderm use FUSE?

ashahba commented 6 years ago

Looks like fuse-fscache hasn't had any activity for over 2 years.

ashahba commented 6 years ago

Yes @dzungductran, it does: https://github.com/pachyderm/pachyderm/blob/0f5759c6a00d338012bdb4ce03c28c9660cac6a0/src/server/pfs/fuse/mounter.go

bhack commented 6 years ago

Mhh... Yes, it seems that the only solution is to implement a K8s device plugin for /dev/cachefiles.
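For reference, if such a device plugin existed, an unprivileged pod would request the device as an extended resource, the same way GPU pods request `nvidia.com/gpu`. The sketch below only shows the consuming side. The resource name `example.com/cachefiles`, the pod name and the image are hypothetical, since no such plugin exists in this thread.

```python
import json

# Hypothetical extended resource name a cachefiles device plugin might advertise.
CACHEFILES_RESOURCE = "example.com/cachefiles"


def unprivileged_cachefiles_pod(name, image, resource=CACHEFILES_RESOURCE):
    """Build a Pod manifest that requests one cachefiles device via a plugin."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "cachefilesd",
                    "image": image,
                    # No privileged flag: the device plugin would be
                    # responsible for injecting the device into the container.
                    "resources": {"limits": {resource: 1}},
                }
            ]
        },
    }


if __name__ == "__main__":
    pod = unprivileged_cachefiles_pod("cachefilesd-demo", "example/cachefilesd:latest")
    print(json.dumps(pod, indent=2))
```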