neurohackademy / nh2020-jupyterhub

hub.neurohackademy.org: Deployment config, docker image, documentation.

storage: cache NFS storage with rsync #60

Closed: consideRatio closed this issue 4 years ago

consideRatio commented 4 years ago

I want to ensure that the data we provide in ~/data can handle the load of hundreds of simultaneous users accessing it. It may or may not become a problem, but I want to make sure it doesn't.

Because of this, I plan to provide participants with a cached copy of the NFS server's content under ~/data. They will then read from a node-local disk rather than from the NFS server, which avoids the risk of overloading the NFS server if, for example, too much data is read at the same time.

  1. Create a k8s DaemonSet (DS) that automatically creates a pod on each node that has users.
  2. Let this DS pod mount the NFS server's /data in read-only mode and rsync all content to a hostPath volume every minute (see the sketch below this list).
  3. Mount this hostPath read-only at /nh/data if the user is a participant, and mount the NFS share directly if it's an instructor.
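
A minimal sketch of steps 1 and 2 could look like the DaemonSet below. The image, resource names, NFS server address/export, hostPath location, and the zero-to-jupyterhub node label/taint are placeholders or assumptions, not values taken from this deployment.

```yaml
# Sketch only: names, the NFS server address/export, and the hostPath
# directory are placeholders that need to match the actual deployment.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: data-cache
spec:
  selector:
    matchLabels:
      app: data-cache
  template:
    metadata:
      labels:
        app: data-cache
    spec:
      # Only run on user nodes (assuming zero-to-jupyterhub's node label/taint).
      nodeSelector:
        hub.jupyter.org/node-purpose: user
      tolerations:
        - key: hub.jupyter.org/dedicated
          operator: Equal
          value: user
          effect: NoSchedule
      containers:
        - name: rsync
          image: instrumentisto/rsync-ssh  # any small image with rsync works
          command:
            - /bin/sh
            - -c
            # Copy the read-only NFS content to the node-local disk every minute.
            - "while true; do rsync -a --delete /nfs-data/ /cache-data/; sleep 60; done"
          volumeMounts:
            - name: nfs-data
              mountPath: /nfs-data
              readOnly: true
            - name: cache-data
              mountPath: /cache-data
      volumes:
        - name: nfs-data
          nfs:
            server: 10.0.0.2   # placeholder: the Filestore instance's IP
            path: /data        # placeholder: the Filestore export / subpath
            readOnly: true
        - name: cache-data
          hostPath:
            path: /var/nh-data-cache   # placeholder: node-local cache directory
            type: DirectoryOrCreate
```

For step 3, participant pods would then mount that hostPath directory read-only at /nh/data, while instructor pods keep a direct NFS mount so they can write. In a zero-to-jupyterhub setup that selection would likely live in the hub's spawner configuration (e.g. extra volumes chosen per user group), but the exact wiring depends on how this deployment spawns user pods.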

Motivation

The NFS server we use is Filestore, Google Cloud's managed NFS service. We have a BASIC_HDD Filestore instance with a size of 1TB. According to this documentation we can expect a throughput of 100MB/s. That means that if 100 participants each want to read 1GB, that is 10 seconds of server throughput per participant, adding up to roughly 1000 seconds of waiting, which is too much.
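
Spelled out, the back-of-the-envelope estimate (assuming all participants share the full 100 MB/s and the reads are effectively serialized) is:

$$
\frac{100 \times 1\,\mathrm{GB}}{100\,\mathrm{MB/s}} = \frac{100{,}000\,\mathrm{MB}}{100\,\mathrm{MB/s}} = 1000\,\mathrm{s} \approx 17\ \text{minutes}
$$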

Also, according to the documentation about persistent disk throughput, a persistent disk's throughput scales with its size and is roughly 4 times higher for SSD than for HDD at a given size; the specific numbers are in that documentation.

yuvipanda commented 4 years ago

I think this might not be a problem, and if it is, switching to SSD Filestore might be an easier solution. If you mount NFS once per node, I think you can then think of 100MB/s as per-node, rather than per-user, since it's gonna get cached locally.