berkeley-dsep-infra / jupyterhub-k8s

[Deprecated] Data 8's deployment of JupyterHub on Kubernetes
Apache License 2.0

Add support for mounting shared datasets in /data/shared #105

Closed yuvipanda closed 7 years ago

yuvipanda commented 7 years ago

To add a new dataset we'll have to:

  1. Create a new Google Cloud Persistent Disk
  2. Mount it on provisioner, and populate it with data
  3. Create a PV and PVC object in the k8s cluster (manually for now?)
  4. Set the properties for shared data in the helm values file and do an upgrade.

yuvipanda commented 7 years ago

Hmm, I'd strongly prefer that we not attach executable-type things here (those should be in the container image instead). I'll think about it :D

On Wed, Feb 1, 2017 at 4:58 PM, Ryan Lovett notifications@github.com wrote:

@ryanlovett commented on this pull request.

lgtm. My only comment is that the term "data" is not strictly necessary in the contexts here, e.g. SHARED_MOUNTS, /shared/{name}, shared-{name}. Someone may attach something that isn't data, like code or executables.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/data-8/jupyterhub-k8s/pull/105#pullrequestreview-19696134, or mute the thread https://github.com/notifications/unsubscribe-auth/AAB23r2qAv8g9i2O4pJzVMeK95p9vy4xks5rYSo0gaJpZM4L0bAp .

-- Yuvi Panda T http://yuvi.in/blog

ryanlovett commented 7 years ago

But someone might use it for that purpose. I'm not saying it's the right thing to do for our deployment, but as implemented, this is a mechanism for the deployer to attach shared volumes. It is up to the creator of the volume to decide whether the volume strictly contains data.

Regardless, this is more of an observation on my part and I don't want to hold up anything. I can go ahead with the PR if you're not planning any changes.

yuvipanda commented 7 years ago

I haven't tested it at all - do you have time to test it?

ryanlovett commented 7 years ago

Yes, I will try it out.

Ryan

ryanlovett commented 7 years ago

Via https://cloud.google.com/compute/docs/disks/add-persistent-disk:

# Create a 10 GB SSD persistent disk and attach it to the provisioner VM.
gcloud compute disks create ${disk_name} --size 10 --type pd-ssd
gcloud compute instances attach-disk provisioner-01 --disk ${disk_name}
# List attached devices; the by-id name below may differ on your VM.
ls -lLrt /dev/disk/by-id/
blkdev=/dev/disk/by-id/google-persistent-disk-1
# Format the disk as ext4, then mount it.
sudo mkfs.ext4 -F -E lazy_itable_init=0,lazy_journal_init=0,discard ${blkdev}
sudo mkdir -p /mnt/disks/${disk_name}
sudo mount -o discard,defaults ${blkdev} /mnt/disks/${disk_name}/
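
The hard-coded `google-persistent-disk-1` path depends on the order in which disks were attached. A minimal sketch of one way to make it predictable, assuming the disk is attached with gcloud's `--device-name` flag so that the symlink becomes `/dev/disk/by-id/google-<device-name>`. The helper `device_path` and the placeholder disk name are hypothetical, not part of the original steps:

```shell
# Hypothetical helper: print the by-id path for a disk that was attached
# with --device-name "$1".
device_path() {
  printf '/dev/disk/by-id/google-%s\n' "$1"
}

# Placeholder disk name, used only if disk_name is not already set.
disk_name="${disk_name:-shared-example}"

# The attach step is shown as a comment because it needs gcloud and a real VM:
#   gcloud compute instances attach-disk provisioner-01 \
#     --disk "${disk_name}" --device-name "${disk_name}"

blkdev="$(device_path "${disk_name}")"
echo "${blkdev}"
```

With this, `${blkdev}` can be passed to the `mkfs.ext4` and `mount` commands without first checking `ls` output.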

Add data to the volume.

Create PV:

cat <<EOF > pv-${name}.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv-${name}
  labels:
    type: local
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadOnlyMany
  gcePersistentDisk:
    pdName: "${disk_name}"
    fsType: "ext4"
    readOnly: true
EOF
kubectl --namespace=${ns} create -f pv-${name}.yaml

Create PVC:

cat <<EOF > pvc-${name}.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-${name}
spec:
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      storage: 10Gi
EOF
kubectl --namespace=${ns} create -f pvc-${name}.yaml
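
The claim above does not name a volume, so Kubernetes is free to bind it to any available PV that satisfies the request. A rough sketch of pinning the claim to the matching PV with `volumeName`, assuming a Kubernetes version that supports `storageClassName` (an empty string disables dynamic provisioning). The function name `render_pvc` is hypothetical, and this is a variant to consider rather than what was tested in this thread:

```shell
# Hypothetical helper: print a PVC manifest pre-bound to pv-<name>.
# $1 = dataset name, $2 = requested size (defaults to 10Gi).
render_pvc() {
  cat <<EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-$1
spec:
  # Empty class: assumption, avoids dynamic provisioning of a new disk.
  storageClassName: ""
  # Bind to the specific PV created above.
  volumeName: pv-$1
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      storage: ${2:-10Gi}
EOF
}

# Print the manifest for a placeholder dataset name.
render_pvc "${name:-example}"
```

The output can be piped straight to the cluster, e.g. `render_pvc "${name}" | kubectl --namespace=${ns} create -f -`.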

Unmount and detach the volume.

sudo umount /mnt/disks/${disk_name}
gcloud compute instances detach-disk provisioner-01 --disk ${disk_name}
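
Once the disk is detached from the provisioner, a throwaway pod can check that the claim mounts. This is only a sketch: the pod name, the `busybox` image, and the helper `render_check_pod` are assumptions, and the mount path follows the `/data/shared` convention from the title of this PR.

```shell
# Hypothetical helper: print a one-shot pod that lists the shared dataset.
# $1 = dataset name, matching pv-<name> / pvc-<name> above.
render_check_pod() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: check-shared-$1
spec:
  restartPolicy: Never
  containers:
    - name: ls
      image: busybox
      command: ["ls", "-l", "/data/shared/$1"]
      volumeMounts:
        - name: shared
          mountPath: /data/shared/$1
          readOnly: true
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: pvc-$1
        readOnly: true
EOF
}

# Print the manifest for a placeholder dataset name.
render_check_pod "${name:-example}"
```

To try it, pipe the output to `kubectl --namespace=${ns} create -f -` and then read the listing with `kubectl --namespace=${ns} logs check-shared-${name}`.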