pangeo-data / pangeo

Pangeo website + discussion of general issues related to the project.
http://pangeo.io

Update jupyter_config.yaml to use initContainers for git clone? #643

Closed Boes-man closed 4 years ago

Boes-man commented 5 years ago

(Attachment: jupyter_config.yaml.txt) I am working on use cases where Pangeo needs to run on private Kubernetes clusters rather than on one of the public cloud managed offerings such as GCE. To that end I am using upstream Kubernetes (1.14). While provisioning Pangeo I found a warning that the current method of mounting the git repo ( gitRepo: repository: "https://github.com/pangeo-data/pangeo-custom-jupyterhub-templates.git") is deprecated, as per https://kubernetes.io/docs/concepts/storage/volumes/#gitrepo

I changed my jupyter_config.yaml to use initContainers instead. Switching the default config to this approach would avoid the deprecated volume type and make Pangeo easier to adopt on clusters like mine.

# file: jupyter_config.yaml

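# NOTE: as posted, this block sits at the top level of the file; the hub pod
# only picks it up if the chart exposes an initContainers setting for the hub.
initContainers:
  - name: clone-git    # container names must be DNS labels, so no underscores
    image: alpine/git  # plain alpine does not ship git
    # Clone into a directory named after the repo so the subPath mounts below resolve.
    command: ['git', 'clone', 'https://github.com/pangeo-data/pangeo-custom-jupyterhub-templates.git', '/tmp/data/pangeo-custom-jupyterhub-templates']
    volumeMounts:
    - name: custom-templates
      mountPath: /tmp/data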

jupyterhub:
  singleuser:
    extraEnv:
      EXTRA_PIP_PACKAGES: >-
      GCSFUSE_BUCKET: 
    storage:
      extraVolumes:
        - name: fuse
          hostPath:
            path: /dev/fuse
      extraVolumeMounts:
        - name: fuse
          mountPath: /dev/fuse
    cloudMetadata:
      enabled: true
    cpu:
      limit: 4
      guarantee: 1
    memory:
      limit: 14G
      guarantee: 4G

  hub:
    extraConfig:
      customPodHook: |
        from kubernetes import client
        def modify_pod_hook(spawner, pod):
            pod.spec.containers[0].security_context = client.V1SecurityContext(
                privileged=True,
                capabilities=client.V1Capabilities(
                    add=['SYS_ADMIN']
                )
            )
            return pod
        c.KubeSpawner.modify_pod_hook = modify_pod_hook
        c.JupyterHub.logo_file = '/usr/local/share/jupyter/hub/static/custom/images/logo.png'
        c.JupyterHub.template_paths = ['/usr/local/share/jupyter/hub/custom_templates/',
                                      '/usr/local/share/jupyter/hub/templates/']
    extraVolumes:
      - name: custom-templates
        emptyDir: {}
    extraVolumeMounts:
      - mountPath: /usr/local/share/jupyter/hub/custom_templates
        name: custom-templates
        subPath: "pangeo-custom-jupyterhub-templates/templates"
      - mountPath: /usr/local/share/jupyter/hub/static/custom
        name: custom-templates
        subPath: "pangeo-custom-jupyterhub-templates/assets"

  cull:
    enabled: true
    users: false
    timeout: 1200
    every: 600

  # this section specifies the IP address for pangeo
  proxy:
    service:
      loadBalancerIP: 
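
For anyone converting other gitRepo volumes, the same change can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function name `git_repo_to_init_container` and the `alpine/git` image are hypothetical choices, not part of the Pangeo or zero-to-jupyterhub charts. The helper keeps the repo-named subdirectory that a gitRepo volume would produce, so subPath values such as pangeo-custom-jupyterhub-templates/templates should keep resolving.

```python
import re
from typing import Dict, Tuple

# Kubernetes container names must be RFC 1123 DNS labels (no underscores).
DNS_LABEL = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")


def git_repo_to_init_container(
    volume: Dict, image: str = "alpine/git", mount_path: str = "/tmp/data"
) -> Tuple[Dict, Dict]:
    """Return (init_container, empty_dir_volume) replacing a gitRepo volume."""
    name = volume["name"]
    container_name = "clone-" + name
    if len(container_name) > 63 or not DNS_LABEL.match(container_name):
        raise ValueError("%r is not a valid container name" % container_name)

    repo_url = volume["gitRepo"]["repository"]
    # Directory name git picks by default: last URL segment minus ".git".
    repo_dir = repo_url.rstrip("/").rsplit("/", 1)[-1]
    if repo_dir.endswith(".git"):
        repo_dir = repo_dir[: -len(".git")]

    init_container = {
        "name": container_name,
        "image": image,  # assumption: any image that ships git would do
        "command": ["git", "clone", repo_url, mount_path + "/" + repo_dir],
        "volumeMounts": [{"name": name, "mountPath": mount_path}],
    }
    # The emptyDir volume is shared between the init container and the hub.
    empty_dir_volume = {"name": name, "emptyDir": {}}
    return init_container, empty_dir_volume
```

Passing the original `custom-templates` gitRepo volume returns the two dictionaries, which can then be dumped to YAML and pasted into the config by hand.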

jhamman commented 5 years ago

ping @scottyhq as he's been doing something similar for our AWS deployments.

More generally, I wonder if we can upstream the generic parts of the jupyterhub template/logo customization to either the pangeo or zero2jupyterhub helm chart. In my experience, the current system is pretty brittle. Thoughts from @yuvipanda on what is possible?

scottyhq commented 5 years ago

Thanks @Boes-man. Didn’t realize that approach would work. There is also a pull request to add initContainers under the hub config: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/1270

yuvipanda commented 5 years ago

@jhamman I think that'd be welcome! Maybe mount it into a configmap? Or customize the hub image more easily?

Agree that initContainers are a much better fit than gitRepo volumes.
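
The configmap idea could look roughly like the sketch below, which packs the HTML templates from a local checkout into a ConfigMap manifest. Everything named here is an assumption for illustration: the function `templates_to_configmap`, the default name `hub-custom-templates`, and the choice to include only top-level `*.html` files. Binary assets such as the logo are skipped because the `data` field holds text only.

```python
from pathlib import Path
from typing import Dict


def templates_to_configmap(template_dir: str, name: str = "hub-custom-templates") -> Dict:
    """Build a ConfigMap manifest (as a dict) from the *.html files in a directory."""
    data = {}
    # ConfigMap keys cannot contain "/", so only top-level files are included.
    for path in sorted(Path(template_dir).glob("*.html")):
        data[path.name] = path.read_text(encoding="utf-8")
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": data,
    }
```

The resulting dict can be serialized to YAML or JSON and applied to the cluster, then mounted at the hub's custom template path in place of the cloned repo. Whether this beats a customized hub image probably depends on how often the templates change.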

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 4 years ago

This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date.