jupyter-server / enterprise_gateway

A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.
https://jupyter-enterprise-gateway.readthedocs.io/en/latest/

Exchange files between EG kernel pods and notebook Client #875

Closed abinassahoo closed 4 years ago

abinassahoo commented 4 years ago

Hi Team,

I am using EG deployed with Helm and am trying to connect to it from a notebook client running in a Docker container. Kindly help me resolve the use case below.

I write a code snippet in my notebook that gets executed in a kernel container. The execution is time-consuming and produces some output that my code snippet writes to a local file, i.e. a file in the kernel container. I'd like to have this output file pulled to the notebook client so I can put it in my persistent space, i.e. so that I do not lose it when the kernel stops.

I am using KERNEL_NAMESPACE to create all the kernel pods in a separate namespace. Can anyone please clarify how this can be achieved using Helm? If it can be done with a volume mount, please suggest the process in Helm.

Environment

kevin-bates commented 4 years ago

I am using KERNEL_NAMESPACE to create all the kernel pods in a separate namespace.

So all kernel pods are in a single namespace, or they're each in their own namespace? I suspect the former.

One approach would be to add the appropriate mount information into the kernel-pod.yaml file that is associated with each kernel. This yaml file is used by EG to configure the kernel pod. We refer to two kinds of mounts (briefly) in the docs - unconditional mounts (where all launched kernels would get the same mount point) and conditional mounts (where the mount information may vary per kernel - typically KERNEL_USERNAME for things like home directory mounts, etc.).

When going through this exercise, note that any "parameters" prefixed with kernel_ are associated with the uppercased environment variable KERNEL_ and all KERNEL_ environment variables are automatically flowed from the notebook client to be accessible by the target kernel during its launch.
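
As a rough illustration of that naming convention, here is a minimal sketch of a kernel-pod.yaml.j2 fragment. KERNEL_OUTPUT_DIR is a hypothetical variable name invented for this example (it is not something EG defines), and the sketch assumes the notebook client exports it before the kernel is started:

```yaml
# Fragment of the container spec in a kernel-pod.yaml.j2 (sketch only, not a full template).
env:
- name: KERNEL_USERNAME            # filled from the client's KERNEL_USERNAME
  value: "{{ kernel_username }}"
{% if kernel_output_dir is defined %}
- name: KERNEL_OUTPUT_DIR          # hypothetical; emitted only when the client sets it
  value: "{{ kernel_output_dir }}"
{% endif %}
```

The `is defined` guard keeps the template renderable for clients that never set the extra variable.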

To expose the kernelspecs so you can more easily configure the kernel-pod.yaml files, we would recommend using an NFS mount or mount the specs from an image. The deployment helm chart should support either of these. Or you could modify the helm chart to configure your own approach.

Enterprise Gateway Version : 2.3.0

Do you mean EG 2.2.0? Or are you building EG yourself and using 2.3.0.dev0?

abinassahoo commented 4 years ago

@kevin-bates Thank you for the response. Here are the answers to your questions.

So all kernel pods are in a single namespace, or they're each in their own namespace? I suspect the former.

Yes, all the kernel pods are in one namespace, which I define in the KERNEL_NAMESPACE env variable.

Do you mean EG 2.2.0? Or are you building EG yourself and using 2.3.0.dev0?

I am using the version of EG below.
gateway_version: 2.3.0.dev1
docker image: elyra/enterprise-gateway:dev

To expose the kernelspecs so you can more easily configure the kernel-pod.yaml files, we would recommend using an NFS mount or mount the specs from an image. The deployment helm chart should support either of these. Or you could modify the helm chart to configure your own approach.

Yes, I am using a custom Docker image to mount the kernelspecs, following the reference in the deployment Helm chart.

Could you please point me to any documentation or a sample kernel-pod.yaml.j2 file that already has sample volumes (conditional and unconditional) mounted?

kevin-bates commented 4 years ago

Could you please point me to any documentation or a sample kernel-pod.yaml.j2 file that already has sample volumes (conditional and unconditional) mounted?

I've asked for help from the community on this - I have zero bandwidth for this at the moment. The referenced issue references some of the previous work that essentially includes an example. Should you successfully use this approach, it would be greatly appreciated if you could provide a pull request to update the documents. Thank you.

lucabem commented 4 years ago

Hi @abinassahoo - here is an example of what you want:

apiVersion: v1
kind: Pod
metadata:
  name: "{{ kernel_pod_name }}"
  namespace: "{{ kernel_namespace }}"
  labels:
    kernel_id: "{{ kernel_id }}"
    app: enterprise-gateway
    component: kernel
spec:
  restartPolicy: Never
  serviceAccountName: "{{ kernel_service_account_name }}"
  securityContext:
    runAsUser: 0
    runAsGroup: 0
    fsGroup: 100
  containers:
  - env:
    - name: EG_RESPONSE_ADDRESS
      value: "{{ eg_response_address }}"
    - name: KERNEL_LANGUAGE
      value: "{{ kernel_language }}"
    - name: KERNEL_SPARK_CONTEXT_INIT_MODE
      value: "{{ kernel_spark_context_init_mode }}"
    - name: KERNEL_NAME
      value: "{{ kernel_name }}"
    - name: KERNEL_USERNAME
      value: "{{ kernel_username }}"
    - name: KERNEL_ID
      value: "{{ kernel_id }}"
    - name: KERNEL_NAMESPACE
      value: "{{ kernel_namespace }}"
    - name: KERNEL_ENV
      value: "{{ kernel_env }}"
    image: "{{ kernel_image }}"
    name: "{{ kernel_pod_name }}"
    securityContext:
      capabilities:
        add: ["SYS_ADMIN"]
    workingDir: "{{ kernel_working_dir }}"
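    # Unconditional mount: every kernel launched from this template gets it.
    volumeMounts:
    - name: kernelspecs
      mountPath: "/usr/local/share/jupyter/kernels"
  volumes:
  # Backing NFS export; replace <nfs-ip-server> with the address of your NFS server.
  - name: kernelspecs
    nfs:
      server: <nfs-ip-server>
      path: "/usr/local/share/jupyter/kernels"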
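
The mount in the template above is unconditional: every kernel pod gets the same kernelspecs volume. A conditional variant, where the mount depends on a KERNEL_ value such as KERNEL_USERNAME, might look like the sketch below. It is only an illustration under assumptions: the volume name user-output, the mount path /mnt/output and the export path /export/home/... are hypothetical, and a per-user NFS export is assumed to already exist.

```yaml
    # Replaces the volumeMounts/volumes sections of the template above (sketch only).
    volumeMounts:
    - name: kernelspecs                        # unconditional, as before
      mountPath: "/usr/local/share/jupyter/kernels"
{% if kernel_username is defined %}
    - name: user-output                        # conditional: only when KERNEL_USERNAME is set
      mountPath: "/mnt/output"                 # hypothetical path inside the kernel container
{% endif %}
  volumes:
  - name: kernelspecs
    nfs:
      server: <nfs-ip-server>
      path: "/usr/local/share/jupyter/kernels"
{% if kernel_username is defined %}
  - name: user-output
    nfs:
      server: <nfs-ip-server>
      path: "/export/home/{{ kernel_username }}"   # hypothetical per-user export
{% endif %}
```

With a mount like this, notebook code that writes its results under /mnt/output would leave them on the NFS share after the kernel pod stops, provided the export is writable by the user the kernel runs as.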

abinassahoo commented 4 years ago

Thank you for the response, this seems to have worked for me.