canonical / notebook-operators

Charmed Jupyter Notebooks
Apache License 2.0

Explore running JupyterLab with an NGC container #325

Closed NohaIhab closed 10 months ago

NohaIhab commented 11 months ago

What needs to get done

Explore and document the configuration required to run a JupyterLab Notebook using an NGC container. It's expected to be a PodDefault that sets the entrypoint.

Why it needs to get done

Currently, NGC containers don't run on Kubeflow Jupyter Notebooks out of the box. We want to enable CKF users to spin up a Notebook with an NGC container.

NohaIhab commented 11 months ago

Inspecting the NGC image nvcr.io/nvidia/pytorch:23.09-py3 shows that the entrypoint is:

            "Entrypoint": [
                "/opt/nvidia/nvidia_entrypoint.sh"
            ],
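
For reference, a minimal Python sketch of reading that field programmatically. It assumes the fragment above came from `docker inspect <image>`, which prints a JSON array with one object per image; the helper name `image_entrypoint` and the trimmed sample string are illustrative only.

```python
import json


def image_entrypoint(inspect_output: str) -> list:
    """Return the Entrypoint list from `docker inspect`-style JSON text."""
    data = json.loads(inspect_output)
    # Assumption: the output is a JSON array with one object per inspected
    # image, and the entrypoint lives under Config.Entrypoint.
    entrypoint = data[0]["Config"]["Entrypoint"]
    # The field can be null when an image defines no entrypoint.
    return entrypoint or []


# Trimmed, hand-written sample containing only the field shown above.
SAMPLE = '[{"Config": {"Entrypoint": ["/opt/nvidia/nvidia_entrypoint.sh"]}}]'

if __name__ == "__main__":
    print(image_entrypoint(SAMPLE))
```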

The script nvidia_entrypoint.sh prints out some text and runs scripts that perform checks on drivers (CPU, GPU, and network drivers). We also know from building notebook server rocks that the entrypoint needed to spin up a notebook is:

jupyter lab --notebook-dir="/home/jovyan" --ip=0.0.0.0 --no-browser --port=8888 --ServerApp.token="" --ServerApp.password="" --ServerApp.allow_origin="*" --ServerApp.base_url=${NB_PREFIX} --ServerApp.authenticate_prometheus=False

An important note here is that the base_url argument must be set to the value of the NB_PREFIX environment variable for connections to the notebook to work. The notebook controller injects NB_PREFIX into the pod when it creates it, so we know for sure it will be in the pod spec.
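
A small Python sketch of the same idea, building the argument list from the command above with base_url taken from NB_PREFIX. The function name `jupyter_lab_args`, the fallback to "/" when NB_PREFIX is unset, and the example prefix value are assumptions made for illustration.

```python
import os


def jupyter_lab_args(nb_prefix: str) -> list:
    """Build the jupyter lab argv, with base_url set to the given prefix."""
    return [
        "jupyter", "lab",
        "--notebook-dir=/home/jovyan",
        "--ip=0.0.0.0",
        "--no-browser",
        "--port=8888",
        "--ServerApp.token=",
        "--ServerApp.password=",
        "--ServerApp.allow_origin=*",
        # The server must be served under the prefix injected into the pod,
        # otherwise requests to the notebook URL will not reach it.
        f"--ServerApp.base_url={nb_prefix}",
        "--ServerApp.authenticate_prometheus=False",
    ]


if __name__ == "__main__":
    # Assumption: fall back to "/" when running outside a notebook pod.
    prefix = os.environ.get("NB_PREFIX", "/")
    print(" ".join(jupyter_lab_args(prefix)))
```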

NohaIhab commented 11 months ago

The PodDefault that worked in the end is:

apiVersion: kubeflow.org/v1alpha1
kind: PodDefault
metadata:
  name: ngc
spec:
  args:
  - jupyter
  - lab
  - --notebook-dir
  - /home/jovyan
  - --ip
  - 0.0.0.0
  - --no-browser
  - --port
  - "8888"
  - --NotebookApp.token
  - ""
  - --NotebookApp.password
  - ""
  - --NotebookApp.allow_origin
  - '*'
  - --NotebookApp.base_url
  - $(NB_PREFIX)
  - --NotebookApp.authenticate_prometheus
  - "False"
  command:
  - /opt/nvidia/nvidia_entrypoint.sh
  desc: Configure NVIDIA NGC JupyterLab Notebook
  selector:
    matchLabels:
      ngc: "true"

The jupyter lab command needed to be modified slightly for the Jupyter version in the NGC image: the options use the NotebookApp prefix instead of ServerApp. Note that the variable is written as $(NB_PREFIX), in parentheses. Kubernetes requires this syntax to expand a variable in the args field.
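
As a rough illustration of the $(NB_PREFIX) note, the Python sketch below models how Kubernetes expands variable references in command and args: `$(NAME)` is replaced when the variable is defined, left unchanged when it is not, and `$$` collapses to a literal `$`. Shell-style `${NAME}` is not touched, because no shell is involved. This is a simplified model, not the Kubernetes implementation; the function name `expand_k8s_arg` and the example prefix value are made up.

```python
import re

# Matches either an escaped operator ("$$") or a "$(NAME)" reference.
_REFERENCE = re.compile(r"\$\$|\$\(([A-Za-z_][A-Za-z0-9_.-]*)\)")

# Hypothetical environment for the example.
EXAMPLE_ENV = {"NB_PREFIX": "/notebook/demo/ngc"}


def expand_k8s_arg(arg: str, env: dict) -> str:
    """Expand $(NAME) references in a single args entry (simplified model)."""

    def replace(match):
        if match.group(0) == "$$":
            # "$$" is the escape form and yields a literal "$".
            return "$"
        name = match.group(1)
        # Unresolved references are left exactly as written.
        return env.get(name, match.group(0))

    return _REFERENCE.sub(replace, arg)


if __name__ == "__main__":
    for arg in ["$(NB_PREFIX)", "${NB_PREFIX}", "$$(NB_PREFIX)", "$(MISSING)"]:
        print(repr(arg), "->", repr(expand_k8s_arg(arg, EXAMPLE_ENV)))
```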

NohaIhab commented 11 months ago

Using the PodDefault above and labeling the statefulset of a notebook with ngc: "true", I'm able to create a notebook with the image nvcr.io/nvidia/pytorch:23.09-py3 and connect to it.
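
The label matters because, as far as I understand it, the PodDefault is only applied to pods whose labels satisfy its selector. A minimal Python sketch of matchLabels semantics, where every key/value pair in the selector must be present on the pod; the function name `selector_matches` and the extra `app` label are illustrative assumptions.

```python
def selector_matches(match_labels: dict, pod_labels: dict) -> bool:
    """Return True when every selector label is present with the same value."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


# Selector from the PodDefault above.
NGC_SELECTOR = {"ngc": "true"}

if __name__ == "__main__":
    # Extra labels on the pod do not prevent a match.
    print(selector_matches(NGC_SELECTOR, {"ngc": "true", "app": "my-notebook"}))
    # A missing or different label value means the PodDefault is not applied.
    print(selector_matches(NGC_SELECTOR, {"app": "my-notebook"}))
```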

kimwnasptd commented 10 months ago

The above PodDefault looks good and we also saw it live with Noha.

I'd only suggest we use a more descriptive value for desc. Maybe something like Configure NVIDIA NGC JupyterLab Notebook, to make it explicit that this PodDefault is for:

  1. NVIDIA NGC Images
  2. JupyterLab entrypoints

syncronize-issues-to-jira[bot] commented 10 months ago

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5158.

This message was autogenerated