jupyterhub / zero-to-jupyterhub-k8s

Helm Chart & Documentation for deploying JupyterHub on Kubernetes
https://zero-to-jupyterhub.readthedocs.io

Best way to install additional Kernels? #608

Closed slecrenski closed 4 years ago

slecrenski commented 6 years ago

Documentation on the readthedocs makes no mention of how to add additional Kernels.

https://zero-to-jupyterhub.readthedocs.io/en/latest/search.html?q=kernel&check_keywords=yes&area=default

jupyterhub/k8s-singleuser-sample:v0.6 seems to contain only a Python 3 kernel.

What is the best way to install additional Kernels?

Looking for Spark, golang, R etc.

Something like this.

https://hub.docker.com/r/jupyter/all-spark-notebook/

Do I need to create my own image from a new Dockerfile?

FROM jupyter/base-notebook:27ba57364579
...

#config.yaml
singleuser:
  image: 
    name: ???
    tag: latest

Thanks. Nice framework btw.

betatim commented 6 years ago

> Do I need to create my own image from a new Dockerfile?

Yes. The way to change which kernels and notebook extensions are available to users is to switch out the singleuser image. You can probably start from an existing one, as you suggested with the FROM jupyter/base-...
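
As a rough illustration of how a custom image ends up in config.yaml, here is a minimal sketch. The helper name `render_singleuser_image`, the registry `registry.example.org/me/singleuser`, and the tag `v1` are placeholders made up for this example, not anything the chart provides. The `docker build` and `docker push` lines are only shown as comments.

```bash
# Sketch: print the config.yaml fragment that points the chart at a custom image.
# render_singleuser_image is a hypothetical helper; the image name and tag
# used below are placeholders.
render_singleuser_image() {
    cat <<EOF
singleuser:
  image:
    name: $1
    tag: $2
EOF
}

# Typical flow (not run here): build and push the image first, e.g.
#   docker build -t registry.example.org/me/singleuser:v1 .
#   docker push registry.example.org/me/singleuser:v1
render_singleuser_image registry.example.org/me/singleuser v1
```

The `name` field takes the image repository without the tag, and the tag goes in its own field. A fixed tag is generally easier to reason about than `latest` when upgrading.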

slecrenski commented 6 years ago

Are there special components that enable the k8s integration or will the base image suffice?

Where is the Dockerfile for the jupyterhub/k8s-singleuser-sample:v0.6 to review?

The only Kernel that seems to be available in the image is python. Why not provide a sort of Kernel package manager or add some documentation on how to get additional kernels?

Will I need to rebuild the single-user image for each upgrade of jupyterhub and only use the version specific base image that is coupled with each jupyter build?

choldgraf commented 6 years ago

Another option is to check out https://zero-to-jupyterhub.readthedocs.io/en/latest/user-environment.html#build-a-custom-docker-image-with-repo2docker . repo2docker will build a jupyterhub-ready docker image using the text files inside of a repository to install the kernel needed to run the code.

In terms of the documentation, are you expecting to see a section called "Installing non-python kernels"? (I'm trying to think here of the modification that would make this more obvious to users)

> The only Kernel that seems to be available in the image is python. Why not provide a sort of Kernel package manager or add some documentation on how to get additional kernels?

repo2docker/binderhub gets you part of the way towards kernels on demand. A more generic "Kernel package manager" sounds like a fairly complex proposition, though it is a cool idea. Don't forget we're an open-source project, so we depend on community developer time :-)

yuvipanda commented 6 years ago

> The only Kernel that seems to be available in the image is python. Why not provide a sort of Kernel package manager or add some documentation on how to get additional kernels?

We totally should add additional documentation! #499 and #498 are related, I think. What kind of docs would you have liked to see?

> Will I need to rebuild the single-user image for each upgrade of jupyterhub and only use the version specific base image that is coupled with each jupyter build?

Yes, you will need to make sure that the JupyterHub version in the chart matches the version of the JupyterHub package in your user container.
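
A small sketch of what such a check could look like, assuming that agreement on major.minor is enough (the thread does not say how strict the match must be). `major_minor` and `hub_versions_match` are hypothetical helpers, and the version numbers are made up. Nothing here queries Helm or Docker; both values are passed in by hand.

```bash
# Print the "major.minor" part of a version string, e.g. 0.8.1 -> 0.8
major_minor() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeed (exit status 0) when both versions share the same major.minor.
# Assumption: major.minor agreement is what "matches" means here.
hub_versions_match() {
    [ "$(major_minor "$1")" = "$(major_minor "$2")" ]
}

# Example with made-up values: chart-side version, then image-side version.
if hub_versions_match "0.8.1" "0.8"; then
    echo "compatible"
else
    echo "rebuild the user image"
fi
```

The image-side value could be read with `pip show jupyterhub` inside the user container, in the same way `pip show` is used elsewhere in this thread.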

> Where is the Dockerfile for the jupyterhub/k8s-singleuser-sample:v0.6 to review?

See https://github.com/jupyterhub/zero-to-jupyterhub-k8s/tree/master/images/singleuser-sample


slecrenski commented 6 years ago

I guess I'd like to see a section titled "Installing additional kernels", in a persistent way of course. The page at https://zero-to-jupyterhub.readthedocs.io/en/latest/user-environment.html#build-a-custom-docker-image-with-repo2docker does have some useful information. Or maybe incorporate the building of the base-notebook into the helm deployment? I dunno.

Bit nervous about compatibility when I upgrade. Not too fond of having to rebuild the docker image for every upgrade.

I started playing with a potential Dockerfile here. I'll try out repo2docker and get back. I'll try out a few options.

FROM jupyter/base-notebook:28ba57364579

USER root

ARG JUPYTERHUB_VERSION=0.8
RUN pip install --trusted-host pypi.python.org --no-cache jupyterhub==$JUPYTERHUB_VERSION

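# Install build dependencies in the same layer as the index update,
# so a cached `apt-get update` layer is never reused with a stale index
RUN apt-get update && apt-get install -y gcc libkrb5-dev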

RUN pip install --trusted-host pypi.python.org sparkmagic

RUN jupyter nbextension enable --py --sys-prefix widgetsnbextension

RUN pip show sparkmagic

RUN ["/bin/pwd"]

WORKDIR /opt/conda/lib/python3.6/site-packages

RUN  jupyter-kernelspec install sparkmagic/kernels/sparkkernel \
     && jupyter-kernelspec install sparkmagic/kernels/pysparkkernel \
     && jupyter-kernelspec install sparkmagic/kernels/pyspark3kernel \
     && jupyter-kernelspec install sparkmagic/kernels/sparkrkernel

WORKDIR /home/jovyan

RUN jupyter serverextension enable --py sparkmagic

USER jovyan

choldgraf commented 6 years ago

@slecrenski note that the rebuild should only be needed if the version of JupyterHub that the Helm Chart uses becomes different from the one in your image. @yuvipanda correct me if I'm wrong on that one.

I like the idea of "add your own kernels". @slecrenski let us know when you've got something that works nicely for you, and perhaps we can use this to prototype a docs addition

betatim commented 6 years ago

Having some more concrete examples of modified user environments would be good.

How would a "install additional kernels" section look though? It feels like the only generic advice we can give is "follow the instructions in the documentation for the kernel you want to install". Maybe link out to an externally maintained example of adding a not-python based kernel to a docker image (for jupyterhub)?

tracek commented 6 years ago

I already have a pretty rich setup that involves extra kernels (e.g. Apache Toree for Spark), plotting widgets support (bokeh, plotly) and RStudio server. It is based on the datascience notebook but inherits from the minimal notebook, as the former works with e.g. an old protobuf (and I need a new one for Tensorflow 1.6).

How about I document how it works? It's pretty obvious for anyone familiar with Docker, but:

  1. Not everyone reaching z2jh is, and
  2. Even if you are, you might not know about some amazing things like RStudio integration.

Let me know if you would find it useful!

jgerardsimcock commented 6 years ago

@tracek What is the RStudio integration?

Would this not be solved with a dropdown of different docker images? https://github.com/jupyterhub/kubespawner/pull/137

h4gen commented 5 years ago

Hey everybody,

I wrote a few lines on how to integrate Spark on Kubernetes into Jupyterhub at #1030. Maybe this helps somebody who stumbles upon this issue. Feel free to leave feedback.

farzadz commented 5 years ago

Is there any easy way to add a python2.7 kernel to the single user image? I tried all of the above, but the kernel does not appear and seems not to be installed. My latest attempt is:

FROM jupyterhub/singleuser:latest
USER root
RUN apt-get update -y
RUN apt-get install -y yes
RUN apt-get install -y build-essential

RUN chown -R jovyan /home/jovyan/
RUN chmod g+s /home/jovyan/

WORKDIR /

RUN conda update -n base conda && \
        conda clean -tipsy

RUN conda create -n py27 python=2.7 anaconda

RUN ["/bin/bash", "-c" , "source activate py27 && ipython kernel install && source deactivate"]

USER jovyan
WORKDIR /home/jovyan/

I'm mounting a directory from the host to the container with a prespawn hook.

consideRatio commented 5 years ago

I'm not sure, but perhaps there are details like this to be learned from the jupyter/docker-stacks repo. The images built from the Dockerfiles in that repo work fine with z2jh and have multiple kernels installed.

manics commented 5 years ago

https://ipython.readthedocs.io/en/latest/install/kernel_install.html#kernels-for-python-2-and-3 suggests python -m ipykernel install --user
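
For context, registering a kernel comes down to placing a small kernel.json in a directory that Jupyter searches. The sketch below writes such a file by hand to show roughly what ends up on disk. `write_python_kernelspec` is a hypothetical helper, not part of ipykernel, and the py27 environment path is assumed from the Dockerfile earlier in this thread. The target environment still needs ipykernel installed for the kernel to start.

```bash
# Hand-written sketch of a kernelspec for a Python environment.
# Arguments: <python executable> <kernel name> <prefix>
write_python_kernelspec() {
    python_bin=$1
    kernel_name=$2
    prefix=$3
    spec_dir="$prefix/share/jupyter/kernels/$kernel_name"
    mkdir -p "$spec_dir"
    # {connection_file} is a placeholder that Jupyter fills in at launch time
    cat > "$spec_dir/kernel.json" <<EOF
{
  "argv": ["$python_bin", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "display_name": "$kernel_name",
  "language": "python"
}
EOF
    # Report where the spec was written
    printf '%s\n' "$spec_dir/kernel.json"
}

# Example: write a spec for an assumed py27 environment under a scratch prefix.
demo_prefix=$(mktemp -d)
write_python_kernelspec /opt/conda/envs/py27/bin/python py27 "$demo_prefix"
```

In practice the `python -m ipykernel install` command is the better route; the sketch is only meant to show why the interpreter path of the target environment matters.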

farzadz commented 5 years ago

@manics @consideRatio Thank you for your replies. I have successfully added nodejs and python2 kernels with the following Dockerfile:

FROM jupyterhub/singleuser:latest

USER root
RUN apt-get update -y
RUN apt-get install -y build-essential sudo git
RUN apt-get install -y yes

RUN chown -R jovyan /home/jovyan/
RUN chmod g+s /home/jovyan/

RUN conda update -n base conda && \
        conda clean -tipsy

# Installing nodejs kernel
RUN sudo apt install -y npm && \
    yes | sudo npm install -g ijavascript && \
    ijsinstall --install=global

# Installing R kernel
RUN conda install -c r r-irkernel

# Installing jupyterlab
RUN conda install -c conda-forge jupyterlab

# ipythonwidgets installation
RUN pip install ipywidgets
RUN jupyter nbextension enable --py widgetsnbextension
RUN jupyter labextension install @jupyter-widgets/jupyterlab-manager

# Map libraries installation
RUN mkdir -p ~/R/library
RUN pip install folium

#RUN R -e 'install.packages("sf", dependencies= TRUE, \
#  repos="http://cran.us.r-project.org", lib="~/R/library")'
#
#RUN R -e 'install.packages("maps", dependencies= TRUE, repos="http://cran.us.r-project.org", lib="~/R/library")'

# Git extension installation
RUN jupyter labextension install @jupyterlab/git
RUN pip install -e git+https://github.com/jupyterlab/jupyterlab-git.git#egg=jupyterlab_git
RUN jupyter serverextension enable --py jupyterlab_git --sys-prefix

# Adding python2 kernel
RUN conda create -n py27 python=2.7 anaconda
WORKDIR /
RUN ["/bin/bash" , "-c", "source activate py27 && \
    python -m pip install ipykernel && \
    python -m ipykernel install && \
    source deactivate"]
WORKDIR /home/jovyan

RUN chmod -R 755 /home/jovyan/

The Nodejs and python2 kernels work fine. However, the R kernel keeps restarting and hangs forever. Any ideas about what could be done?

betatim commented 5 years ago

@choldgraf

manics commented 4 years ago

I'm closing this as it's quite old and it's a support issue related to the singleuser side, not Z2JH. If you're still having problems please post on the Jupyter Community Forum. Thanks!

Sieboldianus commented 1 year ago

I am adding my comment here after working for 3 days on adding another conda Kernel to the JupyterLab base images.

The startup CMD for JupyterHub, start-singleuser.sh (and start.sh), appears to apply some modifications that remove the conda kernelspecs installed in my Docker image:

# Install kernelspec system-wide
RUN  /opt/conda/envs/worker_env/bin/python \
     -m ipykernel install --prefix=/usr/local

In JupyterHub/Kubernetes, the kernel spec was not linked and I had to re-run the above, so the user could select worker_env. Kubernetes config:

singleuser:
  image:
    # You should replace the "latest" tag with a fixed version from:
    # https://hub.docker.com/r/jupyter/datascience-notebook/tags/
    # Inspect the Dockerfile at:
    # https://github.com/jupyter/docker-stacks/tree/HEAD/datascience-notebook/Dockerfile
    name: registry.gitlab.mydomain.org/my/custom/singleuser-user-k8s
    tag: v0.1.10
    # `cmd: null` allows the custom CMD of the Jupyter docker-stacks to be used
    # which performs further customization on startup.
  cmd: null
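
One way to see whether a kernelspec is visible inside the running pod is to look in the directories Jupyter searches. The sketch below is a plain-shell stand-in for that check. `list_kernelspecs` is a hypothetical helper, and the directory list assumes the default Linux locations plus the /opt/conda prefix used in this thread. When the `jupyter` command is available in the container, `jupyter kernelspec list` is the authoritative tool.

```bash
# Sketch: list kernelspecs found under the given directories.
list_kernelspecs() {
    for kernels_dir in "$@"; do
        # Skip search locations that do not exist in this container
        [ -d "$kernels_dir" ] || continue
        for spec in "$kernels_dir"/*/kernel.json; do
            [ -f "$spec" ] || continue
            # Print "<kernel name> <directory it was found in>"
            printf '%s %s\n' "$(basename "$(dirname "$spec")")" "$kernels_dir"
        done
    done
}

# Assumed search locations: user, conda prefix, then system-wide.
list_kernelspecs \
    "$HOME/.local/share/jupyter/kernels" \
    /opt/conda/share/jupyter/kernels \
    /usr/local/share/jupyter/kernels \
    /usr/share/jupyter/kernels
```

Running this once in a container started directly from the image and once in a pod spawned by the hub would show whether the spec is missing from the image or only from the spawned environment.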

My solution was to modify start-singleuser.sh:

start-singleuser.sh

```bash
#!/bin/bash
# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.

set -e

# set default ip to 0.0.0.0
if [[ "${NOTEBOOK_ARGS} $*" != *"--ip="* ]]; then
    NOTEBOOK_ARGS="--ip=0.0.0.0 ${NOTEBOOK_ARGS}"
fi

# link worker env kernel
/bin/bash -c '/opt/conda/envs/worker_env/bin/python -m ipykernel \
    install --user --name worker_env \
    --display-name="worker_env"'

# shellcheck disable=SC1091,SC2086
. /usr/local/bin/start.sh jupyterhub-singleuser ${NOTEBOOK_ARGS} "$@"
```

.. and copy/overwrite it in my Dockerfile:

FROM jupyterhub/k8s-singleuser-sample:1.1.3-n269.hf706eaa2
...
# add additional conda env (worker_env)
...
USER ${NB_UID}
...
COPY start-singleuser.sh /usr/local/bin/
...
# Switch back to jovyan to avoid accidental container runs as root
USER ${NB_UID}

WORKDIR "${HOME}"

Now when a user spawns a JupyterLab, they see the env in the list without doing anything.

Here is the full Dockerfile, for reference.