jupyterhub / batchspawner

Custom Spawner for Jupyterhub to start servers in batch scheduled systems
BSD 3-Clause "New" or "Revised" License

batchspawner-singleuser prevents usage of off-the-shelf jupyter containers #226

Open dr-br opened 3 years ago

dr-br commented 3 years ago

Bug description

Since the introduction of the wrapper script batchspawner-singleuser, we can no longer let our HPC users run off-the-shelf Jupyter container images (e.g. jupyter/datascience-notebook). Instead, the containers must have the batchspawner package installed, which is almost never the case for images pulled from e.g. Docker Hub or nvcr.io. When JupyterLab stacks are loaded via Lmod (non-containerized), these stacks also need batchspawner installed.

Expected behaviour

The old version of batchspawner (0.8.1) made it possible to combine JupyterHub + Slurmspawner with off-the-shelf JupyterLab inside Docker containers (enroot runtime).

Actual behaviour

Things like this happen:

/var/spool/slurmd/job18872/slurm_script: line 30: batchspawner-singleuser: command not found

Your personal set up

We use a regularly installed JupyterHub + batchspawner together with JupyterLab in Docker containers using the enroot runtime. The integration of enroot and Slurm is done via pyxis.

How can we get back to the behaviour of 0.8.1, where batchspawner is not required inside the (containerized or not) JupyterLab stacks being spawned?

wmoore28 commented 3 years ago

The workaround I used for this was just to rewrap the off-the-shelf containers. This was also done with gitlab-ci, so it could be done auto-magically.

ARG   BASE_IMAGE=jupyter/base-notebook:latest
FROM  $BASE_IMAGE
LABEL maintainer="Wesley Moore <wmoore@jlab.org>"

ARG   JUPYTERHUB_VERSION=1.3.0
ARG   JUPYTERLAB_VERSION=3.0.7

USER root

RUN \
  apt-get update && \
  apt-get install -yq \
    bash \
    emacs \
    git \
    slurm-wlm \
    tcsh \
    tmux \
    vim \
    && \
  rm -rf /var/lib/apt/lists/*

USER $NB_UID

RUN pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org \
        jupyterhub==${JUPYTERHUB_VERSION} \
        jupyterlab==${JUPYTERLAB_VERSION} \
        git+https://github.com/jupyterhub/batchspawner

possiblyMikeB commented 3 years ago

Additionally, since the script in question doesn't depend on batchspawner, you can safely run it from inside a stock JupyterLab stack container.

So bind mount a local (or shared) copy of batchspawner-singleuser to a location like /bin/batchspawner-singleuser inside the container.

And assuming all the appropriate environment variables make their way into the running container, when executed it will function as expected.
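To make the environment-variable requirement concrete, here is a minimal sketch of what a port-reporting wrapper of this kind does, using only the Python standard library. It is not the actual batchspawner-singleuser source: the "batchspawner" path under the hub API URL and the helper names pick_free_port and build_port_report are assumptions for illustration. JUPYTERHUB_API_URL and JUPYTERHUB_API_TOKEN are the variables JupyterHub normally passes to single-user servers.

```python
import json
import os
import socket
import urllib.request


def pick_free_port() -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("", 0))
        return sock.getsockname()[1]


def build_port_report(api_url: str, token: str, port: int) -> urllib.request.Request:
    """Build (but do not send) the POST that tells the hub which port was chosen.

    The "batchspawner" path segment is an assumption made for this sketch.
    """
    url = api_url.rstrip("/") + "/batchspawner"
    body = json.dumps({"port": port}).encode("utf-8")
    headers = {
        "Authorization": f"token {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def main(argv):
    """Report a free port to the hub, then exec the real single-user command."""
    # Both variables must reach the container, otherwise the wrapper cannot report back.
    api_url = os.environ["JUPYTERHUB_API_URL"]
    token = os.environ["JUPYTERHUB_API_TOKEN"]
    port = pick_free_port()
    with urllib.request.urlopen(build_port_report(api_url, token, port)):
        pass
    # Replace this process with the wrapped command, e.g. jupyterhub-singleuser.
    os.execvp(argv[0], list(argv) + [f"--port={port}"])
```

A wrapper script built this way would call main(sys.argv[1:]). If either variable is missing inside the container, it fails before the notebook server starts, which is why the environment has to be passed through the container runtime.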

dr-br commented 3 years ago

Thanks for the replies! @wmoore28: Our HPC users should be able to use any image without rebuilding it, so unfortunately that is not a solution in our case. @possiblyMikeB: That could be an appropriate solution; however, I'm not sure whether there will be a Python interpreter issue with it. I will test ASAP.

Currently we simply put a python -m pip install --user batchspawner in the JupyterHub spawner prologue, and that works. However, the package(s) have to be installed in an image-specific volume/directory, because interpreter errors occur if the Python versions in the different images differ.
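One way to keep such per-image installs apart is to derive the pip user base from the image name and point PYTHONUSERBASE at it before installing. The sketch below only builds the prologue text. The directory layout under $HOME/.batchspawner and the helper names image_slug and build_prologue are hypothetical, and it assumes the image reference is known when the prologue is generated.

```python
import re


def image_slug(image: str) -> str:
    """Turn an image reference into a directory-safe name."""
    return re.sub(r"[^A-Za-z0-9._-]+", "_", image).strip("_")


def build_prologue(image: str, base_dir: str = "$HOME/.batchspawner") -> str:
    """Return shell lines that install batchspawner into an image-specific user base."""
    # base_dir is a hypothetical location; any per-user writable path would do.
    user_base = f"{base_dir}/{image_slug(image)}"
    return "\n".join([
        f'export PYTHONUSERBASE="{user_base}"',
        # pip's --user scheme installs under PYTHONUSERBASE; scripts land in its bin/ directory.
        "python -m pip install --user batchspawner",
        'export PATH="$PYTHONUSERBASE/bin:$PATH"',
    ])


print(build_prologue("jupyter/datascience-notebook:latest"))
```

The returned string could, for example, be assigned to the spawner's req_prologue option in jupyterhub_config.py. Images that share a name but ship different Python versions would still collide, so the Python version may need to be added to the key as well.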