jupyterhub / jupyter-rsession-proxy

Jupyter extensions for running an RStudio rsession proxy
BSD 3-Clause "New" or "Revised" License

rsession-proxy on Ubuntu 20.04, 500: Internal Server Error #91

Closed vnijs closed 4 years ago

vnijs commented 4 years ago

I'm having trouble getting rsession-proxy to run with jupyter-server-proxy==1.5.0 in Docker. Using Python 3.8.2 on Ubuntu 20.04 and the code below, I see the following error when launching RStudio from JupyterLab. There are no problems with jupyter-shiny-proxy, and custom launchers still work fine as well (https://github.com/radiant-rstats/docker/blob/master/rsm-msba/jupyter_notebook_config.py). FYI, the whole setup worked fine on Ubuntu 18.04 with my (slightly) customized fork of the pre-split version of jupyter-rsession-proxy (https://github.com/vnijs/jupyter-rsession-proxy).

Any suggestions on how best to debug this would be wonderful. A Docker image is available as vnijs/rsm-msba, and Jupyter runs on port 8989.

docker run --rm -e JPASSWORD="" -p 8989:8989 vnijs/rsm-msba

RUN pip3 install jupyter-rsession-proxy \
  && jupyter labextension install @jupyterlab/server-proxy

[image: screenshot of the error]

vnijs commented 4 years ago

It seems this might be linked to the fact that I use supervisord to start jupyterlab. This was never an issue before but is now. Any guidance on what changed in jupyter (proxy) would be much appreciated.

[program:jupyterlab]
user=%(ENV_NB_USER)s
environment=HOME=/home/%(ENV_NB_USER)s,USER=%(ENV_NB_USER)s,SHELL=/bin/bash,PYTHONUSERBASE=%(ENV_PYBASE)s,JUPYTER_PATH=%(ENV_PYBASE)s/share/jupyter,JUPYTER_RUNTIME_DIR=/tmp/jupyter/runtime,JUPYTER_CONFIG_DIR=%(ENV_PYBASE)s/jupyter
command=/usr/local/bin/jupyter lab --ip=0.0.0.0 --port=8989 --allow-root --NotebookApp.token='%(ENV_JPASSWORD)s'
stdout_logfile=/var/log/supervisor/%(program_name)s.log
stderr_logfile=/var/log/supervisor/%(program_name)s.log
autorestart=false

vnijs commented 4 years ago

FYI Looks like it is possible to get jupyter-rsession-proxy to work with supervisord after all. Minimal example below.

I'll leave this open for now, if that is OK, in case any suggestions come in or I figure out how to get this to work again with the latest versions of JupyterLab and jupyter-rsession-proxy.

https://github.com/radiant-rstats/docker/tree/master/jupyterlab-rstudio

Dockerfile

FROM ubuntu:focal

USER root

RUN apt-get update -qq
RUN apt-get install -y --no-install-recommends \
    software-properties-common \
    gdebi-core \
    dirmngr \
    gpg-agent \
    supervisor \
    wget \
    curl \
    python3-pip

RUN echo "deb http://cloud.r-project.org/bin/linux/ubuntu focal-cran40/" >> /etc/apt/sources.list \
    && gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 \
    && gpg -a --export E298A3A825C0D65DFD57CBB651716619E084DAB9 | apt-key add - \
    && apt-get update

RUN apt install -y r-base

# ENV RSTUDIO_VERSION 1.2.5042
ENV RSTUDIO_VERSION 1.3.959

RUN wget --quiet https://download2.rstudio.org/server/bionic/amd64/rstudio-server-${RSTUDIO_VERSION}-amd64.deb
RUN gdebi -n rstudio-server-${RSTUDIO_VERSION}-amd64.deb

RUN apt-get clean && rm -rf /var/lib/apt/lists/*

RUN curl -sL https://deb.nodesource.com/setup_12.x | bash \
    && apt-get install -y nodejs \
    && npm install -g npm

RUN pip3 install jupyterlab==2.1.4
RUN pip3 install jupyter-server-proxy==1.5.0 jupyter-rsession-proxy==1.2 \
    && jupyter labextension install @jupyterlab/server-proxy

EXPOSE 8888

ENV NB_USER=jovyan
RUN useradd -m -s /bin/bash -N -u 1000 $NB_USER

COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
RUN mkdir -p /var/log/supervisor \
    && chown $NB_USER /var/log/supervisor

USER $NB_USER

## works
# CMD jupyter notebook --ip=0.0.0.0 --port=8888 --NotebookApp.token=

## also works
CMD ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]

supervisord.conf file

[supervisord]
nodaemon=true
logfile=/var/log/supervisor/supervisord.log
pidfile=/tmp/supervisord.pid

[program:rserver]
command=sudo /usr/lib/rstudio-server/bin/rserver
stdout_logfile=/var/log/supervisor/%(program_name)s.log
stderr_logfile=/var/log/supervisor/%(program_name)s.log
startsecs=0
autorestart=false

[program:jupyterlab]
user=%(ENV_NB_USER)s
environment=HOME=/home/%(ENV_NB_USER)s,USER=%(ENV_NB_USER)s,SHELL=/bin/bash
command=/usr/local/bin/jupyter lab --ip=0.0.0.0 --port=8888 --NotebookApp.token=
stdout_logfile=/var/log/supervisor/%(program_name)s.log
stderr_logfile=/var/log/supervisor/%(program_name)s.log
autorestart=false

ryanlovett commented 4 years ago

Hi @vnijs, is there a reason you don't want to let jupyter-rsession-proxy manage rserver? Then you could just have your container start lab directly, without supervisord.

vnijs commented 4 years ago

@ryanlovett Good question. The full image (vnijs/rsm-msba) contains multiple services (e.g., postgres), which is what I use supervisord for. In the example I posted, I can start RStudio from Jupyter or directly. When I start it directly, I can have multiple instances of RStudio running on different ports, which I believe isn't (currently) possible with jupyter-rsession-proxy.

FYI In a separate image (vnijs/rsm-jupyterhub) that we run on a server with jupyterhub we do let jupyter-rsession-proxy manage rserver.

My main issue is that I can't get any good "leads" on what Jupyter (in the vnijs/rsm-msba image) gets stuck on when it tries to start RStudio, so this is very difficult to debug. Is there a way to get jupyter-rsession-proxy to just print the actual errors directly on screen during development?

ryanlovett commented 4 years ago

Ah, makes sense.

Re: debugging, it depends on where the logging is going. I recall that, as an unprivileged user, I could cat or tail -f /proc/{pid}/fd/1 or /proc/{pid}/fd/2 of a process from inside the container if I couldn't otherwise access where the container was logging to. You can also modify the rserver command to add logging or, maybe easier, replace rserver with a wrapper script that redirects logs and invokes the real rserver binary.
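
As a rough illustration of the /proc approach, the sketch below resolves where a process's stdout and stderr currently point, which can help decide what to cat or tail. The helper name log_targets is hypothetical, and the sketch assumes a Linux /proc filesystem and permission to inspect the target process.

```python
import os


def log_targets(pid):
    """Return where a process's stdout (fd 1) and stderr (fd 2) point.

    Values are link targets such as a file path, 'pipe:[12345]' or
    '/dev/pts/0', or None when the link cannot be read.
    """
    targets = {}
    for name, fd in (("stdout", 1), ("stderr", 2)):
        link = f"/proc/{pid}/fd/{fd}"
        try:
            targets[name] = os.readlink(link)
        except OSError:
            # No /proc (non-Linux), process gone, or permission denied.
            targets[name] = None
    return targets
```

Calling log_targets(pid) with the PID of rserver or jupyter (for example, as reported by ps inside the container) shows whether output goes to a log file, a pipe, or a terminal.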

vnijs commented 4 years ago

rserver doesn't seem to have any command line arguments relevant to logging based on rserver --help. If you have any more suggestions on how I could customize the below to get some more insights I'd be most appreciative.

def _rstudio_command(port):
    return ["/usr/lib/rstudio-server/bin/rserver", "--www-port=" + str(port)]

vnijs commented 4 years ago

Made some progress. RStudio needs sudo to launch from supervisord (see supervisord.conf below). However, if NB_USER is added to sudoers, you can no longer start RStudio from Jupyter. With sudo removed from supervisord.conf, the login window still appears (user: jovyan, passwd: rsm_msba), but you can't log in and start RStudio.

cc-ing @cboettig in case he has any suggestions

Dockerfile:

FROM ubuntu:focal

USER root

RUN apt-get update -qq
RUN apt-get install -y --no-install-recommends \
    software-properties-common \
    gdebi-core \
    dirmngr \
    gpg-agent \
    supervisor \
    wget \
    curl \
    python3-pip

RUN echo "deb http://cloud.r-project.org/bin/linux/ubuntu focal-cran40/" >> /etc/apt/sources.list \
    && gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 \
    && gpg -a --export E298A3A825C0D65DFD57CBB651716619E084DAB9 | apt-key add - \
    && apt-get update

RUN apt install -y r-base

ENV RSTUDIO_VERSION 1.3.959

RUN wget --quiet https://download2.rstudio.org/server/bionic/amd64/rstudio-server-${RSTUDIO_VERSION}-amd64.deb
RUN gdebi -n rstudio-server-${RSTUDIO_VERSION}-amd64.deb

RUN apt-get clean && rm -rf /var/lib/apt/lists/*

RUN curl -sL https://deb.nodesource.com/setup_12.x | bash \
    && apt-get install -y nodejs \
    && npm install -g npm

RUN pip3 install jupyterlab==2.1.4
RUN pip3 install jupyter-server-proxy==1.5.0 jupyter-rsession-proxy==1.2 \
    && jupyter labextension install @jupyterlab/server-proxy

# EXPOSE 8888

ENV NB_USER=jovyan
RUN useradd -m -s /bin/bash -N -u 1000 $NB_USER
RUN echo "${NB_USER}:rsm_msba" | chpasswd 

# if you comment out the line below, Rstudio will not run on 8787 
# but Rstudio will run from jupyter
# if you leave it un-commented, Rstudio will run on 8787 but 
# not from jupyter
RUN adduser ${NB_USER} sudo && echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers

COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
RUN mkdir -p /var/log/supervisor \
    && chown $NB_USER /var/log/supervisor

USER $NB_USER

## works
# CMD jupyter notebook --ip=0.0.0.0 --port=8888 --NotebookApp.token=

## either Rstudio on 8787 works or Rstudio can launch from Jupyter but not both
CMD ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]

supervisord.conf:

[supervisord]
nodaemon=true
logfile=/var/log/supervisor/supervisord.log
pidfile=/tmp/supervisord.pid

[program:rserver]
command=sudo /usr/lib/rstudio-server/bin/rserver
stdout_logfile=/var/log/supervisor/%(program_name)s.log
stderr_logfile=/var/log/supervisor/%(program_name)s.log
startsecs=0
autorestart=false

[program:jupyterlab]
user=%(ENV_NB_USER)s
environment=HOME=/home/%(ENV_NB_USER)s,USER=%(ENV_NB_USER)s,SHELL=/bin/bash
command=/usr/local/bin/jupyter lab --ip=0.0.0.0 --port=8888 --NotebookApp.token=
stdout_logfile=/var/log/supervisor/%(program_name)s.log
stderr_logfile=/var/log/supervisor/%(program_name)s.log
autorestart=false

ryanlovett commented 4 years ago

@vnijs Regarding debugging, you could have a wrapper script, rserver.sh, that does something like:

#!/bin/sh
exec /path/to/real/rserver "$@" > /path/to/outfile.log 2>&1

and invoke rserver.sh from your custom command. You could also insert strace or some other debugging tool into the script, assuming the requisite tracing settings are enabled in your docker run.

zeehio commented 4 years ago

The "auth revocation list" in the rstudio server needs to be writeable with 600 permissions and under a directory controlled by the user running rserver. Otherwise rserver exits and you get the internal server error.

I found this by going to my Jupyter notebook, opening a terminal, and running the /usr/lib/rstudio-server/bin/rserver command, both with and without sudo, and then checking the directories mentioned in the output.

This solved the issue for me:

RUN echo "auth-revocation-list-dir=/tmp/rstudio-server-revocation-list/" >> /etc/rstudio/rserver.conf

I guess something similar may work for you.

I took the idea from https://github.com/rstudio/rstudio/commit/1678627fd8649d11b8db0017bac35ddcc2ea31e5
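
To check whether a candidate directory meets the ownership and write requirements described above, a small helper along these lines can be run as the user that launches rserver. The function name check_revocation_dir is hypothetical and is not part of RStudio or jupyter-rsession-proxy; it only inspects the directory and changes nothing.

```python
import os
import stat


def check_revocation_dir(path):
    """Report whether `path` looks usable by the current user."""
    st = os.stat(path)
    return {
        "is_dir": stat.S_ISDIR(st.st_mode),
        # The directory should be controlled by the user running rserver.
        "owned_by_me": st.st_uid == os.getuid(),
        # Write plus search permission is needed to create files inside it.
        "writable": os.access(path, os.W_OK | os.X_OK),
        "mode": oct(stat.S_IMODE(st.st_mode)),
    }
```

For example, check_revocation_dir("/tmp/rstudio-server-revocation-list/") would show whether the directory configured through auth-revocation-list-dir is owned by and writable for the current user.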

vnijs commented 4 years ago

Thanks for the suggestion @zeehio. I think it is a permission issue ... somewhere. After lots of trial and error, however, I decided to stick to only launching RStudio from Jupyter. It would still be nice to have both options working, but I can't find sufficient information in the logs etc. to determine where the real problem is and how to fix it.

@ryanlovett Thanks also for your time and input and for working on rsession-proxy!