2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License

utoronto nbgitpuller in rstudio tries to pull to /home/jovyan which is ro #2559

Closed pnasrat closed 1 year ago

pnasrat commented 1 year ago

Ref: https://2i2c.freshdesk.com/a/tickets/726

nbgitpuller doesn’t seem to be working on https://r.datatools.utoronto.ca

For example, trying to pull https://github.com/ntaback/learnr_test2 into the hub with

https://r.datatools.utoronto.ca/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fntaback%2Flearnr_test2&urlpath=rstudio%2F&branch=main

Throws the error …

rstudio@jupyter-pris-2enasrat-40utoronto-2eca:~$ pwd
/home/rstudio
rstudio@jupyter-pris-2enasrat-40utoronto-2eca:~$ ls /home/jovyan/
shared
rstudio@jupyter-pris-2enasrat-40utoronto-2eca:~$ id
uid=1000(rstudio) gid=1000(rstudio) groups=1000(rstudio),50(staff),100(users)
rstudio@jupyter-pris-2enasrat-40utoronto-2eca:~$ cd /home/jovyan/
rstudio@jupyter-pris-2enasrat-40utoronto-2eca:/home/jovyan$ ls -al
total 9
drwxr-xr-x 3 root    root    4096 May 19 14:40 .
drwxr-xr-x 1 root    root    4096 May 19 14:40 ..
drwxr-xr-x 2 rstudio rstudio   64 Dec 13 04:51 shared
rstudio@jupyter-pris-2enasrat-40utoronto-2eca:/home/jovyan$ 

This can be worked around by forcing a targetPath: https://r.datatools.utoronto.ca/user/pris.nasrat@utoronto.ca/git-pull?repo=https%3A%2F%2Fgithub.com%2Fntaback%2Flearnr_test2.git&urlpath=rstudio%2F&branch=main&redirects=1&targetPath=/home/rstudio/foo
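
For anyone who needs to generate such links in bulk, here is a minimal sketch of building the git-pull URL with an optional targetPath. The function name `build_git_pull_url` and its parameters are hypothetical; only the query parameter names (`repo`, `urlpath`, `branch`, `targetPath`) and the `/hub/user-redirect/git-pull` path are taken from the URLs above.

```python
from urllib.parse import urlencode


def build_git_pull_url(hub_url, repo, branch, urlpath, target_path=None):
    """Build an nbgitpuller link of the shape used in this issue.

    Hypothetical helper: only the query parameter names come from the
    URLs quoted in the issue.
    """
    params = {"repo": repo, "urlpath": urlpath, "branch": branch}
    if target_path is not None:
        # Workaround from this issue: point the clone at a writable directory.
        params["targetPath"] = target_path
    return f"{hub_url.rstrip('/')}/hub/user-redirect/git-pull?{urlencode(params)}"


print(
    build_git_pull_url(
        "https://r.datatools.utoronto.ca",
        "https://github.com/ntaback/learnr_test2",
        branch="main",
        urlpath="rstudio/",
        target_path="/home/rstudio/foo",
    )
)
```

Note that `urlencode` percent-encodes the slashes in targetPath, which is equivalent to the unencoded form used in the workaround link.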

pnasrat commented 1 year ago

Possibly related to jupyterhub/nbgitpuller#268

pnasrat commented 1 year ago

nbgitpuller.json

{
  "NotebookApp": {
    "nbserver_extensions": {
      "nbgitpuller": true
    }
  }
}

jupyter serverextension list
config dir: /srv/conda/etc/jupyter
    jupyter_server_proxy  enabled 
    - Validating...
      jupyter_server_proxy  OK
    jupyter_resource_usage  enabled 
    - Validating...
      jupyter_resource_usage 0.7.1 OK
    jupyter_server_ydoc  enabled 
    - Validating...
      X is jupyter_server_ydoc importable?
    jupyterlab  enabled 
    - Validating...
      jupyterlab 3.6.3 OK
    nbgitpuller  enabled 
    - Validating...
      nbgitpuller 1.1.1 OK
    retrolab  enabled 
    - Validating...
      retrolab 0.3.21 OK

On the user server

jupyter --paths
config:
    /home/rstudio/.jupyter
    /home/rstudio/.local/etc/jupyter
    /srv/conda/etc/jupyter
    /usr/local/etc/jupyter
    /etc/jupyter
data:
    /home/rstudio/.local/share/jupyter
    /srv/conda/share/jupyter
    /usr/local/share/jupyter
    /usr/share/jupyter
runtime:
    /home/rstudio/.local/share/jupyter/runtime

pnasrat commented 1 year ago

rstudio@jupyter-pris-2enasrat-40utoronto-2eca:~$ strings /proc/1/environ  
PATH=/srv/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/texlive/bin/linux
HOSTNAME=jupyter-pris-2enasrat-40utoronto-2eca
R_VERSION=4.2.2
R_HOME=/usr/local/lib/R
TZ=Etc/UTC
CRAN=https://packagemanager.posit.co/cran/__linux__/jammy/2023-03-14
LANG=en_US.UTF-8
S6_VERSION=v2.1.0.2
RSTUDIO_VERSION=2023.03.0+386
DEFAULT_USER=rstudio
PANDOC_VERSION=default
QUARTO_VERSION=default
CTAN_REPO=https://www.texlive.info/tlnet-archive/2023/03/14/tlnet
CONDA_DIR=/srv/conda
MAMBAFORGE_VERSION=22.9.0-2
SHELL=/bin/bash
JUPYTERHUB_OAUTH_SCOPES=["access:servers!server=pris.nasrat@utoronto.ca/", "access:servers!user=pris.nasrat@utoronto.ca"]
JUPYTERHUB_SERVICE_URL=http://0.0.0.0:8888/user/pris.nasrat@utoronto.ca/
CPU_GUARANTEE=0.01
JPY_API_TOKEN=53026d55722740df8d9fab108c56a0a7
JUPYTERHUB_API_URL=http://hub:8081/hub/api
JUPYTERHUB_DEFAULT_URL=/rstudio
JUPYTERHUB_OAUTH_ACCESS_SCOPES=["access:servers!server=pris.nasrat@utoronto.ca/", "access:servers!user=pris.nasrat@utoronto.ca"]
JUPYTERHUB_OAUTH_CLIENT_ALLOWED_SCOPES=[]
JUPYTER_RUNTIME_DIR=/tmp/.jupyter-runtime
MEM_GUARANTEE=1073741824
MEM_LIMIT=2147483648
JUPYTER_IMAGE=quay.io/2i2c/utoronto-r-image:c8cb3c4f3c31
JUPYTER_IMAGE_SPEC=quay.io/2i2c/utoronto-r-image:c8cb3c4f3c31
JUPYTERHUB_ACTIVITY_URL=http://hub:8081/hub/api/users/pris.nasrat@utoronto.ca/activity
JUPYTERHUB_API_TOKEN=53026d55722740df8d9fab108c56a0a7
JUPYTERHUB_OAUTH_CALLBACK_URL=/user/pris.nasrat@utoronto.ca/oauth_callback
JUPYTERHUB_SERVER_NAME=
JUPYTERHUB_SERVICE_PREFIX=/user/pris.nasrat@utoronto.ca/
JUPYTERHUB_USER=pris.nasrat@utoronto.ca
JUPYTERHUB_BASE_URL=/
CPU_LIMIT=4.0
JUPYTERHUB_ADMIN_ACCESS=1
JUPYTERHUB_CLIENT_ID=jupyterhub-user-pris.nasrat%40utoronto.ca
JUPYTERHUB_HOST=
PROXY_API_SERVICE_PORT=8001
PROXY_API_PORT_8001_TCP_PROTO=tcp
HUB_SERVICE_HOST=10.0.94.20
PROXY_PUBLIC_PORT_80_TCP=tcp://10.0.191.46:80
PROXY_PUBLIC_PORT_80_TCP_PORT=80
PROXY_API_PORT=tcp://10.0.209.206:8001
PROXY_API_PORT_8001_TCP_PORT=8001
PROXY_PUBLIC_PORT_80_TCP_ADDR=10.0.191.46
CONFIGURATOR_PORT_10101_TCP=tcp://10.0.240.63:10101
PROXY_PUBLIC_PORT_80_TCP_PROTO=tcp
KUBERNETES_SERVICE_PORT=443
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT_443_TCP=tcp://10.0.0.1:443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_PORT_443_TCP_ADDR=10.0.0.1
PROXY_API_PORT_8001_TCP_ADDR=10.0.209.206
HUB_PORT_8081_TCP=tcp://10.0.94.20:8081
PROXY_PUBLIC_SERVICE_HOST=10.0.191.46
PROXY_PUBLIC_SERVICE_PORT=80
PROXY_PUBLIC_PORT=tcp://10.0.191.46:80
KUBERNETES_SERVICE_HOST=10.0.0.1
PROXY_API_SERVICE_HOST=10.0.209.206
CONFIGURATOR_PORT_10101_TCP_PORT=10101
PROXY_API_PORT_8001_TCP=tcp://10.0.209.206:8001
CONFIGURATOR_SERVICE_PORT=10101
CONFIGURATOR_PORT=tcp://10.0.240.63:10101
HUB_PORT_8081_TCP_PROTO=tcp
KUBERNETES_PORT=tcp://10.0.0.1:443
KUBERNETES_PORT_443_TCP_PORT=443
CONFIGURATOR_SERVICE_HOST=10.0.240.63
CONFIGURATOR_PORT_10101_TCP_PROTO=tcp
CONFIGURATOR_PORT_10101_TCP_ADDR=10.0.240.63
HUB_PORT_8081_TCP_ADDR=10.0.94.20
PROXY_PUBLIC_SERVICE_PORT_HTTP=80
CONFIGURATOR_SERVICE_PORT_HTTP=10101
HUB_SERVICE_PORT=8081
HUB_SERVICE_PORT_HUB=8081
HUB_PORT=tcp://10.0.94.20:8081
HUB_PORT_8081_TCP_PORT=8081
HOME=/home/rstudio
rstudio@jupyter-pris-2enasrat-40utoronto-2eca:~$ strings /proc/1/cmdline 
/srv/conda/bin/python3.10
/srv/conda/bin/jupyterhub-singleuser

pnasrat commented 1 year ago

            # The default working directory is the directory from which Jupyter
            # server is launched, which is not the same as the root notebook
            # directory assuming either --notebook-dir= is used from the
            # command line or c.NotebookApp.notebook_dir is set in the jupyter
            # configuration. This line assures that all repos are cloned
            # relative to server_root_dir/<optional NBGITPULLER_PARENTPATH>,
            # so that all repos are always in scope after cloning. Sometimes
            # server_root_dir will include things like `~` and so the path
            # must be expanded.
            repo_parent_dir = os.path.join(os.path.expanduser(self.settings['server_root_dir']),
                                           os.getenv('NBGITPULLER_PARENTPATH', ''))
            repo_dir = os.path.join(repo_parent_dir, self.get_argument('targetpath', repo.split('/')[-1]))
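
To make the effect of those lines concrete, here is a small standalone sketch of the same path logic. The function name `resolve_repo_dir` is hypothetical, and the assumption that `server_root_dir` ends up as /home/jovyan on this hub is a guess based on the listing above, not something confirmed here. `posixpath` is used instead of `os.path` only so the result is the same on any platform.

```python
import posixpath


def resolve_repo_dir(server_root_dir, repo, targetpath=None, parent_path=""):
    """Mirror the nbgitpuller snippet above as a pure function (hypothetical helper)."""
    repo_parent_dir = posixpath.join(posixpath.expanduser(server_root_dir), parent_path)
    # Without an explicit target, the last URL segment becomes the directory name.
    return posixpath.join(repo_parent_dir, targetpath or repo.split("/")[-1])


repo = "https://github.com/ntaback/learnr_test2"

# Assumed server_root_dir; the clone would land in a directory rstudio cannot write to.
print(resolve_repo_dir("/home/jovyan", repo))

# An absolute target path makes join() discard the parent directory entirely,
# which is why the targetPath workaround escapes /home/jovyan.
print(resolve_repo_dir("/home/jovyan", repo, targetpath="/home/rstudio/foo"))
```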

pnasrat commented 1 year ago

Hmm, OK, so the cwd of PID 1, which is jupyterhub-singleuser, is /home/jovyan. So maybe we are not setting notebook_dir or server_root_dir correctly here.

ls -al /proc/1/cwd
lrwxrwxrwx 1 rstudio rstudio 0 May 19 15:19 /proc/1/cwd -> /home/jovyan
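
The same check can be scripted. This is a minimal sketch, assuming a Linux /proc filesystem and permission to read the target process's cwd link (true here, since PID 1 runs as the same user); the function name `process_cwd` is hypothetical.

```python
import os


def process_cwd(pid):
    """Return (cwd, writable) for a process, read from /proc (Linux only)."""
    cwd = os.readlink(f"/proc/{pid}/cwd")
    # os.access answers for the *current* user, which is the user that
    # nbgitpuller would clone as when run inside the user server.
    return cwd, os.access(cwd, os.W_OK)


if __name__ == "__main__":
    # Inside the user pod, pass 1 to inspect jupyterhub-singleuser itself.
    print(process_cwd(os.getpid()))
```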

pnasrat commented 1 year ago

    volumeMounts:
    - mountPath: /home/rstudio
      name: home
      subPath: pris-2enasrat-40utoronto-2eca
    - mountPath: /etc/gitconfig
      name: files
      subPath: gitconfig
    - mountPath: /etc/github/github-app-private-key.pem
      name: files
      subPath: github-app-private-key.pem
    - mountPath: /usr/local/etc/ipython/ipython_kernel_config.json
      name: files
      subPath: ipython_kernel_config.json
    - mountPath: /usr/local/etc/jupyter/jupyter_notebook_config.json
      name: files
      subPath: jupyter_notebook_config.json
    - mountPath: /usr/local/etc/jupyter/jupyter_server_config.json
      name: files
      subPath: jupyter_server_config.json
    - mountPath: /home/jovyan/shared
      name: home
      readOnly: true
      subPath: _shared
    workingDir: /home/jovyan
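
Reading the spec above: workingDir is /home/jovyan, but the only mounts under /home are /home/rstudio (writable) and /home/jovyan/shared (read-only), so /home/jovyan itself comes from the image and is owned by root. A rough sketch of checking that mechanically follows; the helper `covering_mount` is hypothetical, and the mount list is abbreviated from the pod spec above.

```python
import posixpath


def covering_mount(working_dir, volume_mounts):
    """Return the most specific mount containing working_dir, or None."""
    wd = posixpath.normpath(working_dir)
    best = None
    for mount in volume_mounts:
        mp = posixpath.normpath(mount["mountPath"])
        inside = wd == mp or wd.startswith(mp.rstrip("/") + "/")
        if inside and (best is None or len(mp) > len(posixpath.normpath(best["mountPath"]))):
            best = mount
    return best


# Abbreviated from the pod spec in this comment.
mounts = [
    {"mountPath": "/home/rstudio", "name": "home"},
    {"mountPath": "/home/jovyan/shared", "name": "home", "readOnly": True},
]

# None: the working directory is not backed by any volume.
print(covering_mount("/home/jovyan", mounts))
# The writable home volume.
print(covering_mount("/home/rstudio", mounts))
```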

yuvipanda commented 1 year ago

@pnasrat yep, https://github.com/2i2c-org/infrastructure/issues/2559#issuecomment-1554766238 is the root of the problem. I think we should modify r-common to set workingDir to /home/rstudio. I also felt some déjà vu with this problem, and behold, I had encountered it at UC Berkeley a few years ago and fixed it the exact same way: https://github.com/berkeley-dsep-infra/datahub/blob/70447314413e15f41b1e870badae11acaff340c4/deployments/stat20/config/common.yaml#L25

We should document this somewhere, but I'm not sure where.

pnasrat commented 1 year ago

Deployed the fix to r-staging and tested via

https://r-staging.datatools.utoronto.ca/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fntaback%2Flearnr_test2&urlpath=rstudio%2F&branch=main