jupyterhub / batchspawner

Custom Spawner for Jupyterhub to start servers in batch scheduled systems
BSD 3-Clause "New" or "Revised" License

JupyterHub not able to talk to notebook running on the compute node (SLURM) #233

Open pelacables opened 2 years ago

pelacables commented 2 years ago

Bug description

When I start a server, the job gets scheduled and starts successfully, but the hub cannot talk to it.

Expected behaviour

The hub knows about the server running in the node and the server starts.

Actual behaviour

From the job logs:

[I 2022-03-31 15:53:10.848 SingleUserLabApp serverapp:2674] Serving notebooks from local directory: /home/bria
[I 2022-03-31 15:53:10.848 SingleUserLabApp serverapp:2674] Jupyter Server 1.15.6 is running at:
[I 2022-03-31 15:53:10.848 SingleUserLabApp serverapp:2674] http://computenode.domain.com:59640/user/arnau.bria@domain.com/lab
[I 2022-03-31 15:53:10.849 SingleUserLabApp serverapp:2674]  or http://127.0.0.1:59640/user/arnau.bria@domain.com/lab
[I 2022-03-31 15:53:10.849 SingleUserLabApp serverapp:2675] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2022-03-31 15:53:10.857 SingleUserLabApp mixins:596] Updating Hub with activity every 300 seconds
[D 2022-03-31 15:53:10.857 SingleUserLabApp mixins:558] Notifying Hub of activity 2022-03-31T13:53:10.600560Z

From the hub logs:

Mar 31 15:53:02 hubserver jupyterhub.service: [D 2022-03-31 15:53:02.528 JupyterHub batchspawner:288] Spawner querying job: sudo -E -u arnau.bria@domain.com squeue -h -j 14270408 -o '%T %B'
Mar 31 15:53:02 hubserver systemd: Created slice User Slice of bria.
Mar 31 15:53:02 hubserver systemd: Starting User Slice of bria.
Mar 31 15:53:02 hubserver systemd: Started Session c2756 of user bria.
Mar 31 15:53:02 hubserver systemd: Starting Session c2756 of user bria.
Mar 31 15:53:02 hubserver systemd: Removed slice User Slice of bria.
Mar 31 15:53:02 hubserver systemd: Stopping User Slice of bria.
Mar 31 15:53:06 hubserver jupyterhub.service: [D 2022-03-31 15:53:06.512 JupyterHub base:281] Recording first activity for <APIToken('b07c...', user='arnau.bria@domain.com', client_id='jupyterhub')>
Mar 31 15:53:06 hubserver jupyterhub.service: [D 2022-03-31 15:53:06.517 JupyterHub scopes:301] Authenticated with token <APIToken('b07c...', user='arnau.bria@domain.com', client_id='jupyterhub')>
Mar 31 15:53:06 hubserver jupyterhub.service: [I 2022-03-31 15:53:06.520 JupyterHub log:189] 200 POST /hub/api/batchspawner (arnau.bria@domain.com@148.191.97.103) 10.94ms
Mar 31 15:53:10 hubserver jupyterhub.service: [I 2022-03-31 15:53:10.848 JupyterHub log:189] 200 GET /hub/api (@148.191.97.103) 0.85ms
Mar 31 15:53:10 hubserver jupyterhub.service: [D 2022-03-31 15:53:10.861 JupyterHub scopes:301] Authenticated with token <APIToken('b07c...', user='arnau.bria@domain.com', client_id='jupyterhub')>
Mar 31 15:53:10 hubserver jupyterhub.service: [D 2022-03-31 15:53:10.862 JupyterHub scopes:491] Checking access via scope users:activity
Mar 31 15:53:10 hubserver jupyterhub.service: [D 2022-03-31 15:53:10.862 JupyterHub scopes:402] Argument-based access to /hub/api/users/arnau.bria@domain.com/activity via users:activity
Mar 31 15:53:10 hubserver jupyterhub.service: [D 2022-03-31 15:53:10.864 JupyterHub users:855] Activity for user arnau.bria@domain.com: 2022-03-31T13:53:10.600560Z
Mar 31 15:53:10 hubserver jupyterhub.service: [D 2022-03-31 15:53:10.864 JupyterHub users:873] Activity on server arnau.bria@domain.com/: 2022-03-31T13:53:10.600560Z
Mar 31 15:53:10 hubserver jupyterhub.service: [I 2022-03-31 15:53:10.870 JupyterHub log:189] 200 POST /hub/api/users/arnau.bria@domain.com/activity (arnau.bria@domain.com@148.191.97.103) 11.01ms
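For context, the `squeue -h -j ... -o '%T %B'` query in the hub log prints the job state followed by the execution host, and the hub then has to turn that host into an address it can reach. The sketch below shows how a template such as the `state_exechost_exp = r'\1.domain.com'` setting in the configuration would expand a captured short hostname. The regular expression `EXECHOST_RE` and the helper `expand_exec_host` are assumptions made for illustration; they are not taken from batchspawner and the spawner's real pattern may differ.

```python
import re

# Assumed pattern for illustration only: whitespace, then a hostname made of
# word characters, dashes and dots at the end of the line.
EXECHOST_RE = re.compile(r"\s+((?:[\w_-]+\.?)+)$")


def expand_exec_host(squeue_line: str, template: str) -> str:
    """Pull the host out of a '<STATE> <HOST>' line and expand the template.

    `template` uses regex backreferences, e.g. r'\\1.domain.com'.
    """
    match = EXECHOST_RE.search(squeue_line)
    if match is None:
        raise ValueError(f"no host found in {squeue_line!r}")
    # Match.expand substitutes \1 with the first capture group
    return match.expand(template)


# Hypothetical squeue output for a running job
print(expand_exec_host("RUNNING computenode", r"\1.domain.com"))
```

Under these assumptions the template is applied blindly: if the scheduler already returned a fully qualified name, the suffix would be appended a second time, so it may be worth comparing the expanded name with the hostname the single-user server reports in the job log.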

How to reproduce

Request a job using the batchspawner.

Your personal set up

alembic 1.7.7
anyio 0.0.0
argon2-cffi 21.3.0
argon2-cffi-bindings 0.0.0
asttokens 2.0.5
async-generator 1.10
attrs 21.4.0
Babel 2.9.1
backcall 0.2.0
batchspawner 1.1.0
beautifulsoup4 4.10.0
bleach 4.1.0
certifi 2021.10.8
certipy 0.1.3
cffi 1.15.0
charset-normalizer 2.0.12
colorama 0.4.4
cryptography 36.0.2
debugpy 1.5.1
decorator 5.1.1
defusedxml 0.7.1
deprecation 2.1.0
entrypoints 0.4
executing 0.0.0
flit_core 3.6.0
gitdb 4.0.9
GitPython 3.1.27
greenlet 1.1.2
idna 3.3
iniconfig 0.0.0
ipykernel 6.9.2
ipython 8.1.1
ipython-genutils 0.2.0
ipywidgets 7.6.5
jedi 0.18.1
Jinja2 3.0.3
json5 0.9.6
jsonschema 0.0.0
jupyter-client 7.1.2
jupyter-core 4.9.2
jupyter-packaging 0.11.1
jupyter-resource-usage 0.6.1
jupyter-server 1.15.6
jupyter-server-mathjax 0.2.3
jupyter-telemetry 0.1.0
jupyterhub 2.2.2
jupyterlab 3.3.2
jupyterlab-git 0.34.2
jupyterlab-launcher 0.13.1
jupyterlab-pygments 0.1.2
jupyterlab-server 2.11.2
jupyterlab-widgets 1.0.2
Mako 1.2.0
MarkupSafe 2.1.1
matplotlib-inline 0.1.3
mistune 0.8.4
nbclassic 0.3.7
nbclient 0.5.13
nbconvert 6.4.4
nbdime 3.1.1
nbformat 5.2.0
nest-asyncio 1.5.4
notebook 6.4.10
notebook-shim 0.1.0
oauthenticator 14.2.0
oauthlib 3.2.0
packaging 21.3
pamela 1.0.0
pandocfilters 1.5.0
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
pip 22.0.4
pluggy 0.0.0
poetry 1.1.13
poetry-core 1.0.8
prometheus-client 0.13.1
prompt-toolkit 3.0.28
psutil 5.9.0
ptyprocess 0.7.0
pure-eval 0.0.0
py 1.10.0
pycparser 2.21
Pygments 2.11.2
PyJWT 2.3.0
pyOpenSSL 22.0.0
pyparsing 3.0.7
pyrsistent 0.18.1
pytest 0.0.0
python-dateutil 0.0.0
python-json-logger 2.0.2
pytz 2022.1
pyzmq 22.3.0
requests 2.27.1
ruamel.yaml 0.17.21
ruamel.yaml.clib 0.2.6
semantic-version 2.9.0
Send2Trash 1.8.0
setuptools 60.10.0
setuptools-rust 1.1.2
six 1.16.0
smmap 3.0.1
sniffio 1.2.0
soupsieve 2.3.1
SQLAlchemy 1.4.32
stack-data 0.0.0
terminado 0.13.3
testpath 0.6.0
toml 0.10.2
tomlkit 0.10.0
tornado 6.1
traitlets 5.1.1
typing-extensions 3.10.0.2
urllib3 1.26.9
wcwidth 0.2.5
webencodings 0.5.1
websocket-client 1.3.1
wheel 0.37.1
wrapspawner 1.0.1


- <details><summary>Configuration</summary>

```python
# Configuration file for jupyterhub.
c.Spawner.env_keep = [ 'PATH',
'PYTHONPATH',
'CONDA_ROOT',
'CONDA_DEFAULT_ENV',
'VIRTUAL_ENV',
'LANG',
'LC_ALL',
'R_LIBS_SITE',
'LD_LIBRARY_PATH',
'LIBRARY_PATH',
'SSL_CERT_DIR',
'CA_BUNDLE',
'SSL_CERT_FILE',
'JUPYTERLAB_DIR',
'JUPYTER_CONFIG_PATH',
'JUPYTER_CONFIG_DIR',
'JUPYTERHUB_API_URL',
'JUPYTERHUB_BASE_URL',
'JUPYTERHUB_CLIENT_ID',
'JUPYTERHUB_OAUTH_CALLBACK_URL',
'JUPYTERHUB_SERVER_NAME',
'JUPYTERHUB_SERVICE_PREFIX',
'JUPYTERHUB_USER'
]

# Uncomment if you want to debug
c.Spawner.args = ['--debug' ]
c.Application.log_level = 'DEBUG'
c.ConfigurableHTTPProxy.debug = True

# Configurable proxy
c.JupyterHub.proxy_class = 'jupyterhub.proxy.ConfigurableHTTPProxy'
c.JupyterHub.ssl_cert = '/etc/ssl/hubserver.domain.com.cert'
c.JupyterHub.ssl_key = '/etc/ssl/hubserver.domain.com.key'

# OAuth
c.PAMAuthenticator.open_sessions = False
import os
from oauthenticator.azuread import AzureAdOAuthenticator
c.JupyterHub.authenticator_class = AzureAdOAuthenticator

c.AzureAdOAuthenticator.tenant_id = os.environ.get('AAD_TENANT_ID')

c.AzureAdOAuthenticator.oauth_callback_url = 'https://inh-bio-jupyterhub-test.domain.com/hub/oauth_callback'
c.AzureAdOAuthenticator.client_id = '123'
c.AzureAdOAuthenticator.client_secret = 'secret'
c.AzureAdOAuthenticator.scope = ['openid']
c.AzureAdOAuthenticator.username_claim = 'preferred_username'

# Multiple servers
c.JupyterHub.allow_named_servers = True
c.JupyterHub.named_server_limit_per_user = 5

##BatchSpawner
#
# SLURM: this is needed, otherwise it does not use the FQDN
c.SlurmSpawner.state_exechost_exp = r'\1.domain.com'
import batchspawner
c.Spawner.start_timeout=300
c.JupyterHub.bind_url = 'https://inh-bio-jupyterhub-test.domain.com/'
#c.JupyterHub.default_url = 'home'
c.JupyterHub.hub_ip = 'inh-bio-jupyterhub-test.domain.com'
#
c.JupyterHub.spawner_class = 'wrapspawner.ProfilesSpawner'
c.Spawner.http_timeout = 300
c.ProfilesSpawner.profiles = [
   ("Local server", 'local', 'jupyterhub.spawner.LocalProcessSpawner', {'ip':'0.0.0.0'} ),
   ('Interactive Cluster - 2 cores, 4 GB, 8 hours', 'lab', 'batchspawner.SlurmSpawner',
     dict(req_nprocs='2', req_partition='cpu',req_prologue='ml load Python/3.9.5-GCCcore-10.3.0-jupyter',req_keepvars='ALL')),
   ('Interactive Cluster - 12 cores, 48 GB, 8 hours', 'lab', 'batchspawner.SlurmSpawner',
     dict(req_nprocs='12', req_partition='cpu',req_prologue='ml load Python/3.9.5-GCCcore-10.3.0-jupyter',req_keepvars='ALL')),
]
```

</details>

Hub logs (148.191.97.103 resolves to computenode).

Mar 31 15:52:49 hubserver jupyterhub.service: [D 2022-03-31 15:52:49.103 JupyterHub scopes:491] Checking access via scope servers
Mar 31 15:52:49 hubserver jupyterhub.service: [D 2022-03-31 15:52:49.103 JupyterHub scopes:402] Argument-based access to /hub/spawn via servers
Mar 31 15:52:49 hubserver jupyterhub.service: [D 2022-03-31 15:52:49.104 JupyterHub pages:255] Triggering spawn with supplied form options for arnau.bria@domain.com
Mar 31 15:52:49 hubserver jupyterhub.service: [D 2022-03-31 15:52:49.104 JupyterHub base:931] Initiating spawn for arnau.bria@domain.com
Mar 31 15:52:49 hubserver jupyterhub.service: [D 2022-03-31 15:52:49.104 JupyterHub base:935] 0/100 concurrent spawns
Mar 31 15:52:49 hubserver jupyterhub.service: [D 2022-03-31 15:52:49.104 JupyterHub base:940] 0 active servers
Mar 31 15:52:49 hubserver jupyterhub.service: [D 2022-03-31 15:52:49.107 JupyterHub roles:477] Checking token permissions against requested role server
Mar 31 15:52:49 hubserver jupyterhub.service: [I 2022-03-31 15:52:49.108 JupyterHub roles:482] Adding role server to token: <APIToken('b07c...', user='arnau.bria@domain.com', client_id='jupyterhub')>
Mar 31 15:52:49 hubserver jupyterhub.service: [I 2022-03-31 15:52:49.120 JupyterHub provider:607] Creating oauth client jupyterhub-user-arnau.bria%40domain.com
Mar 31 15:52:49 hubserver jupyterhub.service: [D 2022-03-31 15:52:49.136 JupyterHub user:728] Calling Spawner.start for arnau.bria@domain.com
Mar 31 15:52:49 hubserver jupyterhub.service: [W 2022-03-31 15:52:49.138 JupyterHub spawner:230] Setting Spawner.server for {self._log_name} with no underlying orm_spawner
Mar 31 15:52:49 hubserver jupyterhub.service: [I 2022-03-31 15:52:49.149 JupyterHub batchspawner:262] Spawner submitting job using sudo -E -u arnau.bria@domain.com sbatch --parsable
Mar 31 15:52:49 hubserver jupyterhub.service: [I 2022-03-31 15:52:49.149 JupyterHub batchspawner:263] Spawner submitted script:
Mar 31 15:52:49 hubserver jupyterhub.service: #!/bin/bash
Mar 31 15:52:49 hubserver jupyterhub.service: #SBATCH --output=/home/bria/jupyterhub_slurmspawner_%j.log
Mar 31 15:52:49 hubserver jupyterhub.service: #SBATCH --job-name=spawner-jupyterhub
Mar 31 15:52:49 hubserver jupyterhub.service: #SBATCH --chdir=/home/bria
Mar 31 15:52:49 hubserver jupyterhub.service: #SBATCH --export=ALL
Mar 31 15:52:49 hubserver jupyterhub.service: #SBATCH --get-user-env=L
Mar 31 15:52:49 hubserver jupyterhub.service: #SBATCH --partition=cpu
Mar 31 15:52:49 hubserver jupyterhub.service: #SBATCH --cpus-per-task=2
Mar 31 15:52:49 hubserver jupyterhub.service: set -euo pipefail
Mar 31 15:52:49 hubserver jupyterhub.service: trap 'echo SIGTERM received' TERM
Mar 31 15:52:49 hubserver jupyterhub.service: ml load Python/3.9.5-GCCcore-10.3.0-jupyter
Mar 31 15:52:49 hubserver jupyterhub.service: which jupyterhub-singleuser
Mar 31 15:52:49 hubserver jupyterhub.service: srun batchspawner-singleuser jupyterhub-singleuser --debug
Mar 31 15:52:49 hubserver jupyterhub.service: echo "jupyterhub-singleuser ended gracefully"
Mar 31 15:52:49 hubserver systemd: Created slice User Slice of bria.
Mar 31 15:52:49 hubserver systemd: Starting User Slice of bria.
Mar 31 15:52:49 hubserver systemd: Started Session c2731 of user bria.
Mar 31 15:52:49 hubserver systemd: Starting Session c2731 of user bria.
Mar 31 15:52:49 hubserver jupyterhub.service: [I 2022-03-31 15:52:49.240 JupyterHub batchspawner:266] Job submitted. cmd: sudo -E -u arnau.bria@domain.com sbatch --parsable output: 14270408
Mar 31 15:52:49 hubserver jupyterhub.service: [D 2022-03-31 15:52:49.242 JupyterHub batchspawner:288] Spawner querying job: sudo -E -u arnau.bria@domain.com squeue -h -j 14270408 -o '%T %B'
Mar 31 15:52:49 hubserver systemd: Removed slice User Slice of bria.
Mar 31 15:52:49 hubserver systemd: Stopping User Slice of bria.
Mar 31 15:52:49 hubserver systemd: Created slice User Slice of bria.
Mar 31 15:52:49 hubserver systemd: Starting User Slice of bria.
Mar 31 15:52:49 hubserver systemd: Started Session c2732 of user bria.
Mar 31 15:52:49 hubserver systemd: Starting Session c2732 of user bria.
Mar 31 15:52:49 hubserver jupyterhub.service: [D 2022-03-31 15:52:49.291 JupyterHub batchspawner:394] Job 14270408 still pending
[... once the job starts...]
Mar 31 15:53:02 hubserver systemd: Created slice User Slice of bria.
Mar 31 15:53:02 hubserver systemd: Starting User Slice of bria.
Mar 31 15:53:02 hubserver systemd: Started Session c2756 of user bria.
Mar 31 15:53:02 hubserver systemd: Starting Session c2756 of user bria.
Mar 31 15:53:02 hubserver systemd: Removed slice User Slice of bria.
Mar 31 15:53:02 hubserver systemd: Stopping User Slice of bria.
Mar 31 15:53:06 hubserver jupyterhub.service: [D 2022-03-31 15:53:06.512 JupyterHub base:281] Recording first activity for <APIToken('b07c...', user='arnau.bria@domain.com', client_id='jupyterhub')>
Mar 31 15:53:06 hubserver jupyterhub.service: [D 2022-03-31 15:53:06.517 JupyterHub scopes:301] Authenticated with token <APIToken('b07c...', user='arnau.bria@domain.com', client_id='jupyterhub')>
Mar 31 15:53:06 hubserver jupyterhub.service: [I 2022-03-31 15:53:06.520 JupyterHub log:189] 200 POST /hub/api/batchspawner (arnau.bria@domain.com@148.191.97.103) 10.94ms
Mar 31 15:53:10 hubserver jupyterhub.service: [I 2022-03-31 15:53:10.848 JupyterHub log:189] 200 GET /hub/api (@148.191.97.103) 0.85ms
Mar 31 15:53:10 hubserver jupyterhub.service: [D 2022-03-31 15:53:10.861 JupyterHub scopes:301] Authenticated with token <APIToken('b07c...', user='arnau.bria@domain.com', client_id='jupyterhub')>
Mar 31 15:53:10 hubserver jupyterhub.service: [D 2022-03-31 15:53:10.862 JupyterHub scopes:491] Checking access via scope users:activity
Mar 31 15:53:10 hubserver jupyterhub.service: [D 2022-03-31 15:53:10.862 JupyterHub scopes:402] Argument-based access to /hub/api/users/arnau.bria@domain.com/activity via users:activity
Mar 31 15:53:10 hubserver jupyterhub.service: [D 2022-03-31 15:53:10.864 JupyterHub users:855] Activity for user arnau.bria@domain.com: 2022-03-31T13:53:10.600560Z
Mar 31 15:53:10 hubserver jupyterhub.service: [D 2022-03-31 15:53:10.864 JupyterHub users:873] Activity on server arnau.bria@domain.com/: 2022-03-31T13:53:10.600560Z
Mar 31 15:53:10 hubserver jupyterhub.service: [I 2022-03-31 15:53:10.870 JupyterHub log:189] 200 POST /hub/api/users/arnau.bria@domain.com/activity (arnau.bria@domain.com@148.191.97.103) 11.01ms
Mar 31 15:57:19 hubserver jupyterhub.service: [D 2022-03-31 15:57:19.071 JupyterHub proxy:821] Proxy: Fetching GET http://127.0.0.1:8001/api/routes
Mar 31 15:57:19 hubserver jupyterhub.service: 15:57:19.073 [ConfigProxy] info: 200 GET /api/routes
Mar 31 15:57:19 hubserver jupyterhub.service: [D 2022-03-31 15:57:19.074 JupyterHub proxy:346] Checking routes
Mar 31 15:57:49 hubserver jupyterhub.service: [W 2022-03-31 15:57:49.144 JupyterHub user:807] arnau.bria@domain.com's server failed to start in 300 seconds, giving up.
Mar 31 15:57:49 hubserver jupyterhub.service: Common causes of this timeout, and debugging tips:
Mar 31 15:57:49 hubserver jupyterhub.service: 1. Everything is working, but it took too long.
Mar 31 15:57:49 hubserver jupyterhub.service: To fix: increase `Spawner.start_timeout` configuration
Mar 31 15:57:49 hubserver jupyterhub.service: to a number of seconds that is enough for spawners to finish starting.
Mar 31 15:57:49 hubserver jupyterhub.service: 2. The server didn't finish starting,
Mar 31 15:57:49 hubserver jupyterhub.service: or it crashed due to a configuration issue.
Mar 31 15:57:49 hubserver jupyterhub.service: Check the single-user server's logs for hints at what needs fixing.
Mar 31 15:57:49 hubserver jupyterhub.service: [D 2022-03-31 15:57:49.144 JupyterHub user:913] Stopping arnau.bria@domain.com
Mar 31 15:57:49 hubserver jupyterhub.service: [D 2022-03-31 15:57:49.145 JupyterHub batchspawner:288] Spawner querying job: sudo -E -u arnau.bria@domain.com squeue -h -j 14270408 -o '%T %B'
Mar 31 15:57:49 hubserver systemd: Created slice User Slice of bria.
Mar 31 15:57:49 hubserver systemd: Starting User Slice of bria.
Mar 31 15:57:49 hubserver systemd: Started Session c2757 of user bria.
Mar 31 15:57:49 hubserver systemd: Starting Session c2757 of user bria.
Mar 31 15:57:49 hubserver systemd: Removed slice User Slice of bria.
Mar 31 15:57:49 hubserver jupyterhub.service: [I 2022-03-31 15:57:49.223 JupyterHub batchspawner:431] Stopping server job 14270408

Job logs:

/apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/bin/jupyterhub-singleuser
[I 2022-03-31 15:53:09.658 SingleUserLabApp mixins:614] Starting jupyterhub single-user server version 2.2.2
[I 2022-03-31 15:53:09.658 SingleUserLabApp mixins:628] Extending jupyterlab.labhubapp.SingleUserLabApp from jupyterlab 3.3.2
[I 2022-03-31 15:53:09.658 SingleUserLabApp mixins:628] Extending jupyter_server.serverapp.ServerApp from jupyter_server 1.15.6
[D 2022-03-31 15:53:09.673 SingleUserLabApp application:174] Searching ['/home/bria', '/apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter', '/home/bria/.jupyter', '/home/bria/.local/etc/jupyter', '/apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter', '/usr/local/etc/jupyter', '/etc/jupyter'] for config files
[D 2022-03-31 15:53:09.673 SingleUserLabApp application:731] Looking for jupyter_config in /etc/jupyter
[D 2022-03-31 15:53:09.673 SingleUserLabApp application:731] Looking for jupyter_config in /usr/local/etc/jupyter
[D 2022-03-31 15:53:09.673 SingleUserLabApp application:731] Looking for jupyter_config in /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter
[D 2022-03-31 15:53:09.674 SingleUserLabApp application:731] Looking for jupyter_config in /home/bria/.local/etc/jupyter
[D 2022-03-31 15:53:09.674 SingleUserLabApp application:731] Looking for jupyter_config in /home/bria/.jupyter
[D 2022-03-31 15:53:09.674 SingleUserLabApp application:731] Looking for jupyter_config in /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter
[D 2022-03-31 15:53:09.674 SingleUserLabApp application:731] Looking for jupyter_config in /home/bria
[D 2022-03-31 15:53:09.677 SingleUserLabApp application:731] Looking for jupyter_server_config in /etc/jupyter
[D 2022-03-31 15:53:09.677 SingleUserLabApp application:731] Looking for jupyter_server_config in /usr/local/etc/jupyter
[D 2022-03-31 15:53:09.677 SingleUserLabApp application:731] Looking for jupyter_server_config in /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter
[D 2022-03-31 15:53:09.678 SingleUserLabApp application:731] Looking for jupyter_server_config in /home/bria/.local/etc/jupyter
[D 2022-03-31 15:53:09.679 SingleUserLabApp application:731] Looking for jupyter_server_config in /home/bria/.jupyter
[D 2022-03-31 15:53:09.679 SingleUserLabApp application:731] Looking for jupyter_server_config in /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter
[D 2022-03-31 15:53:09.679 SingleUserLabApp application:731] Looking for jupyter_server_config in /home/bria
[D 2022-03-31 15:53:09.683 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_server_config:
        /etc/jupyter/jupyter_server_config.json
[D 2022-03-31 15:53:09.683 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_server_config:
        /usr/local/etc/jupyter/jupyter_server_config.json
[D 2022-03-31 15:53:09.684 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_server_config:
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/jupyter_resource_usage.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/jupyter_server_mathjax.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/jupyterlab.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/jupyterlab_git.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/nbclassic.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/nbdime.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/notebook_shim.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.json
[D 2022-03-31 15:53:09.691 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_server_config:
        /home/bria/.local/etc/jupyter/jupyter_server_config.json
[D 2022-03-31 15:53:09.691 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_server_config:
        /home/bria/.jupyter/jupyter_server_config.json
[D 2022-03-31 15:53:09.692 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_server_config:
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/jupyter_resource_usage.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/jupyter_server_mathjax.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/jupyterlab.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/jupyter_server_mathjax.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/jupyterlab.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/jupyterlab_git.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/nbclassic.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/nbdime.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.d/notebook_shim.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_server_config.json
[D 2022-03-31 15:53:09.694 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_server_config:
        /home/bria/jupyter_server_config.json
/apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/lib/python3.9/site-packages/jupyter_server_mathjax/app.py:40: FutureWarning: The alias `_()` will be deprecated. Use `_i18n()` instead.
  help=_("""The MathJax.js configuration file that is to be used."""),
[I 2022-03-31 15:53:09.825 SingleUserLabApp manager:345] jupyter_resource_usage | extension was successfully linked.
[D 2022-03-31 15:53:09.832 MathJaxExtension] Config changed: {'FileContentsManager': {'delete_to_trash': False}, 'SingleUserLabApp': {'port': 59640, 'log_level': 'DEBUG'}, 'ExtensionApp': {'log_level': 'DEBUG'}, 'ServerApp': {'jpserver_extensions': <LazyConfigValue {'update': {'jupyter_resource_usage': True, 'jupyter_server_mathjax': True, 'jupyterlab': True, 'jupyterlab_git': True, 'nbclassic': True, 'nbdime': True, 'notebook_shim': True}}>}}
[D 2022-03-31 15:53:09.833 SingleUserLabApp application:326] Config changed: {'FileContentsManager': {'delete_to_trash': False}, 'SingleUserLabApp': {'port': 59640, 'log_level': 'DEBUG'}, 'ExtensionApp': {'log_level': 'DEBUG'}, 'ServerApp': {'jpserver_extensions': <LazyConfigValue value={'jupyter_resource_usage': True, 'jupyter_server_mathjax': True, 'jupyterlab': True, 'jupyterlab_git': True, 'nbclassic': True, 'nbdime': True, 'notebook_shim': True}>}}
[I 2022-03-31 15:53:09.834 SingleUserLabApp manager:345] jupyter_server_mathjax | extension was successfully linked.
[D 2022-03-31 15:53:09.847 LabApp] Config changed: {'NotebookApp': {}, 'ServerApp': {'jpserver_extensions': <LazyConfigValue value={'jupyter_resource_usage': True, 'jupyter_server_mathjax': True, 'jupyterlab': True, 'jupyterlab_git': True, 'nbclassic': True, 'nbdime': True, 'notebook_shim': True}>}, 'FileContentsManager': {'delete_to_trash': False}, 'SingleUserLabApp': {'port': 59640, 'log_level': 'DEBUG'}, 'ExtensionApp': {'log_level': 'DEBUG'}}
[I 2022-03-31 15:53:09.848 SingleUserLabApp manager:345] jupyterlab | extension was successfully linked.
[I 2022-03-31 15:53:09.848 SingleUserLabApp manager:345] jupyterlab_git | extension was successfully linked.
[D 2022-03-31 15:53:09.858 NotebookApp] Config changed: {'NotebookApp': {}, 'ServerApp': {'jpserver_extensions': <LazyConfigValue value={'jupyter_resource_usage': True, 'jupyter_server_mathjax': True, 'jupyterlab': True, 'jupyterlab_git': True, 'nbclassic': True, 'nbdime': True, 'notebook_shim': True}>}, 'FileContentsManager': {'delete_to_trash': False}, 'SingleUserLabApp': {'port': 59640, 'log_level': 'DEBUG'}, 'ExtensionApp': {'log_level': 'DEBUG'}}
[I 2022-03-31 15:53:09.859 SingleUserLabApp manager:345] nbclassic | extension was successfully linked.
[I 2022-03-31 15:53:09.859 SingleUserLabApp manager:345] nbdime | extension was successfully linked.
[D 2022-03-31 15:53:10.578 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_notebook_config:
        /home/bria/.jupyter/jupyter_notebook_config.json
[D 2022-03-31 15:53:10.578 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_notebook_config:
        /etc/jupyter/jupyter_notebook_config.json
[D 2022-03-31 15:53:10.579 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_notebook_config:
        /usr/local/etc/jupyter/jupyter_notebook_config.json
[D 2022-03-31 15:53:10.580 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_notebook_config:
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_notebook_config.d/jupyter_resource_usage.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_notebook_config.d/jupyterlab.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_notebook_config.d/jupyterlab_git.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_notebook_config.d/nbdime.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_notebook_config.json
[D 2022-03-31 15:53:10.584 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_notebook_config:
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_notebook_config.json
[D 2022-03-31 15:53:10.584 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_notebook_config:
        /home/bria/.local/etc/jupyter/jupyter_notebook_config.json
[D 2022-03-31 15:53:10.584 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_notebook_config:
        /home/bria/.jupyter/jupyter_notebook_config.json
[D 2022-03-31 15:53:10.585 SingleUserLabApp config_manager:97] Paths used for configuration of jupyter_notebook_config:
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_notebook_config.d/jupyter_resource_usage.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_notebook_config.d/jupyterlab.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_notebook_config.d/jupyterlab_git.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_notebook_config.d/nbdime.json
        /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/etc/jupyter/jupyter_notebook_config.json
[I 2022-03-31 15:53:10.586 SingleUserLabApp manager:345] notebook_shim | extension was successfully linked.
[I 2022-03-31 15:53:10.637 SingleUserLabApp manager:367] notebook_shim | extension was successfully loaded.
[I 2022-03-31 15:53:10.638 SingleUserLabApp manager:367] jupyter_resource_usage | extension was successfully loaded.
[I 2022-03-31 15:53:10.639 SingleUserLabApp manager:367] jupyter_server_mathjax | extension was successfully loaded.
[I 2022-03-31 15:53:10.640 LabApp] JupyterLab extension loaded from /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/lib/python3.9/site-packages/jupyterlab
[I 2022-03-31 15:53:10.640 LabApp] JupyterLab application directory is /apps/prod/easybuild/sl7.x86_64.foss-2021a/software/Python/3.9.5-GCCcore-10.3.0-jupyter/share/jupyter/lab
[I 2022-03-31 15:53:10.644 SingleUserLabApp manager:367] jupyterlab | extension was successfully loaded.
[I 2022-03-31 15:53:10.650 SingleUserLabApp manager:367] jupyterlab_git | extension was successfully loaded.
[I 2022-03-31 15:53:10.665 SingleUserLabApp manager:367] nbclassic | extension was successfully loaded.
[D 2022-03-31 15:53:10.839 SingleUserLabApp loader:499] Using default logger
[D 2022-03-31 15:53:10.840 SingleUserLabApp loader:499] Using default logger
[D 2022-03-31 15:53:10.840 SingleUserLabApp loader:499] Using default logger
[D 2022-03-31 15:53:10.840 SingleUserLabApp loader:499] Using default logger
[D 2022-03-31 15:53:10.840 SingleUserLabApp loader:499] Using default logger
[D 2022-03-31 15:53:10.840 SingleUserLabApp loader:499] Using default logger
[D 2022-03-31 15:53:10.840 SingleUserLabApp loader:499] Using default logger
[I 2022-03-31 15:53:10.843 SingleUserLabApp manager:367] nbdime | extension was successfully loaded.
[I 2022-03-31 15:53:10.843 SingleUserLabApp mixins:640] Starting jupyterhub-singleuser server version 2.2.2
[D 2022-03-31 15:53:10.848 SingleUserLabApp _version:74] jupyterhub and jupyterhub-singleuser both on version 2.2.2
[I 2022-03-31 15:53:10.848 SingleUserLabApp serverapp:2674] Serving notebooks from local directory: /home/bria
[I 2022-03-31 15:53:10.848 SingleUserLabApp serverapp:2674] Jupyter Server 1.15.6 is running at:
[I 2022-03-31 15:53:10.848 SingleUserLabApp serverapp:2674] http://computenode.eu.domain.com:59640/user/arnau.bria@domain.com/lab
[I 2022-03-31 15:53:10.849 SingleUserLabApp serverapp:2674]  or http://127.0.0.1:59640/user/arnau.bria@domain.com/lab
[I 2022-03-31 15:53:10.849 SingleUserLabApp serverapp:2675] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2022-03-31 15:53:10.857 SingleUserLabApp mixins:596] Updating Hub with activity every 300 seconds
[D 2022-03-31 15:53:10.857 SingleUserLabApp mixins:558] Notifying Hub of activity 2022-03-31T13:53:10.600560Z
srun: Job step aborted: Waiting up to 12 seconds for job step to finish.
slurmstepd: error: *** STEP 14270408.0 ON computenode CANCELLED AT 2022-03-31T15:57:49 ***
slurmstepd: error: *** JOB 14270408 ON computenode CANCELLED AT 2022-03-31T15:57:49 ***
[C 2022-03-31 15:57:49.277 SingleUserLabApp serverapp:2124] received signal 15, stopping
[I 2022-03-31 15:57:49.302 SingleUserLabApp serverapp:2463] Shutting down 7 extensions

davidedelvento commented 2 years ago

Were you able to eventually solve it? If so, how?

pelacables commented 2 years ago

no :-(

davidedelvento commented 2 years ago

Thanks for letting me know.

In the half day since I asked, I found that the problem for me was that the batch nodes were trying to connect to the hub on the batch nodes themselves, rather than on the actual server where it is running. Setting the correct IP with c.JupyterHub.hub_ip (and port) and making sure it was reachable from the batch nodes solved the issue.
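A quick way to check the "reachable from the batch nodes" part is to attempt a plain TCP connection from a compute node to the hub host. The snippet below is only a minimal sketch: the helper name can_reach is made up for illustration, and the host and port you pass in are assumptions that must match your own hub_ip and hub port settings.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; it only tests TCP connectivity,
    not whether the Hub API itself answers correctly.
    """
    try:
        # create_connection resolves the hostname and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, timeouts, and unreachable hosts.
        return False
```

Run from inside a batch job, for example as print(can_reach("co35svhead01", 8081)), where 8081 is assumed to be the hub API port. A False result points at name resolution, routing, or a firewall between the compute node and the hub rather than at batchspawner itself.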

pelacables commented 2 years ago

What is the value, then? In my case it was the DNS alias without port/protocol. Maybe that's my issue, too.

davidedelvento commented 2 years ago

I don't have the port/protocol there either. In my case the hub_ip is simply the hostname in the same way that I would use to ssh there, namely co35svhead01

As a separate setting, I have c.JupyterHub.port = 443 and I don't repeat that in the hub_ip setting or anywhere else.

I suggest you try a separate install/configuration starting from zero and adding things one at a time. I found that settings you think are unrelated often have unintended consequences, so adding one thing at a time and testing after each change helped me.

pelacables commented 2 years ago

I have added the port but it still does not work. Would you mind sharing your conf with me, please?

davidedelvento commented 2 years ago

Sure. I will do that later today.

davidedelvento commented 2 years ago

This is what I have

c.JupyterHub.port = 443
c.JupyterHub.proxy_class = 'jupyterhub_traefik_proxy.TraefikTomlProxy'
c.JupyterHub.spawner_class = 'batchspawner.SlurmSpawner'
import batchspawner
c.SlurmSpawner.req_prologue = """
hostname
source /home/sw/modules.sh
spack env activate JH
"""
c.JupyterHub.hub_ip = "co35svhead01"
c.JupyterHub.ssl_cert = 'cert.pem'
c.JupyterHub.ssl_key = 'key.pem'
c.Spawner.default_url = '/lab'

Note the apparently useless import batchspawner line, included per the batchspawner instructions.

pelacables commented 2 years ago

Thanks. It seems quite like mine; maybe some firewall... I hate proxies.

davidedelvento commented 2 years ago

Yes, there are a lot of moving parts and dependencies... I used traefik because I could not get node.js to work, due to unrelated issues.

kennedydane commented 2 years ago

@pelacables Did you get yours working? I'm having the same issue: the worker notifies the hub, but then nothing…

pelacables commented 2 years ago

Unfortunately not. Let me know if you manage to solve the issue, please.

j-danek commented 1 year ago

Hi, I believe I have encountered the same issue. I cannot get the jupyterhub-singleuser process to connect to JupyterHub after it has been successfully spawned and is running. The batch job just times out and stops cleanly.

[I 2022-11-29 14:42:20.232 SingleUserNotebookApp notebookapp:2327] Jupyter Notebook 6.4.12 is running at:
[I 2022-11-29 14:42:20.232 SingleUserNotebookApp notebookapp:2327] http://REDACTED_URL:40331/user/ubuntu
[I 2022-11-29 14:42:20.232 SingleUserNotebookApp notebookapp:2328] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

I tried to connect to the notebook directly in the browser using the address given above, only to be redirected to http://REDACTED_URL:40331/hub/api/oauth2/authorize?client_id=... with the error "404 : Not Found. You are requesting a page that does not exist!" I believe that the notebook cannot connect to JupyterHub for authentication.

I noticed that when the batch job is running, a connection to another randomly generated port appears, and the two ports do not match.

$ sudo lsof -i
batchspaw 113935 ubuntu 10u IPv4 2237630 0t0 TCP *:40331 (LISTEN)
jupyterhu 113257 root 9u IPv4 2225954 0t0 TCP *:8000 (LISTEN)
jupyterhu 113257 root 10u IPv4 2227193 0t0 TCP REDACTED_NAME:8000->REDACTED_NAME:55474 (ESTABLISHED)

Could this be the reason why the services do not communicate?

pelacables commented 1 year ago

I also see the 404, but what I do not see is the server talking to the client. That's why I did not pay attention to the 404. To me it looks like the server is, for some reason, not making the connection to the node endpoint at all.

But at this point I'm not sure about anything :-)

dstndstn commented 1 year ago

I was getting this same issue, and https://github.com/jupyterhub/batchspawner/pull/251 fixed it for me.

pelacables commented 1 year ago

Thanks! That fixed the issue for me, too.