2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License
105 stars 64 forks source link

Debug intermittent failure of servers to start in LEAP hub #2047

Closed yuvipanda closed 4 months ago

yuvipanda commented 1 year ago

In https://2i2c.freshdesk.com/a/tickets/365, it was reported that there was failure to start on some LEAP hub users intermittently. Looking at the hub logs, I discovered a number of:

[W 2023-01-11 19:22:05.428 JupyterHub user:881] <user> server never showed up at http://10.0.81.5:8888/user/<user>/ after 30 seconds. Giving up.

    Common causes of this timeout, and debugging tips:

    1. The server didn't finish starting,
       or it crashed due to a configuration issue.
       Check the single-user server's logs for hints at what needs fixing.
    2. The server started, but is not accessible at the specified URL.
       This may be a configuration issue specific to your chosen Spawner.
       Check the single-user server logs and resource to make sure the URL
       is correct and accessible from the Hub.
    3. (unlikely) Everything is working, but the server took too long to respond.
       To fix: increase `Spawner.http_timeout` configuration
       to a number of seconds that is enough for servers to become responsive.

[I 2023-01-11 19:22:05.429 JupyterHub spawner:2808] Deleting pod prod/jupyter-<user>

at relevant times.

Upon looking at the user's pod logs (using techniques described in #2046), I see at the matching timestamp:

[C 2023-01-11 19:22:05.491 SingleUserNotebookApp notebookapp:1971] received signal 15, stopping

This means the pod started, but notebook process inside the pod hadn't started in time, and the 'Deleting pod' sent the appropriate signal to the user pod to shut down. So this isn't a spawn timeout but a http timeout.

In the user logs, the line preceeding the 'received signal 15' is:

[I 2023-01-11 21:36:56.540 LabApp] JupyterLab application directory is /srv/conda/envs/notebook/share/jupyter/lab

If I look at the logs from a successful startup, I see:

[I 2023-01-11 22:15:01.495 SingleUserNotebookApp mixins:609] Starting jupyterhub single-user server version 3.1.0
[I 2023-01-11 22:15:01.495 SingleUserNotebookApp mixins:623] Extending notebook.notebookapp.NotebookApp from notebook 6.5.2
[I 2023-01-11 22:15:03.047 SingleUserNotebookApp manager:91] [nb_conda_kernels] enabled, 0 kernels found
[I 2023-01-11 22:15:03.155 SingleUserNotebookApp notebookapp:1592] Authentication of /metrics is OFF, since other authentication is disabled.
[W 2023-01-11 22:15:10.165 LabApp] Config option `kernel_spec_manager_class` not recognized by `LabApp`.
[W 2023-01-11 22:15:10.169 LabApp] 'extra_template_paths' was found in both NotebookApp and ServerApp. This is likely a recent change. This config will only be set in NotebookApp. Please check if you should also config these traits in ServerApp for your purpose.
[W 2023-01-11 22:15:10.169 LabApp] 'extra_template_paths' was found in both NotebookApp and ServerApp. This is likely a recent change. This config will only be set in NotebookApp. Please check if you should also config these traits in ServerApp for your purpose.
[W 2023-01-11 22:15:10.171 LabApp] Config option `kernel_spec_manager_class` not recognized by `LabApp`.
[W 2023-01-11 22:15:10.181 LabApp] Config option `kernel_spec_manager_class` not recognized by `LabApp`.
[I 2023-01-11 22:15:10.183 LabApp] JupyterLab extension loaded from /srv/conda/envs/notebook/lib/python3.10/site-packages/jupyterlab
[I 2023-01-11 22:15:10.184 LabApp] JupyterLab application directory is /srv/conda/envs/notebook/share/jupyter/lab
[I 2023-01-11 22:15:22.292 SingleUserNotebookApp mixins:670] Starting jupyterhub-singleuser server version 3.1.0
[W 2023-01-11 22:15:22.304 SingleUserNotebookApp _version:68] jupyterhub version 3.0.0 != jupyterhub-singleuser version 3.1.0. This could cause failure to authenticate and result in redirect loops!
[I 2023-01-11 22:15:22.304 SingleUserNotebookApp notebookapp:2328] Serving notebooks from local directory: /home/jovyan
[I 2023-01-11 22:15:22.304 SingleUserNotebookApp notebookapp:2328] Jupyter Notebook 6.5.2 is running at:

So right after:

[I 2023-01-11 22:15:10.184 LabApp] JupyterLab application directory is /srv/conda/envs/notebook/share/jupyter/lab

you see

[I 2023-01-11 22:15:22.292 SingleUserNotebookApp mixins:670] Starting jupyterhub-singleuser server version 3.1.0

However, it took 12 seconds to arrive! That is a very long time for something to be happening here, and the notebook server is not 'ready' until well after that. It is quite likely that sometimes, it takes longer than 30s and hence the timeout happens.

Ideally, all this should really be taking under a second so it is unclear why things are taking as long as they are. My suspicion is there's a notebook extension that's doing some blocking code execution on startup, causing this - but not sure how exactly to look for that.

Items to resolve support ticket

Items to resolve to close this

yuvipanda commented 1 year ago

https://github.com/2i2c-org/infrastructure/pull/2048 will temporarily increase the start timeout to make launches more robust, as we try to figure out why in the world it takes so long to start up

yuvipanda commented 1 year ago

Here's debug logs for the user pod:

[I 2023-01-12 00:45:14.726 SingleUserNotebookApp mixins:609] Starting jupyterhub single-user server version 3.1.0
[I 2023-01-12 00:45:14.726 SingleUserNotebookApp mixins:623] Extending notebook.notebookapp.NotebookApp from notebook 6.5.2
[D 2023-01-12 00:45:14.740 SingleUserNotebookApp application:190] Searching ['/srv/conda/envs/notebook/etc/jupyter', '/home/jovyan/.jupyter', '/home/jovyan/.local/etc/jupyter', '/usr/local/etc/jupyter', '/etc/jupyter'] for config files
[D 2023-01-12 00:45:14.741 SingleUserNotebookApp application:900] Looking for jupyter_config in /etc/jupyter
[D 2023-01-12 00:45:14.742 SingleUserNotebookApp application:900] Looking for jupyter_config in /usr/local/etc/jupyter
[D 2023-01-12 00:45:14.742 SingleUserNotebookApp application:900] Looking for jupyter_config in /home/jovyan/.local/etc/jupyter
[D 2023-01-12 00:45:14.743 SingleUserNotebookApp application:900] Looking for jupyter_config in /home/jovyan/.jupyter
[D 2023-01-12 00:45:14.746 SingleUserNotebookApp application:900] Looking for jupyter_config in /srv/conda/envs/notebook/etc/jupyter
[D 2023-01-12 00:45:14.747 SingleUserNotebookApp application:921] Loaded config file: /srv/conda/envs/notebook/etc/jupyter/jupyter_config.json
[D 2023-01-12 00:45:14.755 SingleUserNotebookApp application:900] Looking for jupyter_notebook_config in /etc/jupyter
[D 2023-01-12 00:45:14.755 SingleUserNotebookApp application:921] Loaded config file: /etc/jupyter/jupyter_notebook_config.py
[D 2023-01-12 00:45:14.756 SingleUserNotebookApp application:900] Looking for jupyter_notebook_config in /usr/local/etc/jupyter
[D 2023-01-12 00:45:14.756 SingleUserNotebookApp application:921] Loaded config file: /usr/local/etc/jupyter/jupyter_notebook_config.json
[D 2023-01-12 00:45:14.757 SingleUserNotebookApp application:900] Looking for jupyter_notebook_config in /home/jovyan/.local/etc/jupyter
[D 2023-01-12 00:45:14.757 SingleUserNotebookApp application:900] Looking for jupyter_notebook_config in /home/jovyan/.jupyter
[D 2023-01-12 00:45:14.759 SingleUserNotebookApp application:900] Looking for jupyter_notebook_config in /srv/conda/envs/notebook/etc/jupyter
[D 2023-01-12 00:45:14.762 SingleUserNotebookApp manager:121] [nb_conda_kernels] refreshing conda info
[I 2023-01-12 00:45:15.792 SingleUserNotebookApp manager:91] [nb_conda_kernels] enabled, 0 kernels found
[D 2023-01-12 00:45:15.839 SingleUserNotebookApp config_manager:95] Paths used for configuration of jupyter_notebook_config: 
        /etc/jupyter/jupyter_notebook_config.json
[D 2023-01-12 00:45:15.840 SingleUserNotebookApp config_manager:95] Paths used for configuration of jupyter_notebook_config: 
        /usr/local/etc/jupyter/jupyter_notebook_config.json
[D 2023-01-12 00:45:15.841 SingleUserNotebookApp config_manager:95] Paths used for configuration of jupyter_notebook_config: 
        /home/jovyan/.local/etc/jupyter/jupyter_notebook_config.json
[D 2023-01-12 00:45:15.842 SingleUserNotebookApp config_manager:95] Paths used for configuration of jupyter_notebook_config: 
        /home/jovyan/.jupyter/jupyter_notebook_config.json
[D 2023-01-12 00:45:15.844 SingleUserNotebookApp config_manager:95] Paths used for configuration of jupyter_notebook_config: 
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/dask_labextension.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyter-server-proxy-notebookserverextension.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyter_resource_usage.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyterlab.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyterlab_git.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyterlab_s3_browser.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/nbdime.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/nbgitpuller.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/panel-client-jupyter.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/xarray_leaflet.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.json
[I 2023-01-12 00:45:15.861 SingleUserNotebookApp notebookapp:1592] Authentication of /metrics is OFF, since other authentication is disabled.
[W 2023-01-12 00:45:20.011 LabApp] Config option `kernel_spec_manager_class` not recognized by `LabApp`.
[W 2023-01-12 00:45:20.015 LabApp] 'extra_template_paths' was found in both NotebookApp and ServerApp. This is likely a recent change. This config will only be set in NotebookApp. Please check if you should also config these traits in ServerApp for your purpose.
[W 2023-01-12 00:45:20.015 LabApp] 'extra_template_paths' was found in both NotebookApp and ServerApp. This is likely a recent change. This config will only be set in NotebookApp. Please check if you should also config these traits in ServerApp for your purpose.
[W 2023-01-12 00:45:20.019 LabApp] Config option `kernel_spec_manager_class` not recognized by `LabApp`.
[W 2023-01-12 00:45:20.025 LabApp] Config option `kernel_spec_manager_class` not recognized by `LabApp`.
[I 2023-01-12 00:45:20.027 LabApp] JupyterLab extension loaded from /srv/conda/envs/notebook/lib/python3.10/site-packages/jupyterlab
[I 2023-01-12 00:45:20.028 LabApp] JupyterLab application directory is /srv/conda/envs/notebook/share/jupyter/lab
[D 2023-01-12 00:45:21.066 SingleUserNotebookApp loader:511] Using default logger
[D 2023-01-12 00:45:21.066 SingleUserNotebookApp loader:511] Using default logger
[D 2023-01-12 00:45:21.067 SingleUserNotebookApp loader:511] Using default logger
[D 2023-01-12 00:45:21.067 SingleUserNotebookApp loader:511] Using default logger
[D 2023-01-12 00:45:21.068 SingleUserNotebookApp loader:511] Using default logger
[D 2023-01-12 00:45:21.069 SingleUserNotebookApp loader:511] Using default logger
[I 2023-01-12 00:45:26.937 SingleUserNotebookApp mixins:670] Starting jupyterhub-singleuser server version 3.1.0
[W 2023-01-12 00:45:26.947 SingleUserNotebookApp _version:68] jupyterhub version 3.0.0 != jupyterhub-singleuser version 3.1.0. This could cause failure to authenticate and result in redirect loops!
[I 2023-01-12 00:45:26.949 SingleUserNotebookApp notebookapp:2328] Serving notebooks from local directory: /home/jovyan
[I 2023-01-12 00:45:26.949 SingleUserNotebookApp notebookapp:2328] Jupyter Notebook 6.5.2 is running at:
[I 2023-01-12 00:45:26.949 SingleUserNotebookApp notebookapp:2328] http://jupyter-yuvipanda:8888/user/yuvipanda/
[I 2023-01-12 00:45:26.950 SingleUserNotebookApp notebookapp:2329] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

Not much of use here

yuvipanda commented 1 year ago

Here's the log output if I switch to using Jupyter Server instead of Notebook Server:

[I 2023-01-12 00:52:05.816 SingleUserLabApp mixins:609] Starting jupyterhub single-user server version 3.1.0
[I 2023-01-12 00:52:05.817 SingleUserLabApp mixins:623] Extending jupyterlab.labhubapp.SingleUserLabApp from jupyterlab 3.5.1
[I 2023-01-12 00:52:05.817 SingleUserLabApp mixins:623] Extending jupyter_server.serverapp.ServerApp from jupyter_server 1.23.4
[D 2023-01-12 00:52:05.827 SingleUserLabApp application:190] Searching ['/srv/conda/envs/notebook/etc/jupyter', '/home/jovyan/.jupyter', '/home/jovyan/.local/etc/jupyter', '/usr/local/etc/jupyter', '/etc/jupyter'] for config files
[D 2023-01-12 00:52:05.828 SingleUserLabApp application:900] Looking for jupyter_config in /etc/jupyter
[D 2023-01-12 00:52:05.828 SingleUserLabApp application:900] Looking for jupyter_config in /usr/local/etc/jupyter
[D 2023-01-12 00:52:05.828 SingleUserLabApp application:900] Looking for jupyter_config in /home/jovyan/.local/etc/jupyter
[D 2023-01-12 00:52:05.831 SingleUserLabApp application:900] Looking for jupyter_config in /home/jovyan/.jupyter
[D 2023-01-12 00:52:05.833 SingleUserLabApp application:900] Looking for jupyter_config in /srv/conda/envs/notebook/etc/jupyter
[D 2023-01-12 00:52:05.834 SingleUserLabApp application:921] Loaded config file: /srv/conda/envs/notebook/etc/jupyter/jupyter_config.json
[D 2023-01-12 00:52:05.842 SingleUserLabApp application:900] Looking for jupyter_server_config in /etc/jupyter
[D 2023-01-12 00:52:05.842 SingleUserLabApp application:900] Looking for jupyter_server_config in /usr/local/etc/jupyter
[D 2023-01-12 00:52:05.843 SingleUserLabApp application:900] Looking for jupyter_server_config in /home/jovyan/.local/etc/jupyter
[D 2023-01-12 00:52:05.843 SingleUserLabApp application:900] Looking for jupyter_server_config in /home/jovyan/.jupyter
[D 2023-01-12 00:52:05.845 SingleUserLabApp application:900] Looking for jupyter_server_config in /srv/conda/envs/notebook/etc/jupyter
[D 2023-01-12 00:52:05.848 SingleUserLabApp config_manager:93] Paths used for configuration of jupyter_server_config: 
        /etc/jupyter/jupyter_server_config.json
[D 2023-01-12 00:52:05.848 SingleUserLabApp config_manager:93] Paths used for configuration of jupyter_server_config: 
        /usr/local/etc/jupyter/jupyter_server_config.json
[D 2023-01-12 00:52:05.848 SingleUserLabApp config_manager:93] Paths used for configuration of jupyter_server_config: 
        /home/jovyan/.local/etc/jupyter/jupyter_server_config.json
[D 2023-01-12 00:52:05.849 SingleUserLabApp config_manager:93] Paths used for configuration of jupyter_server_config: 
        /home/jovyan/.jupyter/jupyter_server_config.json
[D 2023-01-12 00:52:05.852 SingleUserLabApp config_manager:93] Paths used for configuration of jupyter_server_config: 
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/dask_labextension.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/jupyter-server-proxy-jupyterserverextension.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/jupyter_resource_usage.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/jupyter_server_mathjax.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/jupyterlab.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/jupyterlab_git.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/jupyterlab_s3_browser.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/nbclassic.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/nbdime.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/nbgitpuller.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/notebook_shim.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/panel-client-jupyter.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/xarray_leaflet.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.json

[I 2023-01-12 00:52:15.352 SingleUserLabApp manager:344] dask_labextension | extension was successfully linked.
[I 2023-01-12 00:52:15.353 SingleUserLabApp manager:344] jupyter_resource_usage | extension was successfully linked.
[W 2023-01-12 00:52:15.359 MathJaxExtension] Config option `kernel_spec_manager_class` not recognized by `MathJaxExtension`.
[W 2023-01-12 00:52:15.363 MathJaxExtension] Config option `kernel_spec_manager_class` not recognized by `MathJaxExtension`.
[W 2023-01-12 00:52:15.367 MathJaxExtension] Config option `kernel_spec_manager_class` not recognized by `MathJaxExtension`.
[D 2023-01-12 00:52:15.369 SingleUserLabApp application:457] Config changed: {'FileContentsManager': {'delete_to_trash': False}, 'SingleUserLabApp': {'default_url': '/lab'}, 'JupyterApp': {'kernel_spec_manager_class': 'nb_conda_kernels.CondaKernelSpecManager'}, 'ServerApp': {'jpserver_extensions': <LazyConfigValue value={'dask_labextension': True, 'jupyter_resource_usage': True, 'jupyter_server_mathjax': True, 'jupyter_server_proxy': True, 'jupyterlab': True, 'jupyterlab_git': True, 'jupyterlab_s3_browser': True, 'nbclassic': True, 'nbdime': True, 'nbgitpuller': True, 'notebook_shim': True, 'panel.io.jupyter_server_extension': True, 'xarray_leaflet': True}>}}
[I 2023-01-12 00:52:15.370 SingleUserLabApp manager:344] jupyter_server_mathjax | extension was successfully linked.
[I 2023-01-12 00:52:15.370 SingleUserLabApp manager:344] jupyter_server_proxy | extension was successfully linked.
[W 2023-01-12 00:52:15.375 LabApp] Config option `kernel_spec_manager_class` not recognized by `LabApp`.
[W 2023-01-12 00:52:15.379 LabApp] Config option `kernel_spec_manager_class` not recognized by `LabApp`.
[W 2023-01-12 00:52:15.385 LabApp] Config option `kernel_spec_manager_class` not recognized by `LabApp`.
[I 2023-01-12 00:52:15.386 SingleUserLabApp manager:344] jupyterlab | extension was successfully linked.
[I 2023-01-12 00:52:15.386 SingleUserLabApp manager:344] jupyterlab_git | extension was successfully linked.
[I 2023-01-12 00:52:15.387 SingleUserLabApp manager:344] jupyterlab_s3_browser | extension was successfully linked.
[W 2023-01-12 00:52:15.392 NotebookApp] Config option `kernel_spec_manager_class` not recognized by `NotebookApp`.
[W 2023-01-12 00:52:15.396 NotebookApp] 'extra_template_paths' was found in both NotebookApp and ServerApp. This is likely a recent change. This config will only be set in NotebookApp. Please check if you should also config these traits in ServerApp for your purpose.
[W 2023-01-12 00:52:15.398 NotebookApp] Config option `kernel_spec_manager_class` not recognized by `NotebookApp`.
[W 2023-01-12 00:52:15.402 NotebookApp] Config option `kernel_spec_manager_class` not recognized by `NotebookApp`.
[I 2023-01-12 00:52:15.403 SingleUserLabApp manager:344] nbclassic | extension was successfully linked.
[I 2023-01-12 00:52:15.404 SingleUserLabApp manager:344] nbdime | extension was successfully linked.
[I 2023-01-12 00:52:15.404 SingleUserLabApp manager:344] nbgitpuller | extension was successfully linked.
[D 2023-01-12 00:52:15.416 SingleUserLabApp config_manager:93] Paths used for configuration of jupyter_notebook_config: 
        /home/jovyan/.jupyter/jupyter_notebook_config.json
[D 2023-01-12 00:52:15.416 SingleUserLabApp config_manager:93] Paths used for configuration of jupyter_notebook_config: 
        /etc/jupyter/jupyter_notebook_config.json
[D 2023-01-12 00:52:15.417 SingleUserLabApp config_manager:93] Paths used for configuration of jupyter_notebook_config: 
        /usr/local/etc/jupyter/jupyter_notebook_config.json
[D 2023-01-12 00:52:15.418 SingleUserLabApp config_manager:93] Paths used for configuration of jupyter_notebook_config: 
        /home/jovyan/.local/etc/jupyter/jupyter_notebook_config.json
[D 2023-01-12 00:52:15.418 SingleUserLabApp config_manager:93] Paths used for configuration of jupyter_notebook_config: 
        /home/jovyan/.jupyter/jupyter_notebook_config.json
[D 2023-01-12 00:52:15.419 SingleUserLabApp config_manager:93] Paths used for configuration of jupyter_notebook_config: 
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/dask_labextension.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyter-server-proxy-notebookserverextension.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyter_resource_usage.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyterlab.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyterlab_git.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyterlab_s3_browser.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/nbdime.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/nbgitpuller.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/panel-client-jupyter.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/xarray_leaflet.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.json
[I 2023-01-12 00:52:15.425 SingleUserLabApp manager:344] notebook_shim | extension was successfully linked.
[I 2023-01-12 00:52:15.425 SingleUserLabApp manager:344] panel.io.jupyter_server_extension | extension was successfully linked.
[I 2023-01-12 00:52:15.425 SingleUserLabApp manager:344] xarray_leaflet | extension was successfully linked.
[D 2023-01-12 00:52:15.428 SingleUserLabApp manager:121] [nb_conda_kernels] refreshing conda info
[I 2023-01-12 00:52:16.436 SingleUserLabApp manager:91] [nb_conda_kernels] enabled, 0 kernels found
[I 2023-01-12 00:52:16.539 SingleUserLabApp manager:362] notebook_shim | extension was successfully loaded.
[I 2023-01-12 00:52:16.542 SingleUserLabApp manager:362] dask_labextension | extension was successfully loaded.
[I 2023-01-12 00:52:16.544 SingleUserLabApp manager:362] jupyter_resource_usage | extension was successfully loaded.
[I 2023-01-12 00:52:16.546 SingleUserLabApp manager:362] jupyter_server_mathjax | extension was successfully loaded.
[I 2023-01-12 00:52:17.189 SingleUserLabApp manager:362] jupyter_server_proxy | extension was successfully loaded.
[I 2023-01-12 00:52:17.191 LabApp] JupyterLab extension loaded from /srv/conda/envs/notebook/lib/python3.10/site-packages/jupyterlab
[I 2023-01-12 00:52:17.191 LabApp] JupyterLab application directory is /srv/conda/envs/notebook/share/jupyter/lab
[I 2023-01-12 00:52:17.197 SingleUserLabApp manager:362] jupyterlab | extension was successfully loaded.
[I 2023-01-12 00:52:17.206 SingleUserLabApp manager:362] jupyterlab_git | extension was successfully loaded.
[I 2023-01-12 00:52:17.206 SingleUserLabApp manager:362] jupyterlab_s3_browser | extension was successfully loaded.
[I 2023-01-12 00:52:17.225 SingleUserLabApp manager:362] nbclassic | extension was successfully loaded.
[D 2023-01-12 00:52:17.468 SingleUserLabApp loader:511] Using default logger
[D 2023-01-12 00:52:17.468 SingleUserLabApp loader:511] Using default logger
[D 2023-01-12 00:52:17.468 SingleUserLabApp loader:511] Using default logger
[D 2023-01-12 00:52:17.469 SingleUserLabApp loader:511] Using default logger
[D 2023-01-12 00:52:17.470 SingleUserLabApp loader:511] Using default logger
[D 2023-01-12 00:52:17.471 SingleUserLabApp loader:511] Using default logger
[I 2023-01-12 00:52:17.474 SingleUserLabApp manager:362] nbdime | extension was successfully loaded.
[I 2023-01-12 00:52:17.476 SingleUserLabApp manager:362] nbgitpuller | extension was successfully loaded.
[I 2023-01-12 00:52:17.479 SingleUserLabApp manager:362] panel.io.jupyter_server_extension | extension was successfully loaded.
[I 2023-01-12 00:52:17.480 SingleUserLabApp manager:362] xarray_leaflet | extension was successfully loaded.
[I 2023-01-12 00:52:17.482 SingleUserLabApp mixins:670] Starting jupyterhub-singleuser server version 3.1.0
[W 2023-01-12 00:52:17.495 SingleUserLabApp _version:68] jupyterhub version 3.0.0 != jupyterhub-singleuser version 3.1.0. This could cause failure to authenticate and result in redirect loops!
[I 2023-01-12 00:52:17.495 SingleUserLabApp serverapp:2729] Serving notebooks from local directory: /home/jovyan
[I 2023-01-12 00:52:17.495 SingleUserLabApp serverapp:2729] Jupyter Server 1.23.4 is running at:
[I 2023-01-12 00:52:17.495 SingleUserLabApp serverapp:2729] http://jupyter-yuvipanda:8888/user/yuvipanda/lab
[I 2023-01-12 00:52:17.496 SingleUserLabApp serverapp:2729]  or http://127.0.0.1:8888/user/yuvipanda/lab

This is both better and worse

yuvipanda commented 1 year ago

Hmm, everything seems to happen twice here, which is confusing.

yuvipanda commented 1 year ago

I just tried the same on a different hub without the pangeo image, and it started up in under a second. So something is screwy with some extension in use.

yuvipanda commented 1 year ago

My intuition was correct - it was an extension being screwy. I discovered this by disabling all extensions and enabling them one by one. And turns out the extension that is causing the blocking is... jupyterlab?!

[I 2023-01-12 01:57:59.576 SingleUserNotebookApp mixins:609] Starting jupyterhub single-user server version 3.1.0
[I 2023-01-12 01:57:59.577 SingleUserNotebookApp mixins:623] Extending notebook.notebookapp.NotebookApp from notebook 6.5.2
[D 2023-01-12 01:57:59.590 SingleUserNotebookApp application:190] Searching ['/srv/conda/envs/notebook/etc/jupyter', '/home/jovyan/.jupyter', '/home/jovyan/.local/etc/jupyter', '/usr/local/etc/jupyter', '/etc/jupyter'] for config files
[D 2023-01-12 01:57:59.591 SingleUserNotebookApp application:900] Looking for jupyter_config in /etc/jupyter
[D 2023-01-12 01:57:59.591 SingleUserNotebookApp application:900] Looking for jupyter_config in /usr/local/etc/jupyter
[D 2023-01-12 01:57:59.592 SingleUserNotebookApp application:900] Looking for jupyter_config in /home/jovyan/.local/etc/jupyter
[D 2023-01-12 01:57:59.593 SingleUserNotebookApp application:900] Looking for jupyter_config in /home/jovyan/.jupyter
[D 2023-01-12 01:57:59.594 SingleUserNotebookApp application:900] Looking for jupyter_config in /srv/conda/envs/notebook/etc/jupyter
[D 2023-01-12 01:57:59.594 SingleUserNotebookApp application:921] Loaded config file: /srv/conda/envs/notebook/etc/jupyter/jupyter_config.json
[D 2023-01-12 01:57:59.601 SingleUserNotebookApp application:900] Looking for jupyter_notebook_config in /etc/jupyter
[D 2023-01-12 01:57:59.602 SingleUserNotebookApp application:921] Loaded config file: /etc/jupyter/jupyter_notebook_config.py
[D 2023-01-12 01:57:59.602 SingleUserNotebookApp application:900] Looking for jupyter_notebook_config in /usr/local/etc/jupyter
[D 2023-01-12 01:57:59.603 SingleUserNotebookApp application:921] Loaded config file: /usr/local/etc/jupyter/jupyter_notebook_config.json
[D 2023-01-12 01:57:59.603 SingleUserNotebookApp application:900] Looking for jupyter_notebook_config in /home/jovyan/.local/etc/jupyter
[D 2023-01-12 01:57:59.604 SingleUserNotebookApp application:900] Looking for jupyter_notebook_config in /home/jovyan/.jupyter
[D 2023-01-12 01:57:59.607 SingleUserNotebookApp application:921] Loaded config file: /home/jovyan/.jupyter/jupyter_notebook_config.json
[D 2023-01-12 01:57:59.607 SingleUserNotebookApp application:900] Looking for jupyter_notebook_config in /srv/conda/envs/notebook/etc/jupyter
[D 2023-01-12 01:57:59.609 SingleUserNotebookApp manager:121] [nb_conda_kernels] refreshing conda info
[I 2023-01-12 01:58:00.648 SingleUserNotebookApp manager:91] [nb_conda_kernels] enabled, 0 kernels found
[D 2023-01-12 01:58:00.694 SingleUserNotebookApp config_manager:95] Paths used for configuration of jupyter_notebook_config: 
        /etc/jupyter/jupyter_notebook_config.json
[D 2023-01-12 01:58:00.695 SingleUserNotebookApp config_manager:95] Paths used for configuration of jupyter_notebook_config: 
        /usr/local/etc/jupyter/jupyter_notebook_config.json
[D 2023-01-12 01:58:00.696 SingleUserNotebookApp config_manager:95] Paths used for configuration of jupyter_notebook_config: 
        /home/jovyan/.local/etc/jupyter/jupyter_notebook_config.json
[D 2023-01-12 01:58:00.697 SingleUserNotebookApp config_manager:95] Paths used for configuration of jupyter_notebook_config: 
        /home/jovyan/.jupyter/jupyter_notebook_config.json
[D 2023-01-12 01:58:00.698 SingleUserNotebookApp config_manager:95] Paths used for configuration of jupyter_notebook_config: 
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/dask_labextension.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyter-server-proxy-notebookserverextension.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyter_resource_usage.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyterlab.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyterlab_git.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/jupyterlab_s3_browser.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/nbdime.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/nbgitpuller.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/panel-client-jupyter.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.d/xarray_leaflet.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_notebook_config.json
[I 2023-01-12 01:58:00.712 SingleUserNotebookApp notebookapp:1592] Authentication of /metrics is OFF, since other authentication is disabled.
[D 2023-01-12 01:58:04.543 SingleUserNotebookApp loader:511] Using default logger
[D 2023-01-12 01:58:04.543 SingleUserNotebookApp loader:511] Using default logger
[D 2023-01-12 01:58:04.544 SingleUserNotebookApp loader:511] Using default logger
[D 2023-01-12 01:58:04.544 SingleUserNotebookApp loader:511] Using default logger
[D 2023-01-12 01:58:04.545 SingleUserNotebookApp loader:511] Using default logger
[D 2023-01-12 01:58:04.546 SingleUserNotebookApp loader:511] Using default logger
[I 2023-01-12 01:58:10.769 SingleUserNotebookApp mixins:670] Starting jupyterhub-singleuser server version 3.1.0
[W 2023-01-12 01:58:10.779 SingleUserNotebookApp _version:68] jupyterhub version 3.0.0 != jupyterhub-singleuser version 3.1.0. This could cause failure to authenticate and result in redirect loops!
[I 2023-01-12 01:58:10.779 SingleUserNotebookApp notebookapp:2328] Serving notebooks from local directory: /home/jovyan
[I 2023-01-12 01:58:10.780 SingleUserNotebookApp notebookapp:2328] Jupyter Notebook 6.5.2 is running at:
[I 2023-01-12 01:58:10.780 SingleUserNotebookApp notebookapp:2328] http://jupyter-yuvipanda:8888/user/yuvipanda/
[I 2023-01-12 01:58:10.780 SingleUserNotebookApp notebookapp:2329] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

You'll see this starts under a second as expected.

    dask_labextension  enabled
    - Validating...
      dask_labextension 6.0.0 OK
    jupyter_server_proxy  enabled
    - Validating...
      jupyter_server_proxy  OK
    jupyter_resource_usage  enabled
    - Validating...
      jupyter_resource_usage 0.6.4 OK
    jupyterlab disabled
    - Validating...
      jupyterlab 3.5.1 OK
    jupyterlab_git  enabled
    - Validating...
      jupyterlab_git 0.41.0 OK
    jupyterlab_s3_browser  enabled
    - Validating...
      jupyterlab_s3_browser  OK
    nbdime  enabled
    - Validating...
      nbdime 3.1.1 OK
    nbgitpuller  enabled
    - Validating...
      nbgitpuller 1.1.1 OK
    panel.io.jupyter_server_extension  enabled
    - Validating...
      panel.io.jupyter_server_extension  OK
    xarray_leaflet  enabled
    - Validating...
      xarray_leaflet 0.1.16 OK

sigh.

yuvipanda commented 1 year ago

I'm unfortunately unable to reproduce this locally, with the same version of jupyterlab, and the following jupyterhub_config.py config:

c.JupyterHub.authenticator_class = 'dummy'
c.JupyterHub.spawner_class = 'simple'
c.Spawner.debug = True

:(

yuvipanda commented 1 year ago

So I took a profiler to it, and look what I found:

jupyter-2023-01-12T02 21 49Z

yuvipanda commented 1 year ago

Something is calling conda info --json, which takes forever - and blocks all server startup until then. This laid out the culprit immediately to https://github.com/Anaconda-Platform/nb_conda_kernels/, and the culprit blocking call is at https://github.com/Anaconda-Platform/nb_conda_kernels/blob/f461d2159a970fec323f0c225557e74c63c6a45b/nb_conda_kernels/manager.py#L131 - this subprocess call blocks the startup of the server until the fairly slow conda info returns!

yuvipanda commented 1 year ago

And bam, here's the issue raised in the nb_conda_kernels repo: https://github.com/Anaconda-Platform/nb_conda_kernels/issues/210 and a possible fix that's been sitting idle https://github.com/Anaconda-Platform/nb_conda_kernels/pull/211

yuvipanda commented 1 year ago

I uninstalled nb_conda_kernels, but unfortunately it still takes a while for some reason.

jupyter-2023-01-12T02 45 23Z

yuvipanda commented 1 year ago

Ok, my prior conclusion was wrong, and while the issue should be fixed in nb_conda_kernels too, I don't know that is what the problem here is. I have an strace of jupyter lab --debug starting, and there's no call to conda - yet there's a multi second gap. And running jupyter notebook --debug does not cause this problem at all...

yuvipanda commented 1 year ago

ok, so a different conclusion I've come to now, based on looking at strace output, is that jupyter server will load all extensions regardless of their enabled / disabled status, while notebook server will only load enabled extensions? This means the ton of imports from the installed notebook extensions slow down startup immensely.

yuvipanda commented 1 year ago

So I modified the code that is https://github.com/jupyter-server/jupyter_server/blob/a52709cdaff55eeb36774d3e60552b3535786e06/jupyter_server/extension/utils.py#L65 inside the container (with vim /srv/conda/envs/notebook/lib/python3.10/site-packages/jupyter_server/extension/utils.py), and added the following code:

    start = time.perf_counter()
    module = importlib.import_module(package_name)
    end = time.perf_counter()
    duration = end - start
    print(f"Importing {package_name} took {duration:0.2f}s", flush=True)

at the location indicated in the first link, to measure how long importing each server extension takes.

With all server extensions disabled, here is the appropriate output:

Importing jupyterlab took 0.00s
Importing dask_labextension took 0.56s
Importing jupyter_resource_usage took 0.00s
Importing jupyter_server_mathjax took 0.00s
Importing jupyter_server_proxy took 0.00s
Importing jupyterlab_git took 0.04s
Importing jupyterlab_s3_browser took 0.19s
Importing nbclassic took 0.00s
Importing nbdime took 0.00s
Importing nbgitpuller took 0.00s
Importing notebook_shim took 0.00s
Importing panel.io.jupyter_server_extension took 1.84s
Importing xarray_leaflet took 1.14s

so we have our two culprits - xarray_leaflet and panel jupyter server extension :) And jupyter server itself is a culprit, where disabling extensions does not prevent their import and initialization!

yuvipanda commented 1 year ago

With all the extensions enabled, there wasn't much of a difference in startup time. However, with all .pyc files deleted (with find /srv/conda/envs/notebook/lib/python3.10/site-packages/ -name '*.pyc' -delete) to simulate how things are at first startup:

[D 2023-01-12 04:13:58.115 ServerApp] Paths used for configuration of jupyter_server_config:
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/dask_labextension.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/jupyter-server-proxy-jupyterserverextension.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/jupyter_resource_usage.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/jupyter_server_mathjax.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/jupyterlab.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/jupyterlab_git.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/jupyterlab_s3_browser.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/nbclassic.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/nbdime.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/nbgitpuller.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/notebook_shim.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/panel-client-jupyter.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.d/xarray_leaflet.json
        /srv/conda/envs/notebook/etc/jupyter/jupyter_server_config.json
Importing jupyterlab took 0.00s
Importing dask_labextension took 1.51s
Importing jupyter_resource_usage took 0.01s
Importing jupyter_server_mathjax took 0.00s
Importing jupyter_server_proxy took 0.00s
Importing jupyterlab_git took 0.13s
Importing jupyterlab_s3_browser took 0.74s
Importing nbclassic took 0.00s
Importing nbdime took 0.00s
Importing nbgitpuller took 0.01s
Importing notebook_shim took 0.00s
Importing panel.io.jupyter_server_extension took 4.73s
Importing xarray_leaflet took 2.44s
[I 2023-01-12 04:14:07.700 ServerApp] dask_labextension | extension was successfully linked.
[I 2023-01-12 04:14:07.701 ServerApp] jupyter_resource_usage | extension was successfully linked.

panel added 5s and xarray-leaflet added almost 2.5s to startup time.

yuvipanda commented 1 year ago

Verifiable separately:

(notebook) jovyan@jupyter-yuvipanda:~$ time python -c 'import panel.io.jupyter_server_extension'

real    0m7.538s
user    0m6.600s
sys     0m0.870s
(notebook) jovyan@jupyter-yuvipanda:~$ time python -c 'import xarray_leaflet'

real    0m6.586s
user    0m5.678s
sys     0m0.909s

compared to, say:

(notebook) jovyan@jupyter-yuvipanda:~$ time python -c 'import jupyterlab'

real    0m0.310s
user    0m0.245s
sys     0m0.062s

So the true culprits here are xarray_leaflet and panel.io's server extension, and them doing a ton of imports of slow modules at import time, blocking up the start of the jupyterhub singleuser server.

yuvipanda commented 1 year ago

I opened https://github.com/holoviz/panel/issues/4271 in panel

yuvipanda commented 1 year ago

Since server startup is definitely much faster when .pyc files are present, I opened a PR in pangeo-docker-images to stop deleting them: https://github.com/pangeo-data/pangeo-docker-images/pull/426.

I also opened https://github.com/xarray-contrib/xarray_leaflet/issues/79 in xarray-leaflet

yuvipanda commented 1 year ago

I've also made this PR to jupyter_server to surface this performance information more easily, so we can see the culprits: https://github.com/jupyter-server/jupyter_server/pull/1171

philippjfr commented 1 year ago

Will make Panel 0.14.3 available some time next week which should reduce import time by at least half based on my testing.

yuvipanda commented 1 year ago

Amazing, @philippjfr! Thank you for your prompt work :)

I also opened https://github.com/jupyter-server/jupyter_server/issues/1177 to get jupyter-server to stop importing python packages if they are disabled.

damianavila commented 1 year ago

All this work upstream of us is beautiful, @yuvipanda!

yuvipanda commented 1 year ago

Here's a PR addressing the issue of loading disabled extensions https://github.com/jupyter-server/jupyter_server/pull/1180

yuvipanda commented 1 year ago

@jcrist suggested in https://hachyderm.io/@jcristharif/109707037385576616 that the performance issue might be the write time of .pyc files, not the parsing itself - and suggested a method to test this.

So I did!

No .pyc files, no env var:

(notebook) jovyan@jupyter-yuvipanda:~$ time python -c 'import panel.io.jupyter_server_extension'

real    0m18.088s
user    0m10.299s
sys     0m1.258s
(notebook) jovyan@jupyter-yuvipanda:~$ time python -c 'import panel.io.jupyter_server_extension'

real    0m4.797s
user    0m3.652s
sys     0m0.521s

The .pyc files are created during the first invocation, so the second invocation is about 360% faster.

No .pyc files, but PYTHONDONTWRITEBYTECODE=1

(notebook) jovyan@jupyter-yuvipanda:~$ time python -c 'import panel.io.jupyter_server_extension'

real    0m12.763s
user    0m9.966s
sys     0m0.783s
(notebook) jovyan@jupyter-yuvipanda:~$ time python -c 'import panel.io.jupyter_server_extension'

real    0m12.287s
user    0m10.496s
sys     0m0.784s

I validated that no .pyc files are created, and the second invocation is about the same speed.

So from this, while not creating .pyc files at all does help with startup performance (12s vs 18s), it's still a lot more than with .pyc files present (~4s).

I checked the size of the .pyc files created, and that's 17MB in this case. I think shipping 17MB is a reasonable trade-off here...

jcrist commented 1 year ago

Thanks for the writeup Yuvi! Turns out my intuition was wrong (as it often is). Your conclusion to not strip the .pyc files makes sense to me :+1:.

yuvipanda commented 1 year ago

Yay ty @jcrist :) Would you consider adding a note to your blog post about the tradeoff for startup time with .pyc files?

yuvipanda commented 1 year ago

To mark this issue as completed, I'd want to bump the LEAP hub image to include all the fixes we have in place.

I opened https://github.com/pangeo-data/pangeo-docker-images/pull/433 to include xarray-leaflet. I don't think bumping jupyter_server is required, but would be good to get panel in there.

consideRatio commented 1 year ago

@yuvipanda amazing work and writeup!!

I checked checkboxes in the top post that was completed and communicated back to Julius as the LEAP hub now uses the latest docker image which includes xarray-leaflet etc.

image

damianavila commented 1 year ago

Figure out if we can get https://jcristharif.com/conda-docker-tips.html amended

Is this actually needed to close this one?