jupyter / docker-stacks

Ready-to-run Docker images containing Jupyter applications
https://jupyter-docker-stacks.readthedocs.io

Install `jupyterhub-base` and `nodejs` packages instead of `jupyterhub` package #2171

Closed consideRatio closed 2 weeks ago

consideRatio commented 3 weeks ago

Describe your changes

If you run pip install jupyterhub, you get only the Python package jupyterhub. If you run mamba install jupyterhub, however, you also get nodejs and configurable-http-proxy. Those are required to run a jupyterhub server, but they aren't required to start a jupyter server and connect it to a jupyterhub server via the jupyterhub-singleuser command.

This PR makes us install jupyterhub-base instead, which is more like pip install jupyterhub because it doesn't bundle the other things needed to start jupyterhub as a server. To avoid a breaking change, nodejs is now installed explicitly, since people may depend on it; whether it should be removed can be decided independently of this change.

Related issue

nodejs was found to be downgraded from a modern version to the very old node 12 in the issue https://github.com/jupyter/docker-stacks/issues/2170. It seems that it's only a dependency for jupyterhub (and via its dependency on configurable-http-proxy), so by switching to installing jupyterhub-base, nodejs may not get installed any more.

I think if nodejs is wanted, then we should install it directly and not rely on getting it via the jupyterhub conda-forge package.


manics commented 3 weeks ago

Removing nodejs will be a breaking change for some users since it's needed for building custom JupyterLab extensions. For example, https://github.com/jupyterhub/jupyter-remote-desktop-proxy/blob/4f68d135bde2bc06dd59762d4c70565c031ac34d/Dockerfile#L60 assumes nodejs is present.

consideRatio commented 3 weeks ago

I figure this change should be isolated then, with the removal of nodejs done separately.

mathbunnyru commented 3 weeks ago

LGTM

You need to add jupyterhub-base to EXCLUDED_PACKAGES here to pass the tests: https://github.com/jupyter/docker-stacks/blob/main/tests/docker-stacks-foundation/test_packages.py#L70

consideRatio commented 3 weeks ago

It seems that we get a modern nodejs installed now at least, and that configurable-http-proxy isn't added to this image any more.

nodejs                    22.9.0               h8374285_0    conda-forge

mathbunnyru commented 3 weeks ago

Could you please also update https://github.com/jupyter/docker-stacks/blob/main/docs/using/selecting.md#jupyterbase-notebook?

mathbunnyru commented 2 weeks ago

I will merge this - I think having the wrong nodejs version might affect some people and cause unexpected results. Will update the selecting.md after merge.

consideRatio commented 2 weeks ago

Thank you @mathbunnyru for completing this PR and for the amazing work you do in this project!!

ykazakov commented 2 weeks ago

Do I understand it correctly that the new docker image now cannot be used to start jupyterhub out of the box?

Previously I could do:

% docker run -it --rm quay.io/jupyter/minimal-notebook:2024-11-04 jupyterhub --JupyterHub.authenticator_class='dummy' --JupyterHub.spawner_class='simple'
Entered start.sh with args: jupyterhub --JupyterHub.authenticator_class=dummy --JupyterHub.spawner_class=simple
Running hooks in: /usr/local/bin/start-notebook.d as uid: 1000 gid: 100
Done running hooks in: /usr/local/bin/start-notebook.d
Running hooks in: /usr/local/bin/before-notebook.d as uid: 1000 gid: 100
Sourcing shell script: /usr/local/bin/before-notebook.d/10activate-conda-env.sh
Done running hooks in: /usr/local/bin/before-notebook.d
Executing the command: jupyterhub --JupyterHub.authenticator_class=dummy --JupyterHub.spawner_class=simple
[I 2024-11-07 10:02:37.736 JupyterHub app:3346] Running JupyterHub version 5.2.1
[I 2024-11-07 10:02:37.736 JupyterHub app:3376] Using Authenticator: jupyterhub.auth.DummyAuthenticator-5.2.1
[I 2024-11-07 10:02:37.736 JupyterHub app:3376] Using Spawner: jupyterhub.spawner.SimpleLocalProcessSpawner-5.2.1
[I 2024-11-07 10:02:37.736 JupyterHub app:3376] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-5.2.1
[I 2024-11-07 10:02:37.742 JupyterHub app:1876] Writing cookie_secret to /home/jovyan/jupyterhub_cookie_secret
[I 2024-11-07 10:02:37.765 alembic.runtime.migration migration:215] Context impl SQLiteImpl.
[I 2024-11-07 10:02:37.765 alembic.runtime.migration migration:218] Will assume non-transactional DDL.
[I 2024-11-07 10:02:37.768 alembic.runtime.migration migration:623] Running stamp_revision  -> 4621fec11365
[I 2024-11-07 10:02:37.871 JupyterHub proxy:556] Generating new CONFIGPROXY_AUTH_TOKEN
[W 2024-11-07 10:02:37.878 JupyterHub auth:1508] Using testing authenticator DummyAuthenticator! This is not meant for production!
[I 2024-11-07 10:02:37.896 JupyterHub app:3416] Initialized 0 spawners in 0.005 seconds
[I 2024-11-07 10:02:37.899 JupyterHub metrics:373] Found 0 active users in the last ActiveUserPeriods.twenty_four_hours
[I 2024-11-07 10:02:37.899 JupyterHub metrics:373] Found 0 active users in the last ActiveUserPeriods.seven_days
[I 2024-11-07 10:02:37.899 JupyterHub metrics:373] Found 0 active users in the last ActiveUserPeriods.thirty_days
[W 2024-11-07 10:02:37.900 JupyterHub proxy:748] Running JupyterHub without SSL.  I hope there is SSL termination happening somewhere else...
[I 2024-11-07 10:02:37.900 JupyterHub proxy:752] Starting proxy @ http://:8000
10:02:38.197 [ConfigProxy] info: Proxying http://*:8000 to (no default)
10:02:38.199 [ConfigProxy] info: Proxy API at http://127.0.0.1:8001/api/routes
[I 2024-11-07 10:02:38.391 JupyterHub app:3739] Hub API listening on http://127.0.0.1:8081/hub/
10:02:38.391 [ConfigProxy] info: 200 GET /api/routes 
10:02:38.392 [ConfigProxy] info: 200 GET /api/routes 
[I 2024-11-07 10:02:38.392 JupyterHub proxy:477] Adding route for Hub: / => http://127.0.0.1:8081
10:02:38.394 [ConfigProxy] info: Adding route / -> http://127.0.0.1:8081
10:02:38.394 [ConfigProxy] info: Route added / -> http://127.0.0.1:8081
10:02:38.394 [ConfigProxy] info: 201 POST /api/routes/ 
[I 2024-11-07 10:02:38.394 JupyterHub app:3770] JupyterHub is now running at http://:8000

With the new latest image I cannot start jupyterhub any longer:

% docker run -it --rm quay.io/jupyter/minimal-notebook:2024-11-06 jupyterhub --JupyterHub.authenticator_class='dummy' --JupyterHub.spawner_class='simple'
Entered start.sh with args: jupyterhub --JupyterHub.authenticator_class=dummy --JupyterHub.spawner_class=simple
Running hooks in: /usr/local/bin/start-notebook.d as uid: 1000 gid: 100
Done running hooks in: /usr/local/bin/start-notebook.d
Running hooks in: /usr/local/bin/before-notebook.d as uid: 1000 gid: 100
Sourcing shell script: /usr/local/bin/before-notebook.d/10activate-conda-env.sh
Done running hooks in: /usr/local/bin/before-notebook.d
Executing the command: jupyterhub --JupyterHub.authenticator_class=dummy --JupyterHub.spawner_class=simple
[I 2024-11-07 10:03:52.881 JupyterHub app:3346] Running JupyterHub version 5.2.1
[I 2024-11-07 10:03:52.881 JupyterHub app:3376] Using Authenticator: jupyterhub.auth.DummyAuthenticator-5.2.1
[I 2024-11-07 10:03:52.881 JupyterHub app:3376] Using Spawner: jupyterhub.spawner.SimpleLocalProcessSpawner-5.2.1
[I 2024-11-07 10:03:52.881 JupyterHub app:3376] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-5.2.1
[I 2024-11-07 10:03:52.883 JupyterHub app:1876] Writing cookie_secret to /home/jovyan/jupyterhub_cookie_secret
[I 2024-11-07 10:03:52.901 alembic.runtime.migration migration:207] Context impl SQLiteImpl.
[I 2024-11-07 10:03:52.902 alembic.runtime.migration migration:210] Will assume non-transactional DDL.
[I 2024-11-07 10:03:52.908 alembic.runtime.migration migration:618] Running stamp_revision  -> 4621fec11365
[I 2024-11-07 10:03:53.044 JupyterHub proxy:556] Generating new CONFIGPROXY_AUTH_TOKEN
[W 2024-11-07 10:03:53.052 JupyterHub auth:1508] Using testing authenticator DummyAuthenticator! This is not meant for production!
[I 2024-11-07 10:03:53.068 JupyterHub app:3416] Initialized 0 spawners in 0.005 seconds
[I 2024-11-07 10:03:53.071 JupyterHub metrics:373] Found 0 active users in the last ActiveUserPeriods.twenty_four_hours
[I 2024-11-07 10:03:53.072 JupyterHub metrics:373] Found 0 active users in the last ActiveUserPeriods.seven_days
[I 2024-11-07 10:03:53.072 JupyterHub metrics:373] Found 0 active users in the last ActiveUserPeriods.thirty_days
[W 2024-11-07 10:03:53.072 JupyterHub proxy:748] Running JupyterHub without SSL.  I hope there is SSL termination happening somewhere else...
[I 2024-11-07 10:03:53.072 JupyterHub proxy:752] Starting proxy @ http://:8000
[E 2024-11-07 10:03:53.073 JupyterHub proxy:760] Failed to find proxy ['configurable-http-proxy']
    The proxy can be installed with `npm install -g configurable-http-proxy`.To install `npm`, install nodejs which includes `npm`.If you see an `EACCES` error or permissions error, refer to the `npm` documentation on How To Prevent Permissions Errors.
[C 2024-11-07 10:03:53.073 JupyterHub app:3700] Failed to start proxy
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.12/site-packages/jupyterhub/app.py", line 3698, in start
        await self.proxy.start()
      File "/opt/conda/lib/python3.12/site-packages/jupyterhub/proxy.py", line 756, in start
        self.proxy_process = Popen(
                             ^^^^^^
      File "/opt/conda/lib/python3.12/subprocess.py", line 1026, in __init__
        self._execute_child(args, executable, preexec_fn, close_fds,
      File "/opt/conda/lib/python3.12/subprocess.py", line 1955, in _execute_child
        raise child_exception_type(errno_num, err_msg, err_filename)
    FileNotFoundError: [Errno 2] No such file or directory: 'configurable-http-proxy'

What exactly is the purpose of jupyterhub-base?

ykazakov commented 2 weeks ago

OK, it looks like jupyterhub-base is a common package for jupyterhub (for starting the server) and jupyterhub-singleuser (for starting the user servers).

% conda repoquery depends jupyterhub -c conda-forge 
Collecting package metadata: done
 Name                     Version Build                Channel     Subdir   
─────────────────────────────────────────────────────────────────────────────
 jupyterhub               5.2.1   pyh31011fe_0         conda-forge noarch   
 psutil                   5.7.2   py38h51573d8_1       conda-forge osx-arm64
 __unix >>> NOT FOUND <<<                              localhost            
 pycurl                   7.45.1  py310h70149c3_0      conda-forge osx-arm64
 python                   3.9.20  h9e33284_1_cpython   conda-forge osx-arm64
 nodejs                   22.9.0  h08fde81_0           conda-forge osx-arm64
 configurable-http-proxy  4.6.2   hb67532b_1           conda-forge osx-arm64
 jupyterhub-base          5.2.1   pyh31011fe_0         conda-forge noarch  
% conda repoquery depends jupyterhub-singleuser -c conda-forge
Collecting package metadata: done
 Name                     Version Build                Channel     Subdir   
─────────────────────────────────────────────────────────────────────────────
 jupyterhub-singleuser    5.2.1   pyh31011fe_0         conda-forge noarch   
 __unix >>> NOT FOUND <<<                              localhost            
 jupyterhub-base          5.2.1   pyh31011fe_0         conda-forge noarch   
 jupyterlab               4.3.0   pyhd8ed1ab_0         conda-forge noarch   
 python                   3.9.20  h9e33284_1_cpython   conda-forge osx-arm64

However, the package jupyterhub-singleuser does not seem to be installed in the image?

% docker run -it --rm quay.io/jupyter/minimal-notebook:2024-11-06 conda list | grep jupyter
jupyter-lsp               2.2.5              pyhd8ed1ab_0    conda-forge
jupyter_client            8.6.3              pyhd8ed1ab_0    conda-forge
jupyter_core              5.7.2              pyh31011fe_1    conda-forge
jupyter_events            0.10.0             pyhd8ed1ab_0    conda-forge
jupyter_server            2.14.2             pyhd8ed1ab_0    conda-forge
jupyter_server_terminals  0.5.3              pyhd8ed1ab_0    conda-forge
jupyterhub-base           5.2.1              pyh31011fe_0    conda-forge
jupyterlab                4.2.5              pyhd8ed1ab_0    conda-forge
jupyterlab_pygments       0.3.0              pyhd8ed1ab_1    conda-forge
jupyterlab_server         2.27.3             pyhd8ed1ab_0    conda-forge

So the image cannot be used (out of the box) for starting jupyterhub user servers either?

ykazakov commented 2 weeks ago

Interestingly, the command jupyterhub-singleuser is available in the image. Does it come from jupyterhub-base?
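
One way to answer this kind of question is to search the package records that conda keeps for each installed package. The following is a minimal sketch, not an official conda interface: the function name conda_records_listing is hypothetical, and it assumes that each package has a JSON record under the prefix's conda-meta directory in which shipped file paths appear as quoted strings.

```shell
# Hypothetical helper: print the conda package records that mention a file.
# Assumption: every installed package has a JSON record in
# <prefix>/conda-meta/ and file paths are stored there as quoted strings.
conda_records_listing() {
  # $1: path relative to the prefix, e.g. bin/jupyterhub-singleuser
  # $2: conda prefix; defaults to /opt/conda, the prefix seen in this thread
  grep -l -F "\"$1\"" "${2:-/opt/conda}"/conda-meta/*.json
}
```

Inside a container, conda_records_listing bin/jupyterhub-singleuser would print the record file names that contain that path, and the package name can be read off the file name.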

ykazakov commented 2 weeks ago

Looks like it is indeed installed by jupyterhub-base

% docker run -it --rm quay.io/jupyter/minimal-notebook:2024-11-06 grep jupyterhub-singleuser /opt/conda/conda-meta/jupyterhub-base-5.2.1-pyh31011fe_0.json
...
"bin/jupyterhub-singleuser"
        "_path": "bin/jupyterhub-singleuser",

consideRatio commented 2 weeks ago

Yes, the jupyterhub conda-forge package included configurable-http-proxy and nodejs as dependencies. If you want to use these images to start jupyterhub as a server as well, you can install them on top again.
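
For anyone who does launch jupyterhub from a derived image, a small pre-flight check can turn the traceback shown earlier into a clearer message. This is only a sketch: have_cmd and check_hub_proxy are hypothetical names, and the check assumes the default proxy executable named in the error log, configurable-http-proxy.

```shell
# Hypothetical pre-flight check to run before starting `jupyterhub`.
# Succeeds when the named executable can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

check_hub_proxy() {
  # $1: proxy executable; defaults to the one named in the error log above
  if have_cmd "${1:-configurable-http-proxy}"; then
    echo "proxy found"
  else
    # Report on stderr and fail so a wrapper script can stop early.
    echo "proxy missing: install configurable-http-proxy (and nodejs) on top of the image" >&2
    return 1
  fi
}
```

How the missing packages get installed is left to the derived image. The error log suggests npm install -g configurable-http-proxy, and the conda-forge packages listed in this thread are presumably another option.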

ykazakov commented 2 weeks ago

Sure, I can install the full jupyterhub myself. I am just trying to understand the best strategy for using the docker images to run jupyterhub. I like the idea of removing unnecessary packages, and for the same reason I probably do not need most of the stuff in jupyter/minimal-notebook (e.g., jupyterlab) to run the jupyterhub server. However, I need to ensure that the server and the user notebooks use the same version of jupyterhub, and for this reason I am a bit reluctant to switch to other images like jupyterhub/jupyterhub. If the jupyter docker images are supposed to be used with jupyterhub (which is the reason for installing jupyterhub-base), I think it would be great if there were a separate docker image, e.g., jupyter/jupyterhub, to run (just) the jupyterhub server.

consideRatio commented 2 weeks ago

If the jupyterhub versions on the hub and in the user image are within one major version of each other, you should be fine overall, I think. Keeping things aligned on the minor version is, if anything, more alignment than needed.
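
The rule of thumb in the comment above can be sketched as a small shell check. It is an illustration of that rule only, not a documented compatibility guarantee; hub_major_skew_ok is a hypothetical helper name, and it assumes plain dotted version strings such as 5.2.1.

```shell
# Hypothetical helper: succeed when two JupyterHub versions are at most
# one major version apart (the rule of thumb from the comment above).
hub_major_skew_ok() {
  # Keep only the text before the first dot, e.g. 5.2.1 -> 5
  hub_major=${1%%.*}
  user_major=${2%%.*}
  skew=$((hub_major - user_major))
  # Take the absolute value of the difference
  [ "$skew" -lt 0 ] && skew=$((0 - skew))
  [ "$skew" -le 1 ]
}
```

The first argument would be the version running on the hub and the second the version of jupyterhub-base in the user image, for example hub_major_skew_ok 5.2.1 4.1.6.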