jupyterhub / jupyterhub

Multi-user server for Jupyter notebooks
https://jupyterhub.readthedocs.io

Not passing token to NotebookApp spoils notebook discovery. #3605

Open marcinwrochna opened 3 years ago

marcinwrochna commented 3 years ago

Bug description

In short: jupyterhub.SingleUserApp does not pass the API token to notebook.NotebookApp, and because of that the token does not appear when listing running notebooks.

In detail: Code running inside a notebook or kernel often needs to know the notebook's path; see for example this issue. In my case I just want WandB to work. It uses essentially the same method as described in that closed issue: it calls api/sessions on the NotebookApp server instance, using the token listed by notebook.notebookapp.list_running_servers().
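A minimal sketch of the lookup described above, assuming the sessions API returns a JSON list in which each entry carries a `kernel.id` and a `path` (older servers nest the path under `notebook.path`). The helper names `build_sessions_request` and `find_notebook_path` are hypothetical, the payload is hand-written, and no request is actually sent:

```python
import json
import urllib.request


def build_sessions_request(server_url, token):
    """Build (but do not send) a GET request for the server's api/sessions."""
    url = server_url.rstrip("/") + "/api/sessions"
    request = urllib.request.Request(url)
    if token:
        # The notebook server accepts a token in this header form.
        request.add_header("Authorization", "token " + token)
    return request


def find_notebook_path(sessions, kernel_id):
    """Return the path of the session whose kernel matches kernel_id, else None."""
    for session in sessions:
        if session.get("kernel", {}).get("id") == kernel_id:
            # Newer servers expose "path"; older ones nest it under "notebook".
            return session.get("path") or session.get("notebook", {}).get("path")
    return None


# Hand-written payload standing in for a real api/sessions response.
sample_sessions = json.loads(
    '[{"id": "s1", "path": "work/demo.ipynb", "kernel": {"id": "k-123"}}]'
)
# With an empty token (the case reported here) no Authorization header is set.
example_request = build_sessions_request("http://127.0.0.1:8888/", "")
example_path = find_notebook_path(sample_sessions, "k-123")
```

With an empty token the request goes out unauthenticated, which is why the lookup fails under JupyterHub even though the server itself is found.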

The problem is that list_running_servers returns an empty token. Similarly, jupyter notebook list lists the server but without a token. The token is available in the environment variable JUPYTERHUB_API_TOKEN (as well as the deprecated JPY_API_TOKEN), but that's a JupyterHub-specific key.

As far as I understand the technical reason is that JupyterHub replaces authentication in jupyterhub.SingleUserApp by adding a mixin to notebook.NotebookApp. The mixin handles the authentication, so calling api/sessions with JUPYTERHUB_API_TOKEN works, but notebook.NotebookApp.token is set to an empty string (which is what is saved and later retrieved by list_running_servers).
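As a workaround sketch only (not something notebook or JupyterHub does for you), client code could fall back to the JupyterHub variables when the listed token is empty. The function name `resolve_token` is hypothetical, `server_info` is assumed to be shaped like one entry of list_running_servers(), and whether the hub token is actually allowed to call the notebook API depends on the deployment:

```python
import os


def resolve_token(server_info, environ=None):
    """Pick a token for talking to a single-user server.

    Prefers the token recorded for the server; otherwise falls back to the
    JupyterHub environment variables. The fallback is an assumption of this
    sketch, not documented behaviour of notebook or JupyterHub.
    """
    environ = os.environ if environ is None else environ
    token = server_info.get("token")
    if token:
        return token
    # JPY_API_TOKEN is the deprecated name for the same value.
    for name in ("JUPYTERHUB_API_TOKEN", "JPY_API_TOKEN"):
        if environ.get(name):
            return environ[name]
    return None
```

Passing `environ` explicitly keeps the function easy to test; called without it, the function reads the real process environment.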

I believe WandB should not be changed to use that JupyterHub-specific key; rather, jupyterhub.SingleUserApp should be changed to assign its internal api_token to notebook.NotebookApp.token (because we still want to expose the same API in the same way).

Expected behaviour

I believe that when calling the following code from inside a notebook in JH:

from notebook.notebookapp import list_running_servers
list(list_running_servers())

the token should be non-empty. Similarly in jupyter notebook list.

Actual behaviour

I get a list with a single server (with the URL's port pointing directly to the NotebookApp, not the traefik proxy), but under the key "token" I get an empty string.

How to reproduce

I use basically the default the-littlest-jupyterhub configuration on Ubuntu 20.04 with Python 3.8. Probably any JupyterHub configuration that spawns NotebookApp instances via SingleUserApp will have the same issue.

$ cat /opt/tljh/user/etc/jupyter/jupyter_config.json
{
  "JupyterApp": {
    "kernel_spec_manager_class": "nb_conda_kernels.CondaKernelSpecManager"
  }
}


welcome[bot] commented 3 years ago

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively. You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

minrk commented 3 years ago

I would not necessarily expect a jupyterhub-singleuser instance to show up in jupyter notebook list, nor, necessarily, that JUPYTERHUB_API_TOKEN have permission to access the notebook API (in JupyterHub 2.0, it may not have the required access:server scope).

It should not be assumed that token is defined in the jupyter notebook list output. It's not defined when a user has set a password for their server, for instance, because there is no token in that case.
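A defensive check along these lines might look as follows. It is a sketch that assumes each listed entry is a dict with a `token` string and a boolean `password` field; the function name `describe_auth` is hypothetical:

```python
def describe_auth(server_info):
    """Classify how a listed server can be authenticated against.

    Returns "token" when a token was recorded, "password" when the entry
    says a password is set, and "unknown" otherwise (for example when
    authentication is handled outside the notebook server, as with JupyterHub).
    """
    if server_info.get("token"):
        return "token"
    if server_info.get("password"):
        return "password"
    return "unknown"
```

Callers would then only attempt token-based API calls in the "token" case instead of assuming the field is always populated.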

I'll have to think about what's a reasonable level of support for this in jupyterhub, which isn't really supported at the notebook level either, though the given approach does often happen to work.

marcinwrochna commented 3 years ago

Thanks, I now realize it might be more complicated than just copying JUPYTERHUB_API_TOKEN.

It's true that at the notebook level the issue of getting a notebook's path is still open and somewhat hot. It seems to me there's currently just no good way to do this for password-authenticated notebook servers, and they're working on it. I suppose the real questions for JH are:

  1. How should a script with super-user rights access the notebook API when authentication is managed by SingleUserNotebookAppMixin?
  2. How should a Python script running inside a notebook kernel do that (for its own server; I assume it should be allowed to, as long as it's the same Unix user)?

One potential problem that I see is that, as far as I understand, jupyter notebook list assumes there is a single token, set up at server start, which will never expire (so it can write it to a file and later read it).
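To illustrate that concern, here is a simplified sketch of file-based discovery, assuming the server writes a JSON info file named nbserver-<pid>.json into a runtime directory at startup and the listing just reads those files back. The function name `read_server_files` and the sample values are hypothetical, and the real implementation does more (for example, skipping files of servers that are no longer running):

```python
import json
import tempfile
from pathlib import Path


def read_server_files(runtime_dir):
    """Load every nbserver-*.json info file found in runtime_dir."""
    servers = []
    for path in sorted(Path(runtime_dir).glob("nbserver-*.json")):
        with path.open() as handle:
            servers.append(json.load(handle))
    return servers


# Simulate a server writing its info file once, at startup.
with tempfile.TemporaryDirectory() as runtime_dir:
    info = {"url": "http://127.0.0.1:8888/", "token": "", "pid": 4242}
    (Path(runtime_dir) / "nbserver-4242.json").write_text(json.dumps(info))
    # Listing later only sees the snapshot taken at write time.
    listed = read_server_files(runtime_dir)
```

Because the file is a snapshot, a token that is issued later, rotated, or expired would not be reflected in what the listing reports.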

BTW, other issues with essentially the same problem appeared earlier as 1718, 2805, and 3486, all closed without really being solved. Also, here you suggested using JUPYTERHUB_API_TOKEN and jupyter notebook list.