jupyterhub / zero-to-jupyterhub-k8s

Helm Chart & Documentation for deploying JupyterHub on Kubernetes
https://zero-to-jupyterhub.readthedocs.io
Other
1.52k stars 791 forks source link

hub error 'User user1 is slow to start (timeout=10)' despite jupyter pod running #753

Closed nethsix closed 4 years ago

nethsix commented 6 years ago

Platform: google cloud Jupyterhub helm chart: v0.6.0-d359d98

I successfully setup and run zero-to-jupyterhub-k8s. However a day or so, I will no longer be able to login.

Initially I thought this could be resource issue, so I upgraded my k8s nodes to 'n1-highmem-2', which is:

Upon checking further, from the logs I show below, it seems that the hub is complaining about the jupyter pod not spawning quickly enough but the logs on the jupyter pod shows that it is already running.

I thought I could resolve this by setting http_timeout, start_timeout of the hub (see config.yaml), but it has no effect whatsoever.

I deployed using helm as follows:

helm install jupyterhub/jupyterhub --version=v0.6.0-d359d98 --name=fastowl-tf-timeout-jupyter --namespace=fastowl-tf-timeout-space -f config/k8s-config-github-tf-timeout.yml --debug --timeout=99999

Please help! Thank you.

Contents of config.yaml

hub:
  cookieSecret: "cccccccccc"
  extraConfig: |
    c.Spawner.http_timeout = 3000
    c.Spawner.start_timeout = 3000
token:
  proxy: "bbbbbbbbb"
proxy:
  secretToken: "aaaaaaaaa"
singleuser:
  image:
    name: jupyter/tensorflow-notebook
    tag: 8a1b90cbcba5
auth:
  type: github
  github:
    clientId: "xxxxxxxxxx"
    clientSecret: "yyyyyyyyyyyyy"
    callbackUrl: "http://tensorflow.fastowl.xyz/hub/oauth_callback"

Hub logs:

kubectl --namespace xyz log hub-xxxxxxxxxx-dzxhh

[W 2018-07-08 02:25:41.372 JupyterHub base:483] User user1 is slow to start (timeout=10)   [258/1822]
[I 2018-07-08 02:25:41.372 JupyterHub base:751] user1 is pending spawn
[I 2018-07-08 02:25:41.379 JupyterHub log:122] 200 GET /hub/user/user1/ (user1@10.0.9.1) 10059.79ms
[I 2018-07-08 02:25:52.084 JupyterHub log:122] 302 GET /hub/home → /hub/login?next=%2Fhub%2Fhome (@103.
5.140.162) 0.75ms
[I 2018-07-08 02:25:52.092 JupyterHub log:122] 200 GET /hub/login?next=%2Fhub%2Fhome (@103.5.140.162) 1
.33ms
[I 2018-07-08 02:25:58.432 JupyterHub base:722] Pending spawn for user1 didn't finish in 10.0 seconds
[I 2018-07-08 02:25:58.432 JupyterHub base:727] user1 is pending spawn
[I 2018-07-08 02:26:15.024 JupyterHub base:722] Pending spawn for user1 didn't finish in 10.0 seconds
[I 2018-07-08 02:26:15.025 JupyterHub base:727] user1 is pending spawn
[I 2018-07-08 02:26:30.231 JupyterHub base:722] Pending spawn for user1 didn't finish in 10.0 seconds
[I 2018-07-08 02:26:30.231 JupyterHub base:727] user1 is pending spawn
[I 2018-07-08 02:26:45.450 JupyterHub base:722] Pending spawn for user1 didn't finish in 10.0 seconds
[I 2018-07-08 02:26:45.450 JupyterHub base:727] user1 is pending spawn
[I 2018-07-08 02:27:00.656 JupyterHub base:722] Pending spawn for user1 didn't finish in 10.0 seconds
[I 2018-07-08 02:27:00.656 JupyterHub base:727] user1 is pending spawn
[I 2018-07-08 02:27:15.856 JupyterHub base:722] Pending spawn for user1 didn't finish in 10.0 seconds
[I 2018-07-08 02:27:15.856 JupyterHub base:727] user1 is pending spawn
[I 2018-07-08 02:27:31.125 JupyterHub base:722] Pending spawn for user1 didn't finish in 10.0 seconds
[I 2018-07-08 02:27:31.125 JupyterHub base:727] user1 is pending spawn

Jupyter node logs:

[W 2018-07-08 02:38:33.412 SingleUserNotebookApp configurable:168] Config option `open_browser` not recognized by `SingleUserNotebookApp`.  Did you mean `browser`?
[I 2018-07-08 02:38:33.672 SingleUserNotebookApp extension:53] JupyterLab beta preview extension loaded from /opt/conda/lib/python3.6/site-packages/jupyterlab
[I 2018-07-08 02:38:33.672 SingleUserNotebookApp extension:54] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2018-07-08 02:38:33.683 SingleUserNotebookApp singleuser:365] Starting jupyterhub-singleuser server version 0.8.1
[I 2018-07-08 02:38:33.690 SingleUserNotebookApp notebookapp:1585] Serving notebooks from local directory: /home/jovyan
[I 2018-07-08 02:38:33.690 SingleUserNotebookApp notebookapp:1585] 0 active kernels
[I 2018-07-08 02:38:33.690 SingleUserNotebookApp notebookapp:1585] The Jupyter Notebook is running at:
[I 2018-07-08 02:38:33.690 SingleUserNotebookApp notebookapp:1585] http://0.0.0.0:8888/user/user1/
[I 2018-07-08 02:38:33.690 SingleUserNotebookApp notebookapp:1586] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
consideRatio commented 6 years ago

Hey! Im not 100% but I think whats going on is that the image jupyter/tensorflow-notebook is clueless about the need to communikate back to the hub and say "here i am, it worked!"

~Idea 1. We change the command you exexute on the image.~ ~Idea 2. If needed, we install jupyterhub on top of the image and then does idea 1.~

Hmmm... The image is based on https://github.com/jupyter/docker-stacks... Kubespawner should update the command appropriately...

nethsix commented 6 years ago

@consideRatio Yes. The image is from jupyterhub repo. However, you just reminded me of a possibly similar issue that I had with this unofficial pytorch jupyterhub image https://hub.docker.com/r/stepankuzmin/pytorch-notebook/ (https://github.com/stepankuzmin/pytorch-notebook). That image was built based on the official jupyter/scipy-notebook:latest, but I got errors trying to run it in jupyterhub (error details here), eventually I rebuilt it and it worked.

manics commented 4 years ago

I'm closing this as it's quite old and it's a support issue. If you're still having problems please post on the Jupyter Community Forum. Thanks!