jupyterhub / jupyterhub-idle-culler

JupyterHub service to cull idle servers and users

Cull doesn't respect timeout, terminates single-user web sessions #47

Closed ospiegel91 closed 2 years ago

ospiegel91 commented 2 years ago

Bug description

With a long cull timeout of 3 days, JupyterHub single-user pods get terminated after a short period of time.

Expected behaviour

Given the long cull timeout, I would expect the single-user pod to remain alive for at least the specified timeout duration.

Actual behaviour

The JupyterLab single-user pod is terminated unless the user is actively engaging with it in the browser tab.

How to reproduce

Use these cull settings:

```yaml
cull:
  enabled: true
  users: false # --cull-users
  removeNamedServers: false # --remove-named-servers
  timeout: 259200 # --timeout 259200s = 3 days
  every: 600 # --cull-every
  concurrency: 10 # --concurrency
  maxAge: 259200 # --max-age
```

Using the Helm chart https://jupyterhub.github.io/helm-chart, version 1.2.0.
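For reference, the chart passes these values to the jupyterhub-idle-culler service as command-line flags (the flag names are the ones noted in the comments above). A rough sketch of the resulting invocation, not copied from the chart, would look like:

```sh
# Sketch of the idle-culler service command implied by the values above.
# users and removeNamedServers are false, so --cull-users and
# --remove-named-servers are omitted.
python3 -m jupyterhub_idle_culler \
  --timeout=259200 \
  --cull-every=600 \
  --concurrency=10 \
  --max-age=259200
```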

Your personal set up

```yaml
cull:
  enabled: true
  users: false # --cull-users
  removeNamedServers: false # --remove-named-servers
  timeout: 259200 # --timeout 259200s = 3 days
  every: 600 # --cull-every
  concurrency: 10 # --concurrency
  maxAge: 259200 # --max-age
```

Using the Helm chart https://jupyterhub.github.io/helm-chart, version 1.2.0, on an EKS cluster.

welcome[bot] commented 2 years ago

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively. You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

consideRatio commented 2 years ago

Logs of JupyterHub showing that the pods were terminated by the culler are essential.

kubectl logs deploy/hub
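For example, you could filter the hub logs for culler activity around the time a pod disappeared. A minimal sketch, assuming the chart is installed in a namespace called jhub (adjust to your install):

```sh
# Hub logs from the last day, filtered for culler messages and the affected user
kubectl logs deploy/hub -n jhub --since=24h | grep -iE "cull|ospiegel91"
```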

ospiegel91 commented 2 years ago

@consideRatio thank you for the prompt response.

The cull logs continuously show activity from the ospiegel notebook until:

[W 2022-06-01 20:11:48.877 JupyterHub app:2151] User ospiegel91 server stopped with exit code: 1

To replicate:

  1. spawn a notebook
  2. open a notebook terminal
  3. run a simple while loop to keep a process active: while sleep 60; do echo "60 seconds passed"; done
  4. close the web browser tab immediately after

The notebook lives past the tab being closed, but exits at a seemingly random time afterwards.

consideRatio commented 2 years ago

It seems your server exits, without involvement from the culler.

You need to inspect its logs to find out why the exit code became 1. If the culler had stopped the server, you would see a notice about culling rather than this warning.

Please refer to discourse.jupyter.org for further help at this point.

ospiegel91 commented 2 years ago

@consideRatio thanks again. Would I see the logs showing what is causing exit code 1 at the single-user pod level?

consideRatio commented 2 years ago

I expect you to find logs for the user pod started by KubeSpawner under a pod named jupyter-<username>. So, kubectl logs jupyter-<username>, where you can also add --previous if the container has restarted and you want to see the logs of the container from before the restart.

I hope this helps you track down what's going on, good luck!
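A sketch of those commands, assuming a user named ospiegel91 and a namespace jhub (both are placeholders, adjust to your setup):

```sh
# Logs of the currently running single-user container
kubectl logs jupyter-ospiegel91 -n jhub

# Logs of the previous container instance, if the container restarted
kubectl logs jupyter-ospiegel91 -n jhub --previous

# Exit code, termination reason, and recent events for the pod
kubectl describe pod jupyter-ospiegel91 -n jhub
```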