Avoid restarting user notebooks when restarting jupyterhub

jupyterhub / batchspawner

Custom Spawner for Jupyterhub to start servers in batch scheduled systems

BSD 3-Clause "New" or "Revised" License

Avoid restarting user notebooks when restarting jupyterhub #134

Closed guillaumeeb closed 5 years ago

guillaumeeb commented 5 years ago

Hi all,

I'm not sure this is a batchspawner issue, as I don't fully understand all of the Jupyterhub machinery.

I'm currently administering a Jupyterhub instance using batchspawner. Sometimes I make small changes to the configuration (for example, adding a configuration in profilespawner) and restart the hub to apply them. Unfortunately, this stops all running user notebooks. On another test instance, which is probably a little buggy, the hub restart (or probably the stopping part) partially fails, and user notebooks are not shut down. I find this great: it allows me to make changes without impacting users currently using the hub.

So the question is: is there a way to restart the hub without stopping all running notebooks? The hub should be able to find the running notebooks again through its state, shouldn't it?

rkdarst commented 5 years ago

Check out the variable c.JupyterHub.cleanup_servers = False, it does exactly what you want. The main condition is that the database is persistent.
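A minimal sketch of how this could look in jupyterhub_config.py, assuming the default SQLite backend. The database path is a hypothetical placeholder; any location that survives a hub restart should do.

```python
# jupyterhub_config.py (sketch). JupyterHub injects get_config() when it
# loads this file, so the fragment is not meant to be run on its own.
c = get_config()  # noqa: F821

# Leave single-user servers running when the hub stops or restarts.
c.JupyterHub.cleanup_servers = False

# The hub finds the running servers again through its database, so the
# database has to survive the restart. Hypothetical path, adjust as needed.
c.JupyterHub.db_url = "sqlite:////srv/jupyterhub/jupyterhub.sqlite"
```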

It would also be useful to start the proxy separately from the hub, so that user servers stay connected even if the hub is down. There was a little bit of documentation on this, but I think it could use some more. Check these config options, but this is probably something where you want to really understand the role of proxy and hub and how they communicate to be able to debug problems:

c.ConfigurableHTTPProxy.should_start = False
c.ConfigurableHTTPProxy.auth_token = ...
# ... plus possibly configuring hub and proxy URLs.

... and the benefits of this are not that large for starters, so I wouldn't do it right away. It's late here and I can't find any docs at all right now...
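For later reference, a rough sketch of the hub side of an externally managed proxy. It assumes configurable-http-proxy is started separately (for example by a service manager) with the same token exported as CONFIGPROXY_AUTH_TOKEN, and that its API listens on the address below; both are assumptions about the deployment, not something this setup guarantees.

```python
# jupyterhub_config.py (sketch). JupyterHub injects get_config() when it
# loads this file, so the fragment is not meant to be run on its own.
import os

c = get_config()  # noqa: F821

# Do not launch the proxy from the hub; it is assumed to be running already.
c.ConfigurableHTTPProxy.should_start = False

# Hub and proxy must share the same token. Reading it from the environment
# keeps the secret out of the config file.
c.ConfigurableHTTPProxy.auth_token = os.environ["CONFIGPROXY_AUTH_TOKEN"]

# Where the hub reaches the proxy's REST API (assumed address).
c.ConfigurableHTTPProxy.api_url = "http://127.0.0.1:8001"
```

With this split, restarting the hub leaves the proxy routes in place, so open notebook connections are not interrupted.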

guillaumeeb commented 5 years ago

Thanks @rkdarst, it seems indeed that this is what I'm after! I will dig into this. I was under the impression that the default config with sqlite was persisting the database on file system (or was it something I configured?).

For the proxy, it is not a big problem if users lose their connection, as long as they can reconnect to their notebook and continue their work.

rkdarst commented 5 years ago

> Thanks @rkdarst, it seems indeed that this is what I'm after! I will dig into this. I was under the impression that the default config with sqlite was persisting the database on file system (or was it something I configured?).

It does, I think, preserve the database by default. But it also defaults to stopping servers, because that is "safer": there are no dangling processes unless the user says they know the servers will be used/accessible/cleaned up later.

guillaumeeb commented 5 years ago

> unless the user says they know they will be used/accessible/cleaned up later.

Using batchspawner, we know that the job scheduler will ultimately kill the job, so I think this is OK.
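As an illustration of that upper bound, a sketch of how the job's time limit could be set through batchspawner. The 8-hour value is an arbitrary example, not the value used on this instance.

```python
# jupyterhub_config.py (sketch). JupyterHub injects get_config() when it
# loads this file, so the fragment is not meant to be run on its own.
c = get_config()  # noqa: F821

# Keep notebooks alive across hub restarts...
c.JupyterHub.cleanup_servers = False

# ...and rely on the scheduler's time limit to end forgotten jobs.
# Example value only.
c.BatchSpawnerBase.req_runtime = "8:00:00"
```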

The configuration worked like a charm, thanks @rkdarst.