wildtreetech / ohjh

ohjh - The OpenHumans JupyterHub deployment
MIT License

[MRG] Update to JupyterHub v0.9.2 #31

Closed betatim closed 6 years ago

betatim commented 6 years ago

This brings in the hub that restarts itself after too many consecutive launch failures. This is a band-aid for the problem of the event reflector failing. The event reflector is what the hub uses to be notified about events happening in the Kubernetes world. This is how the hub notices that a user's pod has launched and that it should send the user there. When the reflector fails, the hub never notices, and as a result the user never gets redirected even though the pod is up.

The fix is to restart the hub. This is now automated.

We can retire the custom hub image as all the custom things we had are now part of the JupyterHub release.

betatim commented 6 years ago

This has been deployed.

From a quick test I can log in as openhumans and get a token. @gedankenstuecke can you test drive it a bit as well? Let's merge this PR once you have done some checking and also think things work.

betatim commented 6 years ago

I had a look around to find references to having to build your own custom hub image in the docs/comments but couldn't find any. If there are any left we should remove them, else we will be confused.com in the future.

betatim commented 6 years ago

The band-aid of "restart if N launches failed" works well if you have 50 launches per minute but not so well if there are ~0 launches per 24h. So this will be a less effective fix for this kind of hub than it is for mybinder.org. #caveats
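To put rough numbers on that caveat, here is a back-of-the-envelope sketch. It assumes that every launch fails once the reflector is broken, that the limit is three consecutive failures, and that a quiet hub sees one launch per day (a stand-in for "~0 launches per 24h"). The function name is made up for illustration.

```python
def minutes_until_restart(failure_limit, launches_per_minute):
    """Minutes needed to accumulate `failure_limit` failed launches,
    assuming every launch fails while the reflector is broken."""
    return failure_limit / launches_per_minute


# Busy hub: 50 launches per minute (the figure quoted above).
busy = minutes_until_restart(3, 50)

# Quiet hub: assumed one launch per day, i.e. 1 / (24 * 60) launches per minute.
quiet = minutes_until_restart(3, 1 / (24 * 60))

print(f"busy hub:  {busy * 60:.1f} seconds")
print(f"quiet hub: {quiet / (24 * 60):.1f} days")
```

Under these assumptions the busy hub recovers within seconds, while the quiet hub could stay broken for days before enough launches have failed.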

gedankenstuecke commented 6 years ago

I’ll keep an eye open for whether it works. And as for the restarts: it does count for a single user too, right? E.g. if my image fails to start 3 times in a row it will lead to the cleanup?

betatim commented 6 years ago

Yes. Three consecutive failed launches will trigger the hub to reboot. We could even lower it to two? It doesn't matter who does the launches.
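As a minimal sketch of that rule (not the actual JupyterHub code; the class and method names are hypothetical), a single counter shared across all users might look like this:

```python
class ConsecutiveFailureTracker:
    """Hypothetical helper: counts failed launches in a row, regardless of user."""

    def __init__(self, limit=3):
        self.limit = limit      # assumed threshold of three, as discussed above
        self.failures = 0       # current run of consecutive failures

    def record(self, success):
        """Record one launch outcome; return True if a restart should be triggered."""
        if success:
            # Any successful launch resets the run.
            self.failures = 0
            return False
        self.failures += 1
        return self.failures >= self.limit


tracker = ConsecutiveFailureTracker(limit=3)
# Two failures, one success (resets the count), then three failures in a row.
outcomes = [False, False, True, False, False, False]
decisions = [tracker.record(ok) for ok in outcomes]
print(decisions)
```

Only the last outcome triggers a restart here, because the successful launch in the middle resets the count. Lowering the limit to two would have triggered it on the second failure already.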

gedankenstuecke commented 6 years ago

Ok, for now things seem to work and I'll merge it, but I'll monitor the log files and keep trying to break it ;-)

betatim commented 6 years ago

In the meantime we had the first hub restart on mybinder.org. No one noticed :) So I guess that means it works.

gedankenstuecke commented 6 years ago

yay!