Closed betatim closed 6 years ago
This has been deployed.
From a quick test I can log in as openhumans and get a token. @gedankenstuecke can you test drive it a bit as well? Let's merge this PR once you have done some checking and also think things work.
I had a look around the docs and comments for references to having to build your own custom hub image but couldn't find any. If there are any left we should remove them, else we will be confused.com in the future.
The band aid of "restart if N launches failed" works well if you have 50 launches per minute but not so well if there are ~0 launches per 24h. So this will be a less good fix for this kind of hub than it is for mybinder.org. #caveats
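To put rough numbers on this caveat, here is a back-of-the-envelope sketch. It assumes that every launch fails once the reflector is broken and that launches arrive at a steady rate; the function name and the example rates are illustrative, not taken from the hub code.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def seconds_until_restart(failure_limit, launches_per_day):
    """Estimate how long a broken hub waits before the restart triggers.

    Assumption: every launch fails after the reflector breaks, so the
    restart happens once `failure_limit` launches have been attempted.
    """
    if launches_per_day <= 0:
        # No launches means the failure counter never moves.
        return math.inf
    return failure_limit * SECONDS_PER_DAY / launches_per_day


# A busy hub: 50 launches per minute.
busy = seconds_until_restart(3, 50 * 60 * 24)

# A quiet hub: one launch per day (illustrative value).
quiet = seconds_until_restart(3, 1)

# A hub with no launches at all.
idle = seconds_until_restart(3, 0)

print(f"busy: {busy:.1f} s, quiet: {quiet / SECONDS_PER_DAY:.1f} days, idle: {idle}")
```

Under these assumptions the busy hub recovers within seconds, while the quiet hub stays broken for days.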
I’ll keep an eye open for whether it works. And as for the restarts: it does count for a single user too, right? E.g. if my image fails to start 3 times in a row it will lead to the cleanup?
Yes. Three consecutive failed launches will trigger the hub to reboot. We could even lower it to two? It doesn't matter who does the launches.
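A minimal sketch of the counting logic described here, assuming a single hub-wide counter that resets on any successful launch. The class and method names are hypothetical and this is not the actual JupyterHub implementation.

```python
class ConsecutiveFailureTracker:
    """Hub-wide counter of consecutive failed launches (hypothetical helper)."""

    def __init__(self, limit=3):
        self.limit = limit
        self.consecutive_failures = 0

    def record_launch(self, succeeded):
        """Record one launch and return True if the hub should restart.

        The user who launched is deliberately not an argument: failures
        from any user count towards the same limit.
        """
        if succeeded:
            # Any success resets the streak.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.limit


tracker = ConsecutiveFailureTracker(limit=3)

# Two failures, one success (resets the streak), then three failures in a row.
outcomes = [False, False, True, False, False, False]
decisions = [tracker.record_launch(ok) for ok in outcomes]
print(decisions)
```

Only the last launch in this sequence reaches the limit, because the success in the middle resets the streak.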
Ok, for now things seem to work and I'll merge it, but I'll monitor the log files and keep trying to break it ;-)
In the meantime we had the first hub restart on mybinder.org. No one noticed :) So I guess that means it works.
yay!
This brings in the hub that will restart itself when there have been too many consecutive launch failures. This is a band aid for the problem with the event reflector failing. The event reflector is what the hub uses to be notified about events happening in the Kubernetes world. This is how the hub notices that a user's pod has launched and that it should send the user there. When the reflector fails the hub never notices, and as a result the user never gets redirected even though the pod is up.
The fix is to restart the hub. This is now automated.
We can retire the custom hub image as all the custom things we had are now part of the JupyterHub release.