Open rkevin-arch opened 4 years ago
One immediate question is where's the best place for this functionality? Should it be part of core BinderHub, or is it better placed as an add-on service that uses the BinderHub/JupyterHub APIs?
I think it would have to be in BinderHub because it needs to know that there is a pod "on standby" and redirect a user there instead of launching a new pod. Could we do that as a jupyterhub service?
I think we should try and implement this feature and #812 "together". I'd imagine this feature as starting with an option in the configuration that lets you specify a "repository reference" (github repo plus SHA1 or Zenodo reference or ...) and a number of "standby pods" to have (how much spare should there be?).
I'd implement this as a new "service" in BinderHub that queries the k8s API to find out how many active pods there are for the specified repo. Maybe using a taint or annotation on the pods to mark them as "not used by a human right now", so we can tell which pods are actively being used and which ones are "on standby". That way we should be able to have BinderHub restart "from scratch" and recover its state from the k8s API, and maybe even solve the challenge of having several BinderHub pods running at the same time (for better availability).
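To make the "recover state from the k8s API" idea concrete, here is a minimal sketch of the counting step. It assumes the pod metadata has already been fetched and is passed in as plain dicts. The annotation keys `binderhub.example/standby` and `binderhub.example/repo-ref` are hypothetical names made up for illustration, not anything BinderHub defines today.

```python
from collections import Counter

# Hypothetical annotation keys, invented for this sketch.
STANDBY_ANNOTATION = "binderhub.example/standby"
REPO_REF_ANNOTATION = "binderhub.example/repo-ref"


def count_pods(pods, repo_ref):
    """Count standby and in-use pods for one repository reference.

    `pods` is a list of dicts shaped like the `metadata` section of a pod.
    """
    counts = Counter(standby=0, in_use=0)
    for pod in pods:
        annotations = pod.get("annotations", {})
        if annotations.get(REPO_REF_ANNOTATION) != repo_ref:
            continue  # pod belongs to a different repository
        if annotations.get(STANDBY_ANNOTATION) == "true":
            counts["standby"] += 1
        else:
            counts["in_use"] += 1
    return counts


def pods_to_spawn(pods, repo_ref, target_standby):
    """Return how many new standby pods are needed to reach the target."""
    return max(0, target_standby - count_pods(pods, repo_ref)["standby"])
```

Because the counts are derived only from what is stored on the pods, a restarted BinderHub (or a second replica) would arrive at the same answer without keeping any state of its own.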
TL;DR: this would be super cool to have!
@betatim Makes sense! BinderHub is currently stateless so recording the state as pod annotations solves the problem. The alternative might be to store that information as JupyterHub authstate? I'm not sure what's easier.
There is also https://github.com/jupyterhub/mybinder.org-deploy/issues/1038 which might already reduce startup time enough? But it would never be as fast as having a pod on standby.
We also have a thread about having a "scratch pad" image for people who just want any environment that launches fast. We could link that here, as having that launch fast would be cool. If we can locate that thread/post that would be great. edit: this is that thread https://github.com/jupyterhub/mybinder.org-deploy/issues/1474
We are already prepulling images on all of our nodes (pretty much manually). Otherwise, the startup time would be 10~15 minutes (we have very big images). Still, it takes a bit of time to spawn the pod.
This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:
https://discourse.jupyter.org/t/embed-binder-related-metadata-in-notebook/10329/6
This is sort of related to #812.
Proposed change
Add the ability to spawn a pod (or multiple) of a specified repository on standby. Once a user requests a pod with that repo, just give them a standby pod and spawn a new one to put on standby.
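The hand-out-and-replace behaviour could be sketched roughly like this. It is an in-memory illustration only: `StandbyPool` and the `spawn` callable are hypothetical names, and a real implementation would spawn pods asynchronously through JupyterHub rather than blocking inside `claim`.

```python
from collections import deque


class StandbyPool:
    """Keep `size` pre-spawned pods ready and refill after each claim."""

    def __init__(self, spawn, size):
        self._spawn = spawn  # callable that starts a pod and returns a handle
        self._size = size
        self._ready = deque(spawn() for _ in range(size))

    def claim(self):
        """Hand out a standby pod, then top the pool back up."""
        # Fall back to a fresh spawn if the pool happens to be empty.
        pod = self._ready.popleft() if self._ready else self._spawn()
        while len(self._ready) < self._size:
            self._ready.append(self._spawn())
        return pod

    def ready_count(self):
        return len(self._ready)
```

In this sketch the user gets the oldest standby pod, and its replacement is spawned right after the handoff, so the pool returns to its configured size.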
Alternative options
Not doing the above. Spawning a pod, even with an already built and pulled image, takes around 10~20 seconds.
Who would use this feature?
We (LibreTexts) run our own binder for our textbooks, which have a bunch of Thebe cells. Right now, our users using those Thebe cells need to wait 10~20 seconds before the code can actually run. Since we know which image we're using for our purposes, having pre-spawned pods on standby would allow the user to start running code almost instantaneously.
(Optional): Suggest a solution
I can put in some time and implement this if people think it's a feature BinderHub should have. It sounds straightforward enough. So far, the only concern I have is how to disable the culler for the pre-spawned pods (or maybe not disabling it, but spawning new pods whenever standby ones get culled), but I can look into that. Just let me know.