Closed krinsman closed 6 years ago
Edit:
_orm_spawners
attribute of the (ORM) User
class already supports multiple instances of spawners, each (seemingly) corresponding to a different named notebook server at least based on how the (data encapsulated version of) the attribute is used in the (non-ORM) User
class.Although this conclusion directly contradicts what it says in the Spawner
class here:
There it says that there is a one-to-one correspondence between users and spawners, 1 user, 1 spawner. But the above interpretation of the ORM suggests that each user can have multiple instances of spawners (i.e. more than, and not, just one), with each instance corresponding to a different Jupyter Notebook server.
Thanks for digging in to this!
First, I would clarify that named servers do already work and are fully implemented in JupyterHub 0.8. They just need to be created via the REST API, rather than human-clicky UI controls. This is what is tested in test_named_servers
. What's missing and required for #1935 is UI for this, and likely updates to a lot of default behavior regarding assumptions that the 'default' server should be selected/launched in various circumstances (e.g. automatically triggering spawn of the default server at login).
Some additions to the orm
It's likely there should be no changes required to the ORM, since named servers are already implemented there. Once we have UI, we may seek additional features such as persisting user_option as discussed in #1935, but that can be done later.
Some additions to the JupyterHub application which cause different behavior to occur when self.allow_named_servers is set to True
These will mostly involve 'default' behaviors, such as triggering spawn on login, default URL, etc. Right now, allow_named_servers
only gates the named-server endpoints in the REST API. It is not a dummy attribute, though.
Some additions to the User class which will again cause different behavior when self.allow_named_servers is set to True
It's unlikely that the User class will need significant changes. Most of the changes will probably belong in how the user object is communicated with at the Handler level, in particular cases where there are references to user.spawner
and user.pending
, etc. which assume that the default Spawner's state is relevant. We will need to revisit many of these.
Currently, it seems that all of the notebook server associated with a given user have to be/are expected to be on the same host. Is this correct?
It is not correct. Servers may be anywhere network-accessible from the Hub/proxy. Spawner.start
returns an (ip, port)
tuple describing where the running server can be accessed. It does not need to be local.
Is the idea that each of the named servers would be identified by a given method by the parameter value server_name?
Yes, this is how it works right now. The default is '', but this is settable if servers are launched via the REST API
And is the conclusion correct, that there is currently no connection implemented between the value of allow_named_servers and the possible input values for server_name?
It is not. allow_named_servers=False
prevents any attempts to launch servers with a name other than the default 'nameless' empty string. Only when allow_named_servers=True
can users request servers with non-empty names.
The analogous question might apply for progress_url as well, but to be honest I don't really understand what that method is intended to do, so I am not actually sure.
progress_url should already be doing the right thing for named servers.
The spawners attribute of the User class wouldn't really have anything to do with any named server implementation, right?
User.spawners
is where all of the Spawner instances for the named servers reside. It is a dictionary of Spawner instances keyed by the server name. Only when named servers are enabled does this dictionary have more than one entry. It's not listing possible names, it's listing currently allocated names. When you ask for a server named "server2"
, then user.servers["server2"]
will be the Spawner instance corresponding to that name.
there is no reason to expect there to be any mapping (mathematical function, each input has unique output) between possible spawners, possible names for users' notebook servers, and possible (remote) hosts on which the users' notebook servers could be hosted?
There is a 1:1 mapping of name to existing Spawner. Every Spawner has a name, and names are unique per user. The names are not semantic, they are just user-chosen keys in a dictionary, so there is nothing related to hostnames here unless the Spawner provides it and the user chooses to encode that in the name.
This could probably also be merged with questions 5 and 6, but just to confirm, there is no reason to expect that anything needs to be done with or to the _SpawnerDict class in order to implement named servers?
This is already where named servers are stored. It is unlikely that anything needs to change here.
Does the test_named_servers.py contain a (mock) implementation of the named servers feature?
No, it contains a full implementation of named servers via the REST API.
Although this conclusion directly contradicts what it says in the Spawner class here:
Yes, that docstring hasn't been updated since named servers were introduced. It should be updated to account for named servers, since it will be at least N (one or more per user, or exactly one per running notebook server) rather than one per user.
@minrk Thank you for taking the time to respond to this -- I really appreciate it.
My impression now, based on more review of the code since posting the question, as well as your very thorough and informative answer, is that most of the work with regards to implementing named servers for UI will involve the code in jupyterhub/handlers
(and not, of course, jupyterhub/apihandlers
, since as you point out named servers are already implemented via the API).
Question: In fewer words, given that I am interested in implementing a UI for the named servers, would you recommend that I focus above all on the code for the handlers (jupyterhub/handlers
)?
As far as I can tell, based on trying to better understand test_named_servers.py
, the "magic" for implementing named servers via the API is primarily via the UserServerAPIHandler
class in jupyterhub/apihandlers/users.py
-- is that correct?
If so, what would you say is the most analogous class in jupyterhub/handlers
that one should focus on first in trying to implement named servers for UI? Would it be SpawnHandler
in jupyterhub/handlers/pages.py
? And if so, do you think it would make more sense to create a new handler altogether, or only to modify this and other relevant existing handlers?
If it is actually the spawning features of BaseHandler
which one should look at first, is it terms of changing defaults? That one seems less likely to me, since UserServerAPIHandler
already implements named servers via the API and is a "grandchild" of BaseHandler
. But needless to say I still don't fully understand all of the internals of what happens when a user spawns a new notebook server.
UserSpawnHandler is responsible for serving the /users/yourname
page when a server is not running at that URL. This is what triggers spawning right now (i.e. attempting to visit a server triggers its launch). Our choices right now are:
I suspect the latter is what we will want, which means that most work will really belong in the home.html
template and home.js
javascript to build the UI that calls the REST API to launch named servers and then redirect to them.
This may be considered a hijack of the thread, so feel free to kick me out :)
However, if there was a move towards (2) and making the webUI just perform REST API calls, could this help us to facilitate the opposing use cases we chatted about before - that sometimes visiting a /user/yourname
page should automatically spawn a server (e.g. sharing a notebook link), and sometimes you don't want it to (e.g. a datascientist managing a self-service infrastructure, and wanting to stop consuming resources when their experiment is finished).
My thought process was that then all logic for this can move behind the API nicely, and options could be introduced to specify default handling (e.g. 'don't implicitly launch server'). This would then inform the way that the home page is rendered.
@davidbath It's OK, I am also curious to know the answer to your question.
We could definitely consider disabling spawn upon visiting /user/:name/...
. It may well make even more sense with named servers implemented, but we can do it anyway so that behavior is consistent. Then the case becomes: what to do with that URL? It would be quite reasonable, for instance, to show a button to trigger spawn, rather than triggering spawn immediately.
Most actions from pages are already API calls - stop is an API request, spawn from the admin page is already the same API request we would use for start on the home page. So making start be an API call would be quite simple. What would be a little less simple is handling visitors to /user/name/
in the various states that can occur (server starting but not ready, not running at all, etc.). Technically, it shouldn't be a challenge; it's mostly a matter of deciding what the user should experience in the various states everything can be in.
yes - I was thinking that getting people to the same "start my server" button rather than seamlessly auto-starting would be very reasonable. Indeed, this could always be a config option. Would there be a downside of allowing someone to set an single_user_autospawn_profile = [profile name]
which, if specified, would specify what happens if you just visited /user/:name/
? If it was unset, then it always takes you to the spawner profile page.
I have a slight spin on this question to add. We've been looking at wrapspawner/profilespawner (or customization of it) as a way to manage spawning servers on different systems and/or in a batch job. We thought we could enable named servers but assign the server name (just use the profile name for instance) instead of letting the user pick it. Mainly we just wanted a way to handle the case where users start >1 servers but need a way to keep them straight.
In that case the thing that seemed right to us at first was what I mentioned initially in #1935 (from the options form based on the selected profile). But I am trying to figure out/ask if @minrk's suggested approach through "home" (what's outlined at "There's another option..." here) is compatible with that.
For instance we could implement that but then in our custom case disable user definition of the name and set it when the spawn happens. We would still need all the other UI pieces in our case so we would want to keep working on that.
We thought we could enable named servers but assign the server name
Not at the moment, because the Spawner has been assigned a name before the object is created. It cannot pick its own name. That's not to say it can't be changed in the future, but that's how it works right now. So as it is right now, the spawner name is always picked by the user before any logic from the Spawner class is invoked.
@minrk This thread was basically created in order to ask about how to make this PR: https://github.com/jupyterhub/jupyterhub/pull/2089
Thus I am not sure if I should close this issue, and have all discussion relegated to that PR, or if both should be left open for now.
Made obsolete by https://github.com/jupyterhub/jupyterhub/pull/2089
I have some dumb questions about what changes and where need to be made to JupyterHub internals in order to enable the named servers feature. There are obviously a lot of them, so if it is only possible to answer one or two of them (due to time constraints and similar issues), I would still be immensely grateful if you did so. My goal is to avoid doing something massively stupid and consequently later making a garbage PR which even more completely wastes your time.
Here are my guesses about what at a minimum might need to be changed; please feel free to correct anything and everything that is wrong:
spawner.py
code, which I also don't understand that well.)self.allow_named_servers
is set toTrue
. (As far as I can tell, based on the source code and some tests, it is currently a "dummy attribute", inasmuch as whether or not it is true makes no difference in the program's behavior.self.allow_named_servers
is set toTrue
. My guess is that the most additions will need to occur in thespawn
andspawner
methods, but additions might also need to be made to theprogress_url
andstop
methods, since they haveserver_name
as possible parameter values.Here are some actual, concrete questions:
server_name
? (The current default values for that is always''
.) And is the conclusion correct, that there is currently no connection implemented between the value ofallow_named_servers
and the possible input values forserver_name
?server_name
an input value of the spawn method? Does this prevent the remote host for the new (named) notebook server for the user from being chosen dynamically at run-time? Imagine the case where there are two remote nodes on which users at an HPC can spawn notebook servers, and the administrators want to implement some sort of load-balancing scheme. Would some code like the following be necessary to add to thespawn
method in theUser
class:Obviously that is very stupid pseudo-code, but maybe it helps to communicate more clearly what I am trying to ask.
progress_url
as well, but to be honest I don't really understand what that method is intended to do, so I am not actually sure.spawners
attribute of theUser
class wouldn't really have anything to do with any named server implementation, right? It's not listing the possible remote hosts, or possible names that could be given to the notebook servers for a given user, it's just listing the possible spawning methods (e.g.DockerSpawner
orKubernetesSpawner
)?_SpawnerDict
class in order to implement named servers?Again, if it at all possible to help with even one of these questions/concerns, it would be immensely appreciated. Please enjoy the rest of your week!