jupyterhub / wrapspawner

Mechanism for runtime configuration of spawners for JupyterHub
BSD 3-Clause "New" or "Revised" License
60 stars 58 forks source link

New Custom Spawner #52

Open JeffEdwardsTech opened 2 years ago

JeffEdwardsTech commented 2 years ago

This Pull Request is for kind of WrapSpawner; the CustomSpawner.

The CustomSpawner implements a new feature set for launching jobs; it displays the batch script to be run as a textfield in the launcher form, and it also allows the user to customize the script before launching the job.

The profiles are stored in a System directory, allowing administrators to provide a global set of job profiles to be launched by storing files in the filesystem.

Optionally, users can store their own custom scripts to be launched in their own home directories if those directories are accessible by the JupyterHub server.

This feature is being developed at the request of the EPA's High End Scientific Computing II (HESC2) project; we want to support the JupyterHub community by contributing code and would love to see this become an official feature.

Contact me with any needed changes or enhancements. Looking forward to working with you all!

Jeff Edwards edwards.jedwards@epa.gov

welcome[bot] commented 2 years ago

Thanks for submitting your first pull request! You are awesome! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please make sure you followed the pull request template, as this will help us review your contribution more quickly. welcome You can meet the other Jovyans by joining our Discourse forum. There is also a intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

manics commented 2 years ago

Hi! WrapSpawner is designed to wrap around almost any Jupyter spawner. Would you say your new spawner is generally applicable to other spawners, or is it targeted only at BatchSpawner?

JeffEdwardsTech commented 2 years ago

We would like it to be applicable to other spawners, however, we have not tested it with anything other than BatchSpawner at the moment.

On Tue, Apr 19, 2022 at 4:07 PM Simon Li @.***> wrote:

Hi! WrapSpawner is designed to wrap around almost any Jupyter spawner. Would you say your new spawner is generally applicable to other spawners, or is it targeted only at BatchSpawner?

— Reply to this email directly, view it on GitHub https://github.com/jupyterhub/wrapspawner/pull/52#issuecomment-1103073414, or unsubscribe https://github.com/notifications/unsubscribe-auth/AX4ZBXKD3ANHVI3ZZJ6J4HDVF4G6NANCNFSM5THTSCBA . You are receiving this because you authored the thread.Message ID: @.***>

mbmilligan commented 2 years ago

Hi @JeffEdwardsTech - thanks for this PR!

Any questions about the following, please follow up here. I am supportive of accepting this feature if some restructuring can be done. We might be able to assist in implementing fixes to the issues I note below.

Cheers, @mbmilligan

So now that I've had a chance to review this, the biggest thing that jumps out at me is the amount of code duplication between this class and the existing ProfilesSpawner. It should be straightforward to refactor this into a subclass of ProfilesSpawner and avoid repeating code. See e.g. in CustomSpawner we have:

https://github.com/jupyterhub/wrapspawner/blob/2bb0e21b9e0b5d472b66ccf936e6e1f8df48ef6f/wrapspawner/wrapspawner.py#L488-L515

and the corresponding functions in ProfilesSpawner.

Some more minor issues:

I wonder if you would be open to changing the name of the class. CustomSpawner is very generic; something like LocalBatchTemplatesSpawner would be more informative. (That suggested name is off the top of my head; feel free to improve on it.)

In your __init__ function, you have stanzas like:

https://github.com/jupyterhub/wrapspawner/blob/2bb0e21b9e0b5d472b66ccf936e6e1f8df48ef6f/wrapspawner/wrapspawner.py#L410-L413

Jupyter uses the traitlets library to handle configuration. Unless you have a good reason to work around it, these variables should also be implemented as traits, and this logic would go in a default value function.

The help text should include documentation of exactly what an admin or user should put in these profile files. Even having read the code, I don't feel confident that I could create a valid file on the first try. It would also be acceptable to put a section in the README with some explanatory text and an example configuration.

Where you define the profiles trait:

https://github.com/jupyterhub/wrapspawner/blob/2bb0e21b9e0b5d472b66ccf936e6e1f8df48ef6f/wrapspawner/wrapspawner.py#L340-L347

The type signature is inconsistent with the help text. Also, given the assumptions made in the code, I am not sure the default value would even actually work using LocalProcessSpawner.

When you process the form, you have:

https://github.com/jupyterhub/wrapspawner/blob/2bb0e21b9e0b5d472b66ccf936e6e1f8df48ef6f/wrapspawner/wrapspawner.py#L447

I'm not sure what side effects might result from modifying the self.config object, but at minimum I don't think the batch_command value is going to be properly persisted in the class state. Admittedly the persistence isn't that important since you don't need the command to reconnect to the running job if the Hub restarts and has to restore state. Still, to be safe I'd pass this in the returned dictionary and have construct_child pull it out of self.user_options rather than stash it in the config object like this.

JeffEdwardsTech commented 2 years ago

Been working to address the changes! The following are complete:

I have a couple of discussion points I could use feedback on

  1. I used ProfilesSpawner as a example to write my own Spawner, so as you noted they look similar. However, my class has diverged pretty far from the ProfilesSpawner. I'm not sure the heart of what it does (load profiles from text files located on the filesystem, and/or the user's home directories) is enough like ProfilesSpawner to warrant a sub-class anymore
  2. I understand modify the self.config may not be the best way to affect the launch of the batch process. However, I've been searching for a better way to change the command actually launched by JupyterHub and as of yet, I've been unable to find the proper way. My instructions from my organization were to keep changes confined to the wrapspawner.py module so as not to affect the larger Jupyterhub codebase. Is there a correct way to change the batch command being run at process launch time?

Let me know on #1 and #2, more than happy to defer to your judgment and implement whatever further changes are needed.