jupyterhub / binderhub

Run your code in the cloud, with technology so advanced, it feels like magic!
https://binderhub.readthedocs.io
BSD 3-Clause "New" or "Revised" License
2.51k stars 383 forks source link

Make BinderSpawner a real class #1015

Closed manics closed 4 years ago

manics commented 4 years ago

As discovered in https://github.com/jupyterhub/binderhub/pull/1010 BinderSpawner is defined in-line in three different places:

We could reduce duplication in this repo by making it a class, e.g. in binder/spawner.py. As part of this a parameter enable_auth: bool could be added so it's easier to configure

manics commented 4 years ago

I tried this in https://github.com/manics/binderhub/commit/b164ea9b5c4ba7a614a46b4e849fc07d80615c69 (branch https://github.com/manics/binderhub/tree/binderspawner) and realised this won't work, it has to go into the JupyterHub Docker image. Options I can think of:

I think one of the first two options are best, what does everyone else think?

bitnik commented 4 years ago

I just have another idea. It is a different approach but I think benefits would be same in the end. And I am not sure yet if it is good or bad, but I just want to share it quickly.

What about if we define binder_spawner.py and mount is as configmap into /etc/jupyterhub/binder_spawner.py in hub container as @consideRatio did for z2jh.py, cull_idle_servers.py and jupyterhub_config.py in z2jh chart. Here is a quick example commit: https://github.com/gesiscss/binderhub/commit/9142f4e340446e4f2b085a646df809bbd22db5a5

But there is something I am not sure. Because I used jupyterhub.hub.extraVolumes and jupyterhub.hub.extraVolumeMounts in binderhub. I don't know what happens if someone uses binderhub chart and defines them in their own configuration. Would they be merged or the ones I defined would be overridden?

consideRatio commented 4 years ago

Hmmmm, @bitnik an array would be replaced, a dictionary would be merged I think. Hmm... I figure you need to use hub.extraVolumes and hub.extraVolumeMounts directly and potentially conflict with binderhub users that override it.

=) I see that our Jupyter friend has already found this exact issue before, see: https://github.com/helm/helm/issues/3767

consideRatio commented 4 years ago

I've been processing this a while now and don't come up with a nice solution I really like.

pip install binderhub in the Z2JH image just to make BinderSpawner(s) available This option is not viable at all I think, because then an improvement to the BinderSpawner for use by binderhub would go like:

  1. The binderhubspawner package get a new release
  2. The z2jh image bumps the new binderhubspawner package
  3. The binderhub bumps the z2jh chart as a whole

But, an option, especially since the changes @bitnik mentioned in https://github.com/jupyterhub/binderhub/issues/1015#issuecomment-561264578 about the hub moving away things to configmaps that are mounted on the hub pod, you could build your own BinderHub hub-image, that is very viable but adds some workload.

If that is done, that could import a standalone binderspawner.py or similar next to the other resources like z2jh.py.

Add BinderSpawner(s) to kubespawner

Same issues as above.

Add a binderhub-jupyterhub Docker image in this repo, building on top of the Z2JH one I consider this the only viable option among those three.


I've considered "could you not get that extraConfig extracted out of the default values.yaml? Hmmm well no I don't think so, you cannot import a file from .yaml, you can in a helm template, but this needs to happen earlier for this to be extracted.

betatim commented 4 years ago

One thing to consider: this code gets changed (very) rarely. To me this means two things:

Maybe we can solve some of the tediousness of the duplication with a comment that points to the duplicated places. We could also put a note in the copy in the docs saying "this is just an example, take these parts we have here and apply them to your version of the spawner/default spawner".

I'd vote against a process that involves a new package or having to release one of the existing packages/charts (like z2jh). I am not against building our own hub image but I am not at all excited about it. Wary of the tasks that it would introduce like keeping it compatible with the z2jh chart version we use. This seems like a task that would occur frequently compared to the task of copying stuff around when we do change BinderSpawner.

betatim commented 4 years ago

Could we make an new entry in extraConfig a bit like so for the testing use case as well as the auth example:

hub:
  extraConfig: 
    99-customBinder: |
      class MyBinderSpawner(BinderSpawner):
        # some custom stuff here but reusing most of BinderSpawner

(maybe with renaming the current binder key to 00-binder if getting the order right is required)

manics commented 4 years ago

If the duplication was only within this repo I think adding a comment and treating it as an internal maintenance issue would be fine, but it affects ease of use as well:

Documenting multiple entry keys in extraConfig and updating the docs with instructions on extending BinderSpawner might work. @consideRatio are the extraConfig keys guaranteed to be ordered in the templated output?

consideRatio commented 4 years ago

@manics i recall that they are sorted alphabetically or loaded alphabetically yeah. Let's see:

https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/edd326969ba9ec01f7ff28a3c03085a3bb4821b8/jupyterhub/files/hub/jupyterhub_config.py#L477-L479

Yepp, sorted. So note that zz- is a better prefix than 99- btw.