2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License

Simplify enabling / disabling dask-gateway for a hub #362

Closed yuvipanda closed 11 months ago

yuvipanda commented 3 years ago

While most hubs we offer do not need dask-gateway, some do. We want to support this in the easiest way possible. Currently, this is done via the daskhub chart in hub-templates/daskhub, which has a dependency on dask-gateway and hub-templates/base-hub. This lets us put dask-specific config in hub-templates/daskhub/values.yaml.

This is extreme overkill, and causes problems in our deploy scripts - look at the amount of special casing for daskhub in deploy/. The only things the hub-templates mechanism offers are:

  1. Conditional installation of a helm chart
  2. Conditional setting of large swaths of config (currently in hub-templates/daskhub/values.yaml).

There should be simpler ways to accomplish both of these without resorting to a full extra chart.
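As a rough sketch of what a simpler mechanism might look like (not what deploy/ does today): a per-hub boolean could toggle the chart through Helm's dependency `condition` field, and the same flag could select an overlay of dask-specific values that is deep-merged over the base values in Python. Every name and value below (`BASE_VALUES`, `DASK_OVERLAY`, `values_for_hub`, the `dask_gateway` hub key, the image and address strings) is hypothetical and only there for illustration.

```python
import copy


def deep_merge(base, overlay):
    """Return a new dict with `overlay` recursively merged over `base`."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and new keys: the overlay wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base values shared by every hub.
BASE_VALUES = {
    "jupyterhub": {
        "singleuser": {"image": {"name": "example/base-image"}, "extraEnv": {}},
    },
    # Assumes the chart dependency is declared with
    # `condition: dask-gateway.enabled`, so this flag controls installation.
    "dask-gateway": {"enabled": False},
}

# Hypothetical dask-specific config, applied only when a hub opts in.
DASK_OVERLAY = {
    "jupyterhub": {
        "singleuser": {
            "extraEnv": {"DASK_GATEWAY__ADDRESS": "http://example.invalid/dask-gateway"},
        },
    },
    "dask-gateway": {"enabled": True},
}


def values_for_hub(hub):
    """Build the final helm values for one hub definition."""
    values = BASE_VALUES
    if hub.get("dask_gateway", False):
        values = deep_merge(values, DASK_OVERLAY)
    # Hub-specific config is applied last so it can override everything else.
    return deep_merge(values, hub.get("config", {}))
```

With something like this, a hub would opt in with a single key, e.g. `values_for_hub({"dask_gateway": True})`, and the deploy scripts would not need a separate code path for a daskhub chart.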

yuvipanda commented 3 years ago

I've removed the ephemeral-hub template in https://github.com/2i2c-org/pilot-hubs/pull/361 as a precursor to this.

yuvipanda commented 3 years ago

My current idea is to turn our YAML config into Jsonnet, a non-Turing-complete, expressive configuration language that is JSON native. I quite like it, and it was built primarily to generate k8s objects and similar deeply nested JSON structures - and works well for that. There are Python bindings as well.

Particularly, it enables easy object overlays, one of the issues we have with our config.

However, it is a new language and adds complexity - I'd prefer to just leave things in Python if we can.

damianavila commented 3 years ago

In this case, I think the cost of adopting a new language is less than the complexity around the hierarchical structure and all the specific code to handle that special case.

yuvipanda commented 3 years ago

If we think of dask as an additional service we want to deploy with JupyterHub, then BinderHub is one too! And right now it's difficult to mix and match, since we use inheritance of charts rather than composing them. For example, https://github.com/jupyterhub/binderhub/blob/0b4462c0b1fb15e355b32b38b99c60f440666e94/helm-chart/binderhub/values.yaml#L56 is the jupyterhub block in binderhub, which conflicts with the jupyterhub block in the daskhub chart. But there's no reason they need to conflict, and if we had a better way of composing them, it will be all good. We'd end up with linear complexity rather than the combinatorial complexity we have now.
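To make the linear vs. combinatorial point concrete, here is a minimal sketch, assuming each optional service ships one values fragment that is merged into a shared base. Two services then need two fragments, not a separate chart for every combination. The fragment contents and the names `SERVICE_FRAGMENTS`, `merge_strict`, and `compose` are hypothetical, not taken from the binderhub or daskhub charts.

```python
import copy


def merge_strict(target, fragment, path=()):
    """Merge `fragment` into `target` in place.

    Raises ValueError if both sides set the same leaf to different values,
    so a real conflict is reported instead of silently overwritten.
    """
    for key, value in fragment.items():
        here = path + (key,)
        if key not in target:
            target[key] = copy.deepcopy(value)
        elif isinstance(target[key], dict) and isinstance(value, dict):
            merge_strict(target[key], value, here)
        elif target[key] != value:
            raise ValueError("conflicting values at " + ".".join(here))
    return target


# One hypothetical fragment per optional service. Both touch the
# `jupyterhub` block, but under different keys, so they compose cleanly.
SERVICE_FRAGMENTS = {
    "dask-gateway": {
        "jupyterhub": {"hub": {"services": {"dask-gateway": {"display": False}}}},
    },
    "binderhub": {
        "jupyterhub": {"hub": {"services": {"binder": {"admin": True}}}},
    },
}


def compose(base, enabled_services):
    """Return base values plus the fragment of every enabled service."""
    values = copy.deepcopy(base)
    for name in enabled_services:
        merge_strict(values, SERVICE_FRAGMENTS[name])
    return values
```

Calling `compose(base, ["dask-gateway", "binderhub"])` would give a hub with both services, and adding a third service would mean adding one more fragment rather than doubling the number of charts.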

I think @jacobtomlinson talked about this too, and so did @rabernat

choldgraf commented 3 years ago

> if we had a better way of composing them, it will be all good.

This feels like a good summary of a good chunk of the reason for https://github.com/jupyterhub/team-compass/issues/382

Agreed that making this more straightforward would be very impactful; "composability" and "modularity" seem like things we should be striving for!

choldgraf commented 3 years ago

Hey all - I am going to remove this from the activity board since it feels like a more complex multi-week project rather than a 2-3 day project.

consideRatio commented 11 months ago

I'll go for a close here in favor of #3252, which links back to this. It's a big deal for us to transition to putting dask-gateway under basehub, so #3252 is more narrowly focused on the initial steps towards that.