dask / dask-labextension

JupyterLab extension for Dask
BSD 3-Clause "New" or "Revised" License

Support for Dask Gateway clusters from config #135

Open TomAugspurger opened 4 years ago

TomAugspurger commented 4 years ago

Right now, IIUC, the config used by the "new cluster" button takes a Python class, args, and kwargs, and the extension calls that class to create the cluster. This isn't flexible enough for dask-gateway, which requires creating an intermediate Gateway object.
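A minimal sketch of that class/args/kwargs mechanism, for readers unfamiliar with it. This is not the extension's actual code: `make_cluster` is a hypothetical helper, and the key names (`module`, `class`, `args`, `kwargs`) are taken from the `DASK_LABEXTENSION__FACTORY__*` settings quoted later in this thread. A standard-library class stands in for a real cluster class so the example runs without a scheduler.

```python
import importlib


def make_cluster(factory):
    """Instantiate whatever class the factory mapping names (hypothetical helper)."""
    module = importlib.import_module(factory["module"])
    cluster_cls = getattr(module, factory["class"])
    # The class is simply called with the configured positional and keyword args.
    return cluster_cls(*factory.get("args", []), **factory.get("kwargs", {}))


# Stand-in config: a stdlib class is used so the sketch runs anywhere.
# A real config would name a cluster class instead.
example = {
    "module": "collections",
    "class": "Counter",
    "args": ["aab"],
    "kwargs": {},
}
obj = make_cluster(example)
```

The limitation follows directly from this shape: a single constructor call leaves no room for a two-step setup such as building a Gateway object first and then asking it for a cluster.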

Two options:

  1. Expand the logic of the lab extension's cluster creation to take a snippet of code to run.
  2. Update dask-gateway to have a "simple" way of creating a cluster that just uses the defaults (cc @jcrist).

https://github.com/dask/dask-gateway/issues/55 is related, but it is more focused on expanding dask-labextension to take advantage of dask-gateway.

jcrist commented 4 years ago

dask-gateway doesn't require creating an intermediate Gateway object - you can already call dask_gateway.GatewayCluster directly (we might want to update our docs to better show this). I've verified things work with dask-labextension with default parameters just like any other cluster.

TomAugspurger commented 4 years ago

Thanks for the info Jim.

TomAugspurger commented 4 years ago

@jcrist one issue remains: the code generated to connect the client is something like

from dask.distributed import Client

client = Client("gateway://traefik-gcp-uscentral1b-prod-dask-gateway.prod:80/prod.bb16cdceacd541089ac9d7288d717595")
client

but when auth is enabled, the generated code doesn't have the security object, so it raises

TypeError: Gateway expects a `ssl_context` argument of type ssl.SSLContext, instead got None

Do you or @ian-r-rose have any guesses on whether that can be supported? Nothing comes to my mind immediately.

ian-r-rose commented 4 years ago

@TomAugspurger good question. I don't think there is a good way to do this right now from the labextension side. As you point out, the code template to generate a client connection is pretty dumb: https://github.com/dask/dask-labextension/blob/46cbdc102412b98ac8d67f9a8abed2ca3b332a8d/src/index.ts#L566-L571

If the client has a way to pick up an SSL key from the environment context, that would be best from my perspective. Otherwise, we may need to teach the cluster representation about auth. The current typings for the model are specified here: https://github.com/dask/dask-labextension/blob/46cbdc102412b98ac8d67f9a8abed2ca3b332a8d/src/clusters.tsx#L813-L857 so only the address is tracked at the moment.

jcrist commented 4 years ago

Could we make the template configurable and formatted on the server side? Then it could use Gateway.connect instead, which might be cleaner. Would need the cluster variable in the template format, but that's about it.
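A rough sketch of what a configurable template could look like. The template text, the `{cluster_name}` placeholder, and `render_client_code` are all hypothetical, not an existing dask-labextension option. The generated snippet assumes the gateway address and auth are already set in the user's Dask config, so that `Gateway()` needs no arguments.

```python
# Hypothetical server-side template; only {cluster_name} is substituted.
GATEWAY_TEMPLATE = """\
from dask_gateway import Gateway

gateway = Gateway()
cluster = gateway.connect("{cluster_name}")
client = cluster.get_client()
client"""


def render_client_code(template, cluster_name):
    """Fill the cluster name into the code template (hypothetical helper)."""
    return template.format(cluster_name=cluster_name)


# Cluster name taken from the address in the snippet quoted earlier in the thread.
snippet = render_client_code(
    GATEWAY_TEMPLATE, "prod.bb16cdceacd541089ac9d7288d717595"
)
print(snippet)
```

Because the client comes from the cluster object rather than from a bare address, the security context travels with it, which is what the plain `Client("gateway://...")` template is missing.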

ian-r-rose commented 4 years ago

@jcrist yes, that would be doable. Would you envision users setting the template in their config, or adding some kind of entrypoint to dask-gateway? At that point, I wonder if it would also be worthwhile to bite the bullet and special case dask-gateway to use it for cluster discovery and management (at least optionally).

jcrist commented 4 years ago

I was thinking this would be part of the user-side config for dask-labextension. I do think in the long run we'll want to special-case dask-gateway for the lab extension, but exposing the template to the user will resolve this issue, and feels like a useful thing to do generally (there are other kwargs the user might potentially want to set as well).

thomafred commented 3 years ago

I see that @consideRatio found a fix using the GatewayCluster class (https://github.com/dask/dask-labextension/issues/203). I can confirm it works with the daskhub Helm chart:

jupyterhub:
  hub:
    extraConfig:
      10-patch-dask-labextension-config: |-
        c.KubeSpawner.environment.setdefault("DASK_LABEXTENSION__FACTORY__MODULE", "dask_gateway")
        c.KubeSpawner.environment.setdefault("DASK_LABEXTENSION__FACTORY__CLASS", "GatewayCluster")
        c.KubeSpawner.environment.setdefault("DASK_LABEXTENSION__FACTORY__ARGS", "[]")
        c.KubeSpawner.environment.setdefault("DASK_LABEXTENSION__FACTORY__KWARGS", "{}")

However, the Dask dashboard URL is not set correctly, as mentioned by Erik in the issue linked above.
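For anyone wondering how the environment variables in the snippet above turn into the factory config: Dask reads `DASK_`-prefixed variables, lowercases the rest of the name, treats a double underscore as a nesting separator, and tries to parse each value as a Python literal. The function below is a simplified re-implementation for illustration, not Dask's own code, and `collect_dask_env` is a hypothetical name.

```python
import ast


def collect_dask_env(env):
    """Map DASK_-prefixed variables to a nested dict (simplified sketch)."""
    config = {}
    for name, value in env.items():
        if not name.startswith("DASK_"):
            continue
        # Double underscores separate nesting levels; keys are lowercased.
        keys = name[len("DASK_"):].lower().split("__")
        try:
            # "[]" and "{}" become real Python objects.
            parsed = ast.literal_eval(value)
        except (SyntaxError, ValueError):
            # Plain words such as "dask_gateway" stay as strings.
            parsed = value
        node = config
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = parsed
    return config


env = {
    "DASK_LABEXTENSION__FACTORY__MODULE": "dask_gateway",
    "DASK_LABEXTENSION__FACTORY__CLASS": "GatewayCluster",
    "DASK_LABEXTENSION__FACTORY__ARGS": "[]",
    "DASK_LABEXTENSION__FACTORY__KWARGS": "{}",
}
factory = collect_dask_env(env)["labextension"]["factory"]
```

Under these assumptions the four variables resolve to a factory pointing at `dask_gateway.GatewayCluster` with empty args and kwargs, i.e. the cluster is created with the gateway's defaults.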