dask / dask-labextension

JupyterLab extension for Dask
BSD 3-Clause "New" or "Revised" License

Clicking on a cluster populates the Dashboard URL search bar with an unexpected URL #203

Open consideRatio opened 3 years ago

consideRatio commented 3 years ago

What happened:

Pressing the cluster in the dask-labextension panel on the left side of JupyterLab 3 gave me an unknown URL that doesn't seem to relate to the dashboard URL of the cluster in any way.

[Image: dask-labextension-press-vs-copy-paste]

My wish

I would like the Dashboard URL declared for the cluster to populate the search bar when I press the cluster.

I see three variations that could happen when I press the cluster.

  1. dask/dashboard/ee31a59b-71a8-4464-86c8-7cc8df558497 --- I find this strange, where did this come from? This is the current behavior
  2. /services/dask-gateway/clusters/prod.f9abd4530de44f958847aed333400ff2/status --- This is what I expected, but this doesn't work as it doesn't know about the domain name etc.
  3. https://hub.jupytearth.org/services/dask-gateway/clusters/prod.f9abd4530de44f958847aed333400ff2/status --- This is what I'd truly want, but I think it may be out of scope for this issue.

Minimal Complete Verifiable Example:

I don't know how to provide this =/

Environment:

JupyterHub (1.1.1 Helm chart) + Dask-Gateway (0.9.0 Helm chart).

$ conda list | grep dask
dask                      2021.6.0           pyhd8ed1ab_0    conda-forge
dask-core                 2021.6.0           pyhd8ed1ab_0    conda-forge
dask-gateway              0.9.0            py38h578d9bd_0    conda-forge
dask-glm                  0.2.0                      py_1    conda-forge
dask-kubernetes           2021.3.1           pyhd8ed1ab_0    conda-forge
dask-labextension         5.0.2              pyhd8ed1ab_0    conda-forge
dask-ml                   1.9.0              pyhd8ed1ab_0    conda-forge
pangeo-dask               2021.06.05           hd8ed1ab_0    conda-forge
$ python --version
Python 3.8.10

Operating System: Ubuntu 20.04
Install method: conda-forge

# The current environment and dask configuration via environment
DASK_DISTRIBUTED__DASHBOARD_LINK=/user/{JUPYTERHUB_USER}/proxy/{port}/status
DASK_GATEWAY__ADDRESS=http://10.100.116.39:8000/services/dask-gateway/
DASK_GATEWAY__AUTH__TYPE=jupyterhub
DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE={JUPYTER_IMAGE_SPEC}
DASK_GATEWAY__PROXY_ADDRESS=gateway://traefik-prod-dask-gateway.prod:80
DASK_GATEWAY__PUBLIC_ADDRESS=/services/dask-gateway/
DASK_LABEXTENSION__FACTORY__CLASS=GatewayCluster
DASK_LABEXTENSION__FACTORY__MODULE=dask_gateway
DASK_ROOT_CONFIG=/srv/conda/etc

ian-r-rose commented 3 years ago

The URL that is populated when you click on a cluster is a real one: it's proxied under the Jupyter server using some custom jupyter-server-proxy logic. The route is set up automatically from the dashboard URL provided by the scheduler. I know that it has worked with various Kubernetes, MPI, and local clusters, but dask-gateway has always been a bit of an awkward fit, since it has its own proxying and authentication logic (see #135 for some discussion).

The history here is that the labextension started out by just iframing dashboard URLs directly, and it had no server-side logic. This is still what's done when you paste a URL directly into the box. This wound up being pretty limited, and relatively simple deployments wouldn't work due to mixed-content errors or inability for the browser client to see a URL. So we built the cluster manager interface, which would launch clusters directly and know how to proxy them. But dask gateway didn't really fit well into this model.

I think the longer-term solution is to actually proxy more things under the jupyter server, and teach the labextension to add authentication where appropriate. Otherwise supporting the myriad ways clusters might be exposed to the user is just too difficult. One of the key use-cases for this would be a JupyterHub+dask-gateway deployment, though I don't have good access to one right now (perhaps you'd be interested in adding me to jupytearth @consideRatio ? :) )

I have a proposal for a way forward in #190, I'd encourage you to weigh in there as well.

To look at your specific examples:

1. `dask/dashboard/ee31a59b-71a8-4464-86c8-7cc8df558497` --- I find this strange, where did this come from? This is the current behavior

This is the auto-constructed route attempting to proxy your dashboard URL under the Jupyter server. However, as you note, the dashboard URL is missing a domain, and the extension doesn't know to add it. We may be able to add some logic that fills in the right domain when checking the URL for a dashboard. I haven't looked into it recently, so I'm not sure how easy this would be.

2. `/services/dask-gateway/clusters/prod.f9abd4530de44f958847aed333400ff2/status` --- This is what I expected, but this doesn't work as it doesn't know about the domain name etc.

This should be the same as above, but not proxied under the jupyter server.

3. `https://hub.jupytearth.org/services/dask-gateway/clusters/prod.f9abd4530de44f958847aed333400ff2/status` --- This is what I'd truly want, but I think it may be out of scope for this issue.

As far as I can tell, you are setting the route /services/dask-gateway in the config. I'd be curious whether we could get this to work just by adding the domain to that. I think a blocker for that right now is that jupyter-server-proxy still doesn't support HTTPS URLs, though a relatively lightweight fix would be to just add some targeted "s"s where appropriate.
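
To make the "add the right domain" idea concrete, here is a minimal sketch of resolving a domain-less dashboard link against a known origin. The helper name `toAbsoluteDashboardUrl` is hypothetical and not part of dask-labextension; it also assumes the extension can learn the hub's origin somehow (for instance from the page location), which is not confirmed anywhere in this thread.

```typescript
// Hypothetical helper, not part of dask-labextension.
// Assumption: `origin` is the public origin of the hub, e.g. taken from
// window.location.origin in the browser.
function toAbsoluteDashboardUrl(link: string, origin: string): string {
  // The standard URL constructor resolves a path-only link against the base.
  // A link that is already absolute is kept as-is and the base is ignored.
  return new URL(link, origin).href;
}

const resolved = toAbsoluteDashboardUrl(
  "/services/dask-gateway/clusters/prod.f9abd4530de44f958847aed333400ff2/status",
  "https://hub.jupytearth.org"
);
console.log(resolved);
// https://hub.jupytearth.org/services/dask-gateway/clusters/prod.f9abd4530de44f958847aed333400ff2/status
```

This only covers building the absolute URL. Whether the result can then be iframed or proxied still depends on the HTTPS and authentication questions raised above.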

consideRatio commented 3 years ago

@ian-r-rose thanks for the thorough reply! I'll do some exploring with these insights and follow up with you at the end of this week. I've added your GitHub handle to have access to the deployment at https://hub.jupytearth.org that has dask-labextension + dask-gateway up and running as described in this issue.

thomafred commented 3 years ago

This is already handled for cluster dashboard URLs: https://github.com/dask/dask-labextension/blob/main/src/clusters.tsx#L79

      if (cluster.dashboard_link.indexOf(proxyPrefix) !== -1) {
        // If the dashboard link is already proxied using
        // jupyter_server_proxy, don't proxy again. This
        // can happen if the user has overridden the dashboard
        // URL to the jupyter_server_proxy URL manually.
        options.setDashboardUrl(cluster.dashboard_link);
      } else {
        // Otherwise, use the internal proxy URL.
        options.setDashboardUrl(`dask/dashboard/${cluster.id}`);
      }

Could this be a viable quick-fix for the dashboard URL search bar too?
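
As a rough sketch of what that could look like, the same check can be pulled into a small pure function that both the cluster list and the search bar could call. The function name `resolveDashboardUrl`, its parameters, and the example prefix value are assumptions made for illustration; only the `indexOf(proxyPrefix)` check and the `dask/dashboard/${cluster.id}` route come from the snippet above.

```typescript
// Hypothetical helper mirroring the check quoted above; the name and
// signature are illustrative and not the extension's actual API.
function resolveDashboardUrl(
  link: string,
  proxyPrefix: string,
  clusterId?: string
): string {
  if (link.indexOf(proxyPrefix) !== -1) {
    // Already routed through jupyter_server_proxy: use it unchanged.
    return link;
  }
  if (clusterId !== undefined) {
    // Known cluster: fall back to the extension's internal proxy route.
    return `dask/dashboard/${clusterId}`;
  }
  // Free-form URL typed into the search bar with no known cluster.
  return link;
}

// Example with an assumed prefix value of "/proxy/".
console.log(resolveDashboardUrl("/user/alice/proxy/8787/status", "/proxy/", "abc"));
console.log(resolveDashboardUrl("http://localhost:8787/status", "/proxy/", "abc"));
```

Note that this would not by itself handle a dask-gateway link such as `/services/dask-gateway/...`, since that path does not contain the jupyter_server_proxy prefix.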