dask / distributed

A distributed task scheduler for Dask
https://distributed.dask.org
BSD 3-Clause "New" or "Revised" License

Proxy dashboard through to the client #6734

Open mrocklin opened 2 years ago

mrocklin commented 2 years ago

Sometimes getting access to the remote scheduler is hard, for example for security reasons. We could consider having the Client run an HTTP server and forward dashboard requests to the scheduler through comms somehow (which are often already set up in a secure way).

I don't know enough about web things to know if this is easy.

cc @jacobtomlinson @ian-r-rose @graingert

jacobtomlinson commented 2 years ago

I could see this being useful, but I also wonder how often it will come up that the scheduler comm is available but the scheduler dashboard is not. It is probably more likely that the scheduler is not reachable by the user at all and the client is being created and accessed remotely via SSH or Jupyter, which presents similar but different challenges (xref #6736).

Assuming we want to go down this road, there would be some considerations to make. The client doesn't currently start an HTTP server, so that would be an interesting change to make. Which port should it listen on? How would this be communicated in places like the dashboard_link property and the cluster widgets?

For implementation, I expect we could just create an HTTP handler on the new client HTTP server that uses run_on_scheduler to make the same request on the scheduler and return the result. Supporting websockets could be more challenging (but necessary for Bokeh) because it's not immediately obvious how run_on_scheduler would handle long-running iterators like that.
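To make the request-forwarding idea concrete, here is a minimal sketch of the client-side half using only the standard library. It is not how distributed does anything today. `make_proxy_server`, `fetch`, and `fake_fetch` are hypothetical names; in a real version `fetch` would presumably wrap `Client.run_on_scheduler` with a function that performs the request on the scheduler, whereas here it is stubbed out so the example runs on its own. Binding to port 0 lets the OS pick a free port, which is one possible answer to the port question. Websockets are not handled.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_proxy_server(fetch, host="127.0.0.1", port=0):
    """Build an HTTP server that answers GET requests by calling ``fetch(path)``.

    ``fetch`` is a hypothetical hook returning ``(status, content_type, body)``.
    In a real client it might forward the request over the scheduler comm.
    """

    class ProxyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Delegate the request and relay whatever comes back.
            status, content_type, body = fetch(self.path)
            self.send_response(status)
            self.send_header("Content-Type", content_type)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            # Keep the sketch quiet; a real server would log properly.
            pass

    # port=0 asks the OS for any free port.
    return ThreadingHTTPServer((host, port), ProxyHandler)


def fake_fetch(path):
    # Stand-in for the scheduler round trip (assumption, not a real API).
    return 200, "text/plain", f"scheduler saw {path}".encode()


def demo():
    server = make_proxy_server(fake_fetch)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]  # the port the OS actually assigned
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/status")
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```

The assigned port is only known after binding, so something like dashboard_link would have to read it from the running server rather than from configuration.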

jacobtomlinson commented 2 years ago

I just had another use case for this come up.

I can use the Jupyter proxy to view the dashboard via Jupyter but only if I add the hostname of the dask scheduler to the allowlist in my Jupyter proxy config, which I may not have easy control over if someone else set up Jupyter for me.

By proxying the dashboard via the Client there would be a dashboard endpoint available at localhost (relative to the Jupyter server) which is already on the allowlist and would "just work" with the Jupyter proxy.
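As a rough illustration of why the localhost case is easier, the sketch below builds the two URL shapes that jupyter-server-proxy is understood to expose: `/proxy/<port>/...` for local targets and `/proxy/<host>:<port>/...` for remote ones, the latter being subject to the host allowlist. The helper name `jupyter_proxy_path`, the example hostname, and the port are assumptions for illustration only.

```python
def jupyter_proxy_path(host, port, path="status", base="/"):
    """Return the Jupyter proxy path for a dashboard at ``host:port``.

    Hypothetical helper: local targets use the short ``/proxy/<port>/`` form,
    anything else uses ``/proxy/<host>:<port>/`` and needs allowlisting.
    """
    is_local = host in ("localhost", "127.0.0.1")
    target = str(port) if is_local else f"{host}:{port}"
    return f"{base.rstrip('/')}/proxy/{target}/{path.lstrip('/')}"


# Dashboard proxied through the Client on the Jupyter host itself.
local_url = jupyter_proxy_path("localhost", 8787)

# Dashboard served directly by a remote scheduler (hostname is made up).
remote_url = jupyter_proxy_path("scheduler.example", 8787)
```

With a Client-side proxy, only the first form would be needed, so no change to the Jupyter proxy configuration would be required.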