dask / dask-labextension

JupyterLab extension for Dask
BSD 3-Clause "New" or "Revised" License
311 stars 62 forks source link

Authentication for dashboards #190

Open ian-r-rose opened 3 years ago

ian-r-rose commented 3 years ago

The problem

Broadly speaking, this extension works by passing around URLs:

  1. When a user has a cluster and clicks on the magnifying glass icon, it checks client.dashboard_link to find a URL for the dashboard, and uses that to fetch the individual plots endpoints and construct dashboard panes.
  2. If a user creates a cluster, then tries to insert client connection code for that cluster into a notebook or text file, it grabs the scheduler address off of the cluster, and gives you a snippet like: client = distributed.Client(<your-scheduler-address>).

This URL-based system has gotten us pretty far, and has been useful for a lot of people. Both of the above work fine if your clusters aren't protected by authentication or other security measures. But if that is not the true (often the case for a cloud or HPC based cluster), then it will fail. That is to say, Client("tls://some-url:8786") will not connect without an SSL context, and a dashboard at https://some-authenticated-service:8787/status won't connect without authorization headers/tokens.

Proposal

I don't think we will be able to (or will want to) cover authentication for all the manifold ways in which dask clusters are deployed today. But I think we can add an entrypoints-based plugin system to allow deployers of dask clusters to make sure that their services work with this package (similar to what @jacobtomlinson has done in dask-ctl, or what intake has done for drivers).

At it's most basic, I'm imagining a simple interface to allow library authors to say whether a given cluster belongs to them, and authenticate with their service as necessary. The basic flow:

Connecting to a dashboard

  1. Given a dashboard URL, ask an entrypoint provider "does this belong to you?"
  2. If yes, pass the provider a tornado request object for a dashboard plot and allow them to do whatever to it (probably add auth headers).
  3. Make the request as before.

Cluster connection

The ClusterManager already sends information about a cluster to the frontend to show in the side pane. So it (or whatever replaces it, cf. #189 ) could

  1. Ask an entrypoint provider "does this belong to you?"
  2. If so, get a snippet for connecting to the cluster to send to the frontend.
  3. That snippet can be inserted in to notebooks/text-files as before.

Dask Gateway

dask-gateway also has functionality around authenticating and proxying remote clusters. Is it possible to adopt it as a dependency to handle more than what we do here (see some discussion in #135) ? My instinct is that we don't want to require dask-gateway clusters in all circumstances, and that it won't be able to cover all the use cases we want to be able to handle here (and by "handle" I mean, let plugin authors handle). The converse of having dask-gateway implement a plugin here, on the other hand (fixing #135), should be fairly straightforward. But I'm happy to get pushback on that if others disagree. Especially curious what @rabernat and @scottyhq think.

scottyhq commented 3 years ago

Thanks for opening the discussion @ian-r-rose , I'm not sure I have much to contribute here, we've leaned heavily on @consideratio and @TomAugspurger to figure out dask-gateway configurations and compatibility fixes for the pangeo jupyterhubs over the last year. I can envision a future where it's more common for jupyterhub users to connect to coiled-managed clusters instead of the integrated daskhub solution (@cspencerjones has been doing this... and perhaps could comment on how the labextension works currently for jupyterhub+coiled?).

cspencerjones commented 3 years ago

Good question: honestly I usually just open the dashboard link in a new browser tab (I used to use the labextension but these kinds of issues have deterred me). But I just tried to use the dask labextension from the pangeo aws cluster and no, it doesn't seem to work with coiled: I can open windows from the labextension but they are all blank.

costrouc commented 2 years ago

Hey all wanted to give some context on the issue we are getting when using the dask-labextension in QHub. We are using forward authentication with single sign-on to protect the dask gateway dashboard urls that are exposed. This makes it so that within the browser the user has to login to keycloak via an oauth flow. This way we can protect all public urls even if the underlying service does not provide auth.

In our case the browser has the correct headers/cookies/tokens such that any browser request is able to access the dashboard. However in the case of the extension as it is https://github.com/dask/dask-labextension/blob/main/dask_labextension/dashboardhandler.py#L30 a request is sent from the jupyterlab server to check for the url. As far as I know it would be quite difficult to provide the headers etc to the server and has some security risks as well.

Is it possible for all these check to be done in the javascript side of the extension? Does the dask-labextension need the python bits?

dharhas commented 2 years ago

@ian-r-rose Do you have any thoughts here. We are interesting in helping but I'm unsure how much effort this is or how it should be done.

dharhas commented 2 years ago

@rsignell-usgs this is the core issue making the lab-extension non-functional in QHub.

ian-r-rose commented 2 years ago

Sorry for the slow response @costrouc! (and :wave: @dharhas )

The biggest constraint here is that, as far as I know, the bokeh server that hosts the dashboard doesn't provide a way to set CORS headers (see discussion here). Historically, that has meant that this extension has had to jump through all sorts of hoops to reason about what might be on the other side of a given URL, including maintaining its own list of dashboard endpoints and this horrific hack involving sniffing out a static png to determine "alive-ness".

Those workarounds proved difficult to maintain, as both the static contents could change, as could the list of possible dashboards. Even worse, since the list of dashboards isn't even the same between distributed versions, it is impossible to know for sure whether the list of dashboards and their endpoints is correct, and things would randomly be broken depending on the user environment.

So in #159 I moved the dashboard check to the server side so that we could always check the individual-plots.json endpoint for whether the dashboard is live, and what it was hosting. But this proved to have some consequences for dashboards that are authenticated using a cookie/token, as @costrouc points out. I would definitely like to fix this use-case, but I would want to do it in such a way that means we don't have to restore the above workarounds. To me this means either:

  1. Configuring the bokeh server so that it can accept requests from the user's browser session. I haven't followed bokeh too closely recently, and I don't know what the latest is there.
  2. Pushing on the above implementation proposal for forwarding a session token or cookie to the dask-labextension server side so that it can make auth'd checks to the individual-plots.json endpoint. Though if there security-related concerns with this approach, I'd certainly be interested in hearing about them.
costrouc commented 2 years ago

@dharhas and @ian-r-rose in our case the dask dashboards are served within the same dns hostname so CORS would not be an issue as far as I know. Would this make the problem any easier to solve?

ian-r-rose commented 2 years ago

@dharhas and @ian-r-rose in our case the dask dashboards are served within the same dns hostname so CORS would not be an issue as far as I know. Would this make the problem any easier to solve?

Quite possibly. It should be fairly easy to check if you are comfortable building from a branch. You could add a new branch to this block which checks if the hostname is the same, and if so, makes the request for individual-plots.json directly from the browser session using fetch(). https://github.com/dask/dask-labextension/blob/9433ecd21842417dda1579b78127dd4babfb8a4e/src/dashboard.tsx#L579-L623

If that does indeed fix the issue, I'd be happy to have that be a config option with sensible defaults.

costrouc commented 2 years ago

@ian-r-rose thank you for the detailed responses and this is helpful! We will be tasking someone to work on this feature.

viniciusdc commented 2 years ago

Hi, @ian-r-rose thanks for the links and the detailed explanation, just a quick question the above snippet will do the work from the browser perspective, but wouldn't the test fails on the server-side based on this code block? https://github.com/dask/dask-labextension/blob/9433ecd21842417dda1579b78127dd4babfb8a4e/dask_labextension/dashboardhandler.py#L18

ian-r-rose commented 2 years ago

@viniciusdc Yeah, the possible fix I suggested above is to avoid the DaskDashboardCheckHandler all-together when we expect the browser request to succeed.

rsignell-usgs commented 2 years ago

Ooh, this sounds promising. Thanks for looking into this. We love the dask-labextension in qhub! 🤞

jacobtomlinson commented 2 years ago

cc @bryevdv who might be able to help here with the Bokeh related challenges.

viniciusdc commented 2 years ago

Hi folks, my apologies for the late reply on this. I was able to fix our issue with authentication following the changes @ian-r-rose proposed (Many thanks !!!).

ian-r-rose commented 2 years ago

Great! Would love to see a PR @viniciusdc!

viniciusdc commented 2 years ago

Great! Would love to see a PR @viniciusdc!

HI, @ian-r-rose thanks!! I just submitted a PR, hope you could have a look.

dchudz commented 1 year ago

@ian-r-rose is this something you're still interested in?

The dashboards for Coiled clusters are now SSL and behind a token authentication (like https://cluster-abcde.dask.host/status?token=SOME_TOKEN), so at the moment Coiled doesn't work with dask-labextension.

ian-r-rose commented 1 year ago

@dchudz, yes, I am still interested in this (though have limited time right now).

I had a branch with a proof-of-concept for this lying around somewhere, though cannot find it right now, I feel it might be lost to the sands of time... But basically, the approach is to

  1. Either resolve this issue upstream or patch it in this package. My proof-of-concept didn't do anything tricky with custom certs, it just allowed https URLs to be proxied and let the browser handle cert checking.
  2. A coiled cluster (or other cluster provider) would then provide URLs with a token appended as a query parameter.
  3. This package does some manual URL construction to point at the various individual dashboard pages. That would probably have to be updated to keep any query parameters around during that process.
Ademord commented 1 month ago

Hello, i am here interested in showing my cluster in the labextension plugin but I use TLS... apparently that is a problem. Can anyone provide an update on this?