ian-r-rose opened this issue 3 years ago
Thanks for opening the discussion @ian-r-rose. I'm not sure I have much to contribute here; we've leaned heavily on @consideratio and @TomAugspurger to figure out dask-gateway configurations and compatibility fixes for the Pangeo JupyterHubs over the last year. I can envision a future where it's more common for JupyterHub users to connect to Coiled-managed clusters instead of the integrated daskhub solution (@cspencerjones has been doing this... and perhaps could comment on how the labextension currently works for JupyterHub + Coiled?).
Good question: honestly I usually just open the dashboard link in a new browser tab (I used to use the labextension but these kinds of issues have deterred me). But I just tried to use the dask labextension from the pangeo aws cluster and no, it doesn't seem to work with coiled: I can open windows from the labextension but they are all blank.
Hey all, I wanted to give some context on the issue we are hitting when using the dask-labextension in QHub. We are using forward authentication with single sign-on to protect the dask-gateway dashboard URLs that are exposed. This means that, within the browser, the user has to log in to Keycloak via an OAuth flow. This way we can protect all public URLs even if the underlying service does not provide auth.
In our case the browser has the correct headers/cookies/tokens, so any browser request is able to access the dashboard. However, as the extension works today (https://github.com/dask/dask-labextension/blob/main/dask_labextension/dashboardhandler.py#L30), the check request is sent from the JupyterLab server instead. As far as I know it would be quite difficult to provide the headers etc. to the server, and doing so has some security risks as well.
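To make the failure mode concrete, here is a rough sketch of the difference between the two requests. It is not the actual dashboardhandler code, and the URL and cookie values are made up for illustration.

```python
# Why the browser succeeds but the server-side check does not, under forward auth.
# The dashboard URL and cookie below are invented placeholders.
import urllib.error
import urllib.request

DASHBOARD = "https://qhub.example.com/gateway/clusters/dev.abc123"

# What the browser effectively sends: it already holds the Keycloak session
# cookie from the OAuth flow, so the forward-auth proxy lets the request through.
browser_req = urllib.request.Request(
    DASHBOARD + "/individual-plots.json",
    headers={"Cookie": "_forward_auth=<session-cookie-from-oauth-flow>"},
)

# What the JupyterLab server-side check effectively sends: the same URL but with
# no session cookie or token, so the proxy answers with a redirect to the login
# page (or a 401/403) instead of the JSON listing of dashboard plots.
server_req = urllib.request.Request(DASHBOARD + "/individual-plots.json")

for req in (browser_req, server_req):
    try:
        with urllib.request.urlopen(req) as resp:
            print(req.full_url, resp.status)
    except urllib.error.URLError as err:
        print(req.full_url, err)
```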
Is it possible for all these checks to be done on the JavaScript side of the extension? Does the dask-labextension need the Python bits?
@ian-r-rose Do you have any thoughts here? We are interested in helping, but I'm unsure how much effort this is or how it should be done.
@rsignell-usgs this is the core issue making the lab-extension non-functional in QHub.
Sorry for the slow response @costrouc! (and :wave: @dharhas )
The biggest constraint here is that, as far as I know, the bokeh server that hosts the dashboard doesn't provide a way to set CORS headers (see discussion here). Historically, that has meant that this extension has had to jump through all sorts of hoops to reason about what might be on the other side of a given URL, including maintaining its own list of dashboard endpoints and this horrific hack involving sniffing out a static png to determine "alive-ness".
Those workarounds proved difficult to maintain, as both the static contents and the list of possible dashboards could change. Even worse, since the list of dashboards isn't even the same between distributed versions, it is impossible to know for sure whether the list of dashboards and their endpoints is correct, and things would randomly break depending on the user environment.
So in #159 I moved the dashboard check to the server side, so that we could always check the individual-plots.json endpoint for whether the dashboard is live and what it is hosting. But this proved to have some consequences for dashboards that are authenticated using a cookie/token, as @costrouc points out. I would definitely like to fix this use case, but I would want to do it in a way that doesn't mean restoring the above workarounds. To me this means either doing the checks from the browser (JavaScript) side again, or giving the dask-labextension server side a way to make auth'd checks to the individual-plots.json endpoint. Though if there are security-related concerns with that approach, I'd certainly be interested in hearing about them.
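For the second option, here is a minimal sketch of what an auth'd server-side check could look like, assuming the deployment can supply a set of headers. Nothing in it is existing dask-labextension configuration; the function and config names are invented to illustrate the idea.

```python
# Hypothetical sketch only: dask-labextension does not currently expose an
# auth_headers option. This just illustrates an "auth'd check" of the
# individual-plots.json endpoint from the server side.
import json
import urllib.request


def check_dashboard(dashboard_url, auth_headers=None):
    """Fetch <dashboard>/individual-plots.json, attaching deployment-provided auth."""
    url = dashboard_url.rstrip("/") + "/individual-plots.json"
    req = urllib.request.Request(url, headers=auth_headers or {})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# A deployment could then wire its token or session cookie through server
# configuration, e.g. something like (an invented config key, for illustration):
#   c.DaskDashboardCheck.auth_headers = {"Authorization": "Bearer <token>"}
```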
@dharhas and @ian-r-rose in our case the dask dashboards are served under the same DNS hostname, so CORS would not be an issue as far as I know. Would this make the problem any easier to solve?
Quite possibly. It should be fairly easy to check if you are comfortable building from a branch. You could add a new branch to this block which checks whether the hostname is the same and, if so, makes the request for individual-plots.json directly from the browser session using fetch(): https://github.com/dask/dask-labextension/blob/9433ecd21842417dda1579b78127dd4babfb8a4e/src/dashboard.tsx#L579-L623
If that does indeed fix the issue, I'd be happy to have that be a config option with sensible defaults.
@ian-r-rose thank you for the detailed responses; this is helpful! We will be tasking someone to work on this feature.
Hi @ian-r-rose, thanks for the links and the detailed explanation. Just a quick question: the above snippet will do the work from the browser's perspective, but wouldn't the check still fail on the server side, based on this code block? https://github.com/dask/dask-labextension/blob/9433ecd21842417dda1579b78127dd4babfb8a4e/dask_labextension/dashboardhandler.py#L18
@viniciusdc Yeah, the possible fix I suggested above is to avoid the DaskDashboardCheckHandler altogether when we expect the browser request to succeed.
Ooh, this sounds promising. Thanks for looking into this. We love the dask-labextension in qhub! 🤞
cc @bryevdv who might be able to help here with the Bokeh related challenges.
Hi folks, my apologies for the late reply on this. I was able to fix our issue with authentication following the changes @ian-r-rose proposed (Many thanks !!!).
Great! Would love to see a PR @viniciusdc!
Hi @ian-r-rose, thanks!! I just submitted a PR; I hope you can have a look.
@ian-r-rose is this something you're still interested in?
The dashboards for Coiled clusters are now served over SSL and sit behind token authentication (like https://cluster-abcde.dask.host/status?token=SOME_TOKEN), so at the moment Coiled doesn't work with dask-labextension.
@dchudz, yes, I am still interested in this (though have limited time right now).
I had a branch with a proof-of-concept for this lying around somewhere, though I cannot find it right now; I feel it might be lost to the sands of time... But basically, the approach is to
Hello, I am interested in showing my cluster in the labextension plugin, but I use TLS... apparently that is a problem. Can anyone provide an update on this?
The problem
Broadly speaking, this extension works by passing around URLs:
- It uses client.dashboard_link to find a URL for the dashboard, and uses that to fetch the individual plots endpoints and construct dashboard panes.
- It connects to clusters via a scheduler address, e.g. client = distributed.Client(<your-scheduler-address>).
This URL-based system has gotten us pretty far, and has been useful for a lot of people. Both of the above work fine if your clusters aren't protected by authentication or other security measures. But if that is not the case (as it often is for a cloud- or HPC-based cluster), then it will fail. That is to say, Client("tls://some-url:8786") will not connect without an SSL context, and a dashboard at https://some-authenticated-service:8787/status won't connect without authorization headers/tokens.
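For reference, the scheduler half of that is already solvable by handing distributed an explicit security context, roughly like the following (the certificate paths are placeholders for whatever a deployment provides):

```python
# Connecting to a TLS-protected scheduler with an explicit security context.
from distributed import Client
from distributed.security import Security

security = Security(
    tls_ca_file="cluster-ca.pem",
    tls_client_cert="client-cert.pem",
    tls_client_key="client-key.pem",
    require_encryption=True,
)
client = Client("tls://some-url:8786", security=security)
```

The dashboard half has no equivalent knob today, which is what the proposal below tries to address.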
Proposal
I don't think we will be able to (or will want to) cover authentication for all the manifold ways in which dask clusters are deployed today. But I think we can add an entrypoints-based plugin system to allow deployers of dask clusters to make sure that their services work with this package (similar to what @jacobtomlinson has done in dask-ctl, or what intake has done for drivers). At its most basic, I'm imagining a simple interface to allow library authors to say whether a given cluster belongs to them and authenticate with their service as necessary (a rough sketch is at the end of this post). The basic flow:
Connecting to a dashboard
Cluster connection
The ClusterManager already sends information about a cluster to the frontend to show in the side pane. So it (or whatever replaces it, cf. #189) could
Dask Gateway
dask-gateway also has functionality around authenticating and proxying remote clusters. Is it possible to adopt it as a dependency to handle more than what we do here (see some discussion in #135)? My instinct is that we don't want to require dask-gateway clusters in all circumstances, and that it won't be able to cover all the use cases we want to handle here (and by "handle" I mean, let plugin authors handle). Having dask-gateway implement a plugin here, on the other hand (fixing #135), should be fairly straightforward. But I'm happy to get pushback on that if others disagree. Especially curious what @rabernat and @scottyhq think.
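For concreteness, here is one hypothetical shape the plugin interface sketched above could take. None of these names or the entry-point group exist in dask-labextension today; they are only meant to illustrate the entrypoints-based idea.

```python
# Hypothetical plugin interface: the class, method names, and entry-point group
# below are invented for illustration.
from importlib.metadata import entry_points


class DashboardAuthPlugin:
    """A deployer-provided hook for recognizing and authenticating its clusters."""

    def owns(self, url: str) -> bool:
        """Return True if this plugin knows how to talk to the given cluster or dashboard URL."""
        raise NotImplementedError

    def auth_headers(self, url: str) -> dict:
        """Return any headers/cookies needed to reach the dashboard's endpoints."""
        raise NotImplementedError


def load_plugins():
    """Discover plugins that third-party packages register via an entry point, e.g.

    [options.entry_points]
    dask_labextension.plugins =
        mycloud = mycloud.labextension:MyCloudAuthPlugin
    """
    return [ep.load()() for ep in entry_points(group="dask_labextension.plugins")]
```

The extension could then, before probing a dashboard or connecting to a scheduler, ask each plugin whether it owns the URL and what authentication to attach.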