Closed ellisonbg closed 4 years ago
Thanks for opening this!
Keycloak is pretty cool! I think the way to do this is to provide lots of documentation & maybe a cookiecutter with appropriate chart dependencies. I don't think we should include keycloak in z2jh - IMO, z2jh is already way too complex, and we need to spend time simplifying it. Sharing db between keycloak and hub is also probably not a good idea - hub is designed to work ok if you delete the db and restart the hub, but I don't think keycloak is.
See #575 for similar concerns.
Thanks @yuvipanda that is really helpful and makes a lot of sense. Seems like another repo with documentation + helm charts would be appropriate for that. As we begin working on this stuff, we will create a repo and once it gets to a useful point see if it makes sense to move to the jupyterhub org. This is part of ongoing work on large complex deployments with sensitive/confidential data. We are talking about also documenting the security compliance aspects as well and some of this stuff might fit well with that.
Would Keycloak provide JupyterHub and, thus, Z2JH with a foundation for a real-time multi-authentication solution (using multiple authentication mechanisms at the same time, e.g., LDAP, GitHub, AD)?
I think keycloak does support that type of usage case.
@ellisonbg Thank you. That's good to know.
We have a couple of z2jh + keycloak deployments, some with ldap. we also use keycloak groups and roles to manage resources quota and additional dataset mounts for teams. happy to contribute to docs.
@clkao Your contribution will be much appreciated. Some multi-authentication configuration examples would be especially valuable (please share them here, if you think it's too much for the documentation).
An alternative to Keycloak is Gluu.
It looks like a promising tool for identity and access management as well. Listing it here for reference.
Ohh, good find!
I'd be interested in seeing a metachart that utilizes z2jh + keycloak and configures them to work together. It is something I've considered doing myself for a hobby project of deploying a JupyterHub for schools in Sweden, but I have not gotten around to it yet.
Give me 3 thumbs up and I'll start doing it hahaha =D Some expectation helps!
I took some time recently to research using Keycloak for a potential Z2JH's multi-authentication solution and have the following question. What would be the optimal or preferred approach: use Keycloak as an IdM solution for a Z2JH-deployed K8s cluster (via K8s RBAC) or for relevant Web application (JH, in this case)? In other words, what should be the preferred integration target: K8s cluster or Web application running on it?
@ablekh web application! I have ended up using Okta instead of Keycloak within my company, and I'm very pleased, except right now where I'm pulling my hair due to a bug, but... still very happy.
I don't fully grasp the alternative, but I'd say that KeyCloak or Okta or whatever acts as the Identity Provider (IdP) for the JH users, using OpenID Connect as an interface between JH and the IdP. And, it should not communicate with the k8s cluster itself or similar.
JupyterHub delegates user auth stuff -> KeyCloak as primary IdP -> could delegate to one of many secondary IdP etc
@consideRatio Thank you for your prompt clarification. AFAIK, Okta is a commercial solution, so it is a no-go for our use case. I will continue exploring a Keycloak-based solution and keep everyone posted (or ask additional questions ;-)).
@ablekh ah yeah, but note that you can use it for free unless you have more than 1000 unique logins a month, I think. I was able to configure a lot of things that were of use to me in Okta, so I became quite happy with using it.
For example, I can move users to a group, and based on whether they belong to a group, I can configure a "scope", which is something that the app authenticating with Okta can request, and if they request a scope, they can be provided a "claim" like "gpu_access": true.
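A minimal sketch of how a claim such as "gpu_access": true could be turned into per-user resources on the JupyterHub side. Everything except the gpu_access claim name is a hypothetical illustration: the "data-science" group, the returned keys, and the limits are assumptions, and in a real hub this logic would live wherever the user's claims are available to the spawner configuration.

```python
def resources_for_claims(claims):
    """Map identity-provider claims to resource settings for a user.

    The keys and limits below are illustrative assumptions, not a real API.
    """
    # Conservative defaults for users without any special claims.
    resources = {"cpu_limit": 1, "mem_limit": "2G", "gpu": 0}
    # Only an explicit boolean true grants GPU access.
    if claims.get("gpu_access") is True:
        resources["gpu"] = 1
    # Hypothetical group that receives a larger memory allowance.
    if "data-science" in claims.get("groups", []):
        resources["mem_limit"] = "8G"
    return resources


print(resources_for_claims({"gpu_access": True, "groups": ["data-science"]}))
```

Keeping the mapping in one pure function makes it easy to test independently of the identity provider.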
@consideRatio Hmm ... I see. That is good to know, thanks! Though I prefer to avoid platforms that have these kinds of limits. IMO, even if those limits are not applicable to today's state of things, they will surely become an obstacle (or vendor lock-in) further down the road ...
I'm 100% with you @ablekh, it took a long time before I decided to lock in myself for the deployment I maintain for my company. I wonder how KeyCloak compares.
@consideRatio Thank you! I understand. I will keep you posted on my progress ... (obviously, since this is just one of many things I work (or plan to work) on, it will most likely be slower than I would like it to be). :-)
Two bits of experience I've gained from using KeyCloak: the documentation is pretty "enterprise-y" (I find it near to impossible to find stuff out quickly), there is a helm chart for deploying to k8s, it is hard to configure from a config file (some things just can't be done from a file :-/), it eats a lot of resources, it seems to work, hard to customise things (might be easier if you know Java).
I have https://github.com/ory/hydra on my list of things to try out.
I finally had some time to put together a meta chart that configures z2jh and keycloak. the required value file ain't pretty because helm's weird scoping for subcharts. i'll post it this week.
@betatim Thank you for sharing your experience and thoughts. I agree - Keycloak documentation is not for the faint of heart. :-) I ran across Hydra in my research as well, but Hydra is not an identity provider (IdP); it is a login server (for a similar, but more lightweight/microservice-y, project, also see https://github.com/tarent/loginsrv). As you likely know, a comprehensive IdP solution comparable to Keycloak would be Gluu, but Keycloak seems more flexible, based on my initial look. The former also has the advantage of IMO being a more sustainable platform for the long term, because it is backed by Red Hat (which uses it in many/some of their products), whereas Gluu is backed by a single startup vendor.
P.S. I have recently tried to use Keycloak Helm chart on Azure (via Rancher, which I'm currently exploring as well), but experienced some issues (e.g., lack of external IP, even after enabling that option - or so I think).
@clkao That's great news! I look forward to checking it out and learning something new. ;-)
Hydra is not an identity provider (IdP)
Nods, the fact that you have to build some stuff yourself/already have it is a feature for me in that particular project.
OK, so while I'm happy to see an integrated Helm-based solution by @clkao, I decided to explore another, slightly less integrated, option for this, which should be more than enough for our current project. I installed dockerized Keycloak (BTW, I like it very much!) on a separate VM (on Azure) and, after some struggle, was able to make it work, for now in a simple, non-production configuration (embedded DB, lack of clustering).
Now, the question is how to implement automatic forwarding of Keycloak-authenticated users' requests to JupyterHub and, at the same time, have its authentication disabled (obviously, allowing only requests originating from a Keycloak network/instance). So, after a bit of research (and skipping dummyauthenticator and tmpauthenticator as arguably unneeded extra layers), I ran across https://github.com/jupyterhub/jupyterhub/issues/1065 and https://github.com/jupyterhub/jupyterhub/pull/1066, where @minrk has recommended the use of a "RemoteUserAuthenticator" (I assume that it implies this: https://github.com/cwaldbieser/jhub_remote_user_authenticator). So, what exactly needs to be done in order to integrate a "Keycloak-on-a-separate-host" with a Z2JH cluster via this remote user authenticator for transparent multi-protocol and multi-IdP authentication?
Essentially, I want users, upon authentication with Keycloak, to be redirected to a spawner profile selection page (if it is possible to combine both wrapspawner and batchspawner options) or (if not possible) to a custom gateway page that would allow users to select their main destination: standard JupyterHub or an HPC-enabled one. A third alternative (instead of the gateway page) would be to use Keycloak RBAC (roles or groups) in order to automatically redirect to the relevant destination, based on a particular user's roles or groups. Obviously, this approach is less flexible, since most typical use cases would imply users' freedom to choose the type of workloads and their destinations. Nevertheless, there might be a need for option 3 as well. Please advise.
cc: @consideRatio @ellisonbg @yuvipanda @betatim @cwaldbieser @minrk
I would definitely not share a database. If deploying against mysql/postgres, then both talking to the same server (though still different "databases" on the same server) would be fine. I agree that this should probably be mostly a documentation project, and I don't have a good handle on whether a 'section' in z2jh or a new collection of docs is the right way to go.
I do think some more detailed, complete step-by-step user-stories along the lines of "This is a deployment with X and Y for Z" would be good for this purpose. A keycloak example would fit really well, here.
If there are changes to this chart, adding first-class auth.type: keycloak support I think is appropriate. That should configure the GenericOAuthenticator with appropriate settings.
You generally put something like this: https://github.com/pusher/oauth2_proxy in front of the service. Based on which service you go to, it takes you back there once authenticated. So it's under the user's control which resource to go to.
Thanks, Kevin
@ablekh I'd use Keycloak as an OpenID Connect provider, which builds on OAuth2, and configure JH to use it via the GenericOAuthenticator.
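As a rough sketch of that setup, the helper below derives the endpoint URLs a GenericOAuthenticator would need from a Keycloak base URL and realm. The host names, realm, and client values are placeholders, the "/auth" path prefix follows the logout URL quoted later in this thread (newer Keycloak releases may serve realms without it), and the setting names (authorize_url, token_url, userdata_url) are assumed to match the oauthenticator version in use, so check them against its documentation.

```python
def keycloak_oauth_settings(keycloak_url, realm, hub_url, client_id, client_secret):
    """Build GenericOAuthenticator-style settings for one Keycloak realm."""
    # Common prefix of the realm's OpenID Connect endpoints (legacy /auth layout).
    oidc = "%s/auth/realms/%s/protocol/openid-connect" % (keycloak_url.rstrip("/"), realm)
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "oauth_callback_url": hub_url.rstrip("/") + "/hub/oauth_callback",
        "authorize_url": oidc + "/auth",
        "token_url": oidc + "/token",
        "userdata_url": oidc + "/userinfo",
    }


# Placeholder values; replace with those of your own deployment.
settings = keycloak_oauth_settings(
    "https://keycloak.example.org", "jupyter", "https://hub.example.org",
    "jupyterhub", "change-me",
)
# In jupyterhub_config.py these could be applied to the authenticator, e.g.:
#   for name, value in settings.items():
#       setattr(c.GenericOAuthenticator, name, value)
for name, value in sorted(settings.items()):
    print(name, "=", value)
```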
@kfox1111 @consideRatio Thank you so much for your helpful suggestions. It is good to know about oauth2_proxy (starred/bookmarked). However, I will most likely prefer GenericOAuthenticator due to potentially easier integration with JupyterHub infrastructure. In the meantime, I discovered this relevant, but still open, issue: https://github.com/jupyterhub/oauthenticator/issues/107. Will keep you posted about my Keycloak journey ... :-)
Hmmmm @ablekh I looked into the logout thing in more detail. I did not see any trace of the oauthenticator's generic Authenticator class supporting a redirect to a provider's logout endpoint. Instead, it will probably just log you in again automatically if you return, as the cookie confirming you are signed in on keycloak will remain, so it will simply tell JH again soon thereafter that "ALL OKAY!".
So, what we want is for JH to be able to point to the logout endpoint of keycloak, which does have one. Looking into this further. Okay, done: while the oauthenticator does not mention logout, it is founded on the Authenticator base class that defines a logout_url FUNCTION (not traitlet).
https://github.com/jupyterhub/jupyterhub/blob/master/jupyterhub/auth.py#L582-L595
So, perhaps the way to configure "deep signout", by which I mean "sign out of keycloak" rather than "sign out of JH", is to replace this function.
Hmmm... I think I may want to make a proper "OIDC" Authenticator for JupyterHub supporting all of this at once, by providing a very limited configuration that automatically configures itself using the ".well-known/openid-configuration" endpoint they have containing all info required to know about the authorize, logout, userinfo endpoints for example.
Creating a JH issue to represent this need.
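To make the self-configuration idea concrete, here is a small sketch of reading the endpoints out of a discovery document. The key names (authorization_endpoint, token_endpoint, userinfo_endpoint, end_session_endpoint) are the standard OIDC discovery fields; the issuer URL and the sample document are made up, and fetching the document over HTTP is left out.

```python
import json


def discovery_url(issuer):
    """Return the discovery document URL for an OIDC issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def endpoints_from_discovery(document):
    """Pick the endpoints an authenticator needs out of a discovery document.

    Accepts either the raw JSON text or an already parsed dict. Endpoints
    the provider does not advertise come back as None.
    """
    doc = json.loads(document) if isinstance(document, str) else document
    wanted = {
        "authorize": "authorization_endpoint",
        "token": "token_endpoint",
        "userinfo": "userinfo_endpoint",
        "logout": "end_session_endpoint",
    }
    return {name: doc.get(key) for name, key in wanted.items()}


# Made-up sample document for a hypothetical issuer.
sample = json.dumps({
    "issuer": "https://idp.example.org/realms/demo",
    "authorization_endpoint": "https://idp.example.org/realms/demo/auth",
    "token_endpoint": "https://idp.example.org/realms/demo/token",
    "userinfo_endpoint": "https://idp.example.org/realms/demo/userinfo",
})
print(discovery_url("https://idp.example.org/realms/demo"))
print(endpoints_from_discovery(sample))
```

Returning None for a missing logout endpoint lets the caller fall back to a hub-only logout when the provider does not advertise one.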
@consideRatio I appreciate your additional insights. However, I'm a bit confused by some of them. Perhaps I'm missing something, but, based on discussions in https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/470 and https://github.com/jupyterhub/oauthenticator/issues/107, I thought that all we need is to implement a logout handler for the to-be-created Keycloak authenticator (that would reside in keycloak.py), similarly to how the JupyterHub Globus authenticator implements that functionality: https://github.com/jupyterhub/oauthenticator/blob/master/oauthenticator/globus.py#L35. So, using this approach, if I understand correctly, would not require any rework in the general OAuth authenticator code that you suggested above. Hopefully, some other subscribers to this issue will chime in and clarify things for us (or just for me :-)).
Yepp that would do it, but I'm a bit unhappy about the proliferation of different handlers being created and would prefer to see a generic one with examples on how to use it with keycloak, auth0, okta, gluu, etc.
I figure I'll investigate if we can configure a logout url without too much effort.
Gah this is complicated! Especially if you use keycloak/okta/gluu/auth0 as a proxy to other identity providers like Microsoft Azure AD or something. Mind exploding...
To be continued :D
@consideRatio Wow, wow -- take it easy! :-) Layering IdPs and IdMs is, indeed, quite complicated. I understand. And I agree that the more functionality we could extract into generic (in this case, OAuth authenticator's) code, the better. The problem is that I'm trying to have some workable PoC solution relatively soon and don't know how long the suggested code refactoring might take. Keep us posted ... :-)
The auth flow for a user goes like this:
So now you are logged in to GitHub/KeyCloak and to the hub.
When you click "Logout" in the hub the session between you and the hub ends. If you attempt to login to the hub it sends you to GitHub/KC again to have your identity established. However your GitHub/KC session is still open/valid so you get redirected back to the hub straight away.
To me it isn't clear that in the general case "logout from the hub" should also end my GitHub/KC login session. You might want this or not.
Another aspect to consider is that you can configure the length of the GitHub/KC session. You could set it to be very short or require you to provide your password every time, etc.
It also means that if you logout from your GitHub session your hub session doesn't end.
There are two things to investigate:
Does the refresh_user functionality of the hub let you notice that the KC session has ended?
@betatim Your clear comment is much appreciated. It is pretty close to what was my understanding, though it definitely was much less structured in my head before. :-) So, assuming that we are on the same page on the authentication flow, in response to "to me it isn't clear that in the general case "logout from the hub" should also end my GitHub/KC login session. You might want this or not" I suggest a simple and IMO the most effective solution: make this (and potentially other related options, as in your 1./2. above) configurable by the JupyterHub cluster's administrator (on a per-cluster basis, which is probably the easiest way Helm-wise).
My inclination is that jupyterhub should not log someone out of keycloak or any other external OAuth provider. For example, Google allows you to sign in to multiple accounts at the same time. I've been playing around with requests-oauthlib recently for a different project, and the Google OIDC example includes the option to select one of multiple accounts (prompt="select_account"). In this context signing out of Jupyter would allow you to change Google accounts without having to log in to the Google account again.
One thing I've seen on some other apps is that after you logout of the app you're given a link where you can log out of the external provider.
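For illustration, the account-picker behaviour comes from one extra query parameter on the authorization request. The sketch below builds such a URL with the standard library only; the function name, endpoint, client ID, and callback URL are placeholders, and it is not how requests-oauthlib or JupyterHub construct the request internally.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def authorize_redirect_url(authorize_url, client_id, redirect_uri, state, prompt=None):
    """Build an OAuth2/OIDC authorization-code request URL.

    Passing prompt="select_account" asks the provider to show an account
    chooser instead of silently reusing the current session.
    """
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid",
        "state": state,
    }
    if prompt:
        params["prompt"] = prompt
    return authorize_url + "?" + urlencode(params)


url = authorize_redirect_url(
    "https://accounts.example.org/authorize",
    "my-client",
    "https://hub.example.org/hub/oauth_callback",
    "abc123",
    prompt="select_account",
)
print(parse_qs(urlsplit(url).query)["prompt"])
```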
@manics The options that you're suggesting (default to not logging out of external OAuth IdP and providing a logout link) are IMO both feasible. However, I believe that the ultimate behavior should be driven by what specific use case's requirements (or preferences) are and, thus, should be controlled as an option configurable by cluster's administrator (see my comment above).
We currently log users out of keycloak, since we enable auto_login and the users actually never know about keycloak. This might not be the desired behaviour with other providers as discussed above.
To do this with GenericOAuthenticator you need to inject a custom handler:

    import urllib.parse

    from jupyterhub.handlers.login import LogoutHandler
    from tornado import gen

    # keycloak_scheme, keycloak_host and realm must be defined for your deployment.
    class OIDCLogoutHandler(LogoutHandler):
        kc_logout_url = '%s://%s/auth/realms/%s/protocol/openid-connect/logout' % (keycloak_scheme, keycloak_host, realm)

        @gen.coroutine
        def get(self):
            # redirect to keycloak logout url and redirect back with kc=true parameters
            # then proceed with the original logout method.
            logout_kc = self.get_argument('kc', '')
            if logout_kc != 'true':
                logout_url = self.request.full_url() + '?kc=true'
                self.redirect(self.kc_logout_url + '?' + urllib.parse.urlencode({'redirect_uri': logout_url}))
            else:
                # yield so that a coroutine-based parent get() is actually run
                yield super().get()
It might be worth making this an option in https://github.com/ausecocloud/keycloakauthenticator
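One detail of the handler above is that appending '?kc=true' by string concatenation assumes the logout URL has no query string yet. A hedged sketch of building the same two-step redirect with urllib.parse follows; the helper names and host names are hypothetical, and the redirect_uri parameter name is taken from the handler above (other Keycloak versions may expect a different parameter).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query_param(url, key, value):
    """Append one query parameter, preserving any existing query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


def keycloak_logout_redirect(kc_logout_url, hub_logout_url):
    """Build the URL that logs out of Keycloak and then returns to the hub.

    The hub logout URL is tagged with kc=true so the handler knows the
    Keycloak step is done and can proceed with the normal hub logout.
    """
    return_to = add_query_param(hub_logout_url, "kc", "true")
    return add_query_param(kc_logout_url, "redirect_uri", return_to)


print(keycloak_logout_redirect(
    "https://keycloak.example.org/auth/realms/demo/protocol/openid-connect/logout",
    "https://hub.example.org/hub/logout",
))
```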
@manics if you have an OIDC provider, you also get a "logout" endpoint from the ".well-known/openid-configuration" endpoint. This endpoint is just a URL that returns a JSON response describing the login/logout endpoints etc. for the OIDC provider.
If that is provided by default by the OIDC provider, I think it could make sense to consume it as default if we make an OIDC-specific Authenticator.
@ablekh for a PoC, I'd use the Generic oauthenticator and then make sure the logout function of the base Authenticator class it derives from is configured/overridden to reference the relevant logout endpoint. This may require you to provide an extraConfig defining a new Authenticator class derived from the generic one and overriding the logout_url function and perhaps also the get_handlers function.
@manics Thank you for sharing your setup (I think I'd be fine with it, if auto_login = true would make logout work as expected).
@consideRatio Thank you for sharing your advice - overriding base OAuth logout handler was my original plan (based on advice by @clkao I referenced earlier). However, if the solution suggested by @manics above will work as I expect, that would work fine for me, I think.
@ablekh I wonder how things are going. Are you still working on the logout problem? If so, may I ask for example code or some advice on that?
@FCtj I have just recently started working on Keycloak integration, so no code yet exists that I could share. Unfortunately, I had to switch to working on other tasks. Will resume my Keycloak journey later this month (hopefully). Will keep everyone posted in this thread.
I have gathered a config that works for a JupyterHub (not on K8S) in https://github.com/datalayer/jupyterhub-oidc. The repo allows you to try it with a Keycloak container, but the code should work with any other OIDC provider. I plan to make it work on K8S.
@echarles the link seems to be down
@gabrielpatricio moved to https://github.com/datalayer/datalayer/tree/master/lab/apps/jupyterhub-oidc
Hi, I am not able to access it. Can you please provide the correct location?
Moved one folder above (will try to not change it anymore...) https://github.com/datalayer/datalayer/tree/master/lab/jupyterhub-oidc
thx
Hi guys, I'm new to Keycloak. Anyway, I have integrated it with Windows AD and users are now syncing from time to time. But my requirement is to connect a client Linux machine to this and log in to that client machine using AD user credentials. Is this possible with Keycloak?
I'm closing this issue, overall I'd say we cannot manage to sustainably maintain documentation of keycloak integration with this Helm chart, other than to describe how to use OAuth2 to authenticate with a general identity provider, such as keycloak.
This week a few of us (@townsenddw @Zsailer @rafael-ladislau) began to experiment more with Keycloak (https://www.keycloak.org/). This is part of a new effort to improve JupyterHub's integration with directory services (ldap, AD) and enable groups to be provided by the directory service. Overall, our tests were very encouraging. Keycloak is easy to set up, very powerful, and worked out of the box with the generic OAuth authenticator. It will make it very easy to bridge between other identity providers and ldap/AD and do custom mapping in that bridge (for example, assign docker images or instance types to users/groups).
Some questions:
@minrk @yuvipanda