Closed yuvipanda closed 5 months ago
All this functionality would be extremely useful to all the authenticators, so should be in base.
I think an important question now becomes - what is the point of GenericOAuthenticator? Is that just here for backwards compatibility?
I think the answer should be 'yes'. And in the future, anything that relies on OAuth2 functionality should just go to the base OAuth2 provider, and anything specific to any of the providers can go in their own files. And we suggest people who currently use Generic migrate to just using the base OAuth2 provider.
what is the point of GenericOAuthenticator?
I believe GenericOAuthenticator should eventually go away and be merged into the base class, as long as we can do it smoothly and it doesn't complicate things unnecessarily (I don't think it will). I think the big refactor in #526 was a big step toward making that feasible.
Not having done that yet, going forward in general I think we should avoid new features in Generic in favor of putting them in the base class, at least in most cases.
I was thinking about how we can do a very common thing - sync GitHub team memberships and org memberships to JupyterHub groups. With this PR as is, that's actually not possible - team memberships are not part of userdata! But they are part of the auth_model (because there's a populate_teams_in_auth_model property already).
I think there are two paths forward here for groups to work everywhere:
auth_model, not userdata. This also allows for using things like scope (which can control group membership with Auth0, from another community we are working with). We can name this appropriately, and then keep a deprecated claim_groups_key in Generic that notifies it's deprecated but works the way it already does.
My preference is to do (1)! There's a clean backwards compat story, and the model of 'put stuff into auth_state, that is available via auth_model, and you can pull stuff out of that for groups' seems clean enough to explain.
Option 1 sounds like a great choice, and also a great opportunity to not inherit the name while preserving compatibility.
This went through a few iterations, but is ready for review again! I've updated the PR body with more detailed information as well. Please take another look when you have time, @minrk :)
After trying to write clear docs for the traitlet, I came to the conclusion that we shouldn't allow usage of these without manage_groups also being True. I've updated a paragraph under Backwards compatibility with my rationale.
With that, this is ready to go!
I found https://github.com/jupyterhub/oauthenticator/issues/709, which I think is closed by this PR. And my change with respect to manage_groups reflects:
do we need to deal with additional groups, not specified upstream somehow? I'd say no, at least for now.
Which I totally agree with. Groups should have a 'single source of truth'.
Can I get someone to hit merge? Thanks!
@yuvipanda thank you for working on this!!!
I'm not able to invest time to review this fully atm :/ I saw some minor fixes I'd like to see updated:
Thanks @consideRatio.
I'm not able to invest time to review this fully atm :/
I don't think you need to, as long as you don't block it. I think it's seen enough eyes, and it's been sitting here for close to 3 months. In general I'd like us to move faster and be less perfectionist. For example, I see tests are failing now - probably unrelated, as they're in globus. This is super demotivating, and I don't want to lose myself as a contributor. I'm also going to proactively merge other PRs and change my attitude a little bit more, being the change I want to see.
With that said, I appreciate the other 3 points you have raised and the way you have communicated them. I've changed the references from 16.4 to 17.0, and added a section called Breaking Changes to call this out explicitly (rather than inside the Backwards compatibility section as before). I prefer the PR title as is, but you're welcome to change it if you wish.
These tests are failing on main as well, so I opened https://github.com/jupyterhub/oauthenticator/issues/743.
Back to now thinking this PR is ready to merge.
yay, thank you @manics
@yuvipanda there are numerous hooks to facilitate local customisation. What API should local code in these hooks use to access the list of groups that the user is a member of?
For example, depending on group membership, I want to filter my profile list and adjust singleuser server settings. If my code runs in auth_state_hook, options_form, pre_spawn_hook or modify_pod_hook, then it will be passed the spawner instance, or if it runs in post_auth_hook it will be passed the authenticator instance and auth_model (not auth_state). I take it that accessing auth_model["groups"] would be discouraged? Accessing spawner.user.groups yields the same as user.orm_user.groups, a list of SQLAlchemy objects rather than a trivial set of strings to test membership in. Should customisations try to work with that, or try to call something like spawner.authenticator.get_user_groups(spawner.user.get_auth_state())?
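To illustrate the "work with that" option only, here is a minimal sketch that turns the group objects into a plain set of names for membership tests. It assumes each entry in spawner.user.groups exposes a name attribute (as JupyterHub's ORM group objects do). The helper group_names and the stand-in objects are hypothetical, used only so the snippet runs without a hub; this is not a statement about which API is recommended.

```python
from types import SimpleNamespace


def group_names(user):
    """Return the names of a user's groups as a plain set of strings."""
    return {group.name for group in user.groups}


# Stand-ins for spawner.user and its group objects (made-up data).
fake_user = SimpleNamespace(
    groups=[SimpleNamespace(name="admins"), SimpleNamespace(name="gpu-users")]
)
fake_spawner = SimpleNamespace(user=fake_user)

# Inside a hook such as pre_spawn_hook, the real spawner would be used instead.
names = group_names(fake_spawner.user)
print("gpu-users" in names)
```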
Motivating use cases
External identity providers providing JupyterHub memberships is an extremely useful feature that should be present not just for GenericOAuthenticator but for all authenticators. But to do so in helpful ways, this PR considers two motivating use cases:
1. Auth0OAuthenticator. In Auth0, the scopes granted are what is used to provide the notion of 'can the user perform this task?', which can be used as group membership. This is what Auth0 recommends; there is currently no other way to do 'groups' in Auth0. scope is inside auth_model, but not auth_state, since scope is granted each time the user is logged in.
2. In GitHubOAuthenticator, we put the list of teams the user is in inside auth_state. This is the perfect piece of information to use for group membership.
oauth_user gets put inside auth_state, and in general auth_state is a good place for this kind of group information to be in. Authenticators can put arbitrary stuff inside auth_state and use it as they wish.
Approaches considered and rejected
auth_model with an auth_model_groups_key. This would be the same as the current claim_groups_key, but pick from the entire auth_model instead of just from the returned user object. This was the tack this PR was taking, primarily because I thought we needed it to handle use case 1 mentioned earlier. But it turns out I was wrong - I had thought that scope was part of auth_model but not auth_state, but we do put it there! And regardless, I also realized we don't expose auth_model anywhere, but we do expose auth_state. And I had a TODO for 'document what is in auth_model', and while trying to do that, decided we shouldn't expose that to configurable tweaks like this for now. So that was reverted in b337015306f51a8a9ea8e95924a7562bfa1e56ba and a different approach was taken.
Approach this PR takes
The general approach to group management is:
1. Get group information into auth_state.
2. Pick groups out with auth_state_groups_key, which can be either a callable or a dotted specification that generates a list of groups from something in auth_state.
This handles case (2) because the list of teams is already in auth_state. It can also handle (1) by us putting scope in some form inside auth_state. This also provides a clear extensible mechanism in the future for all group management - get it into auth_state (where it can be used for anything), and pick that out with auth_state_groups_key.
Backwards compatibility
claim_groups_key behavior is preserved, by being passed on to auth_state_groups_key in the base. It has been marked as deprecated. This is not a backwards compatibility break.
Group related functionality now requires manage_groups to be True, which was not the case earlier. Before this, if manage_groups is false but any of the group related authorization functionality (allowed_groups and admin_groups) is used, they control group related behavior but don't show up as JupyterHub groups. This causes confusion, as the 'groups' field in the admin panel will be empty, and possibly other group related behavior (such as future profile list filtering, for example) would not respect these groups. We basically would end up with two group concepts - first class JupyterHub groups (which will show up in admin panel, API, can be edited by admins, etc) as well as second class 'Authenticator' groups (which are only used for authorization and 'disappear' after that). I think this is unnecessary complication, and this is a good time to remove this distinction. Now, any kind of group related authorization functionality requires manage_groups to be True, and we are back to having only one notion of 'group'. We also remove the confusing part where you may have allowed_groups set to something, manually modify the groups the user is a part of in JupyterHub admin, and it silently has no effect. This is a breaking change for people who used groups functionality but set manage_groups to be False. However, I think that usage is fairly minor, because of the confusing behavior it causes. I have added the 'breaking' label here regardless.
Breaking change
Group related configuration (allowed_groups, admin_groups, claim_groups_key and auth_state_groups_key) now also requires manage_groups to be set to True.
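As a rough illustration of the rule above (not the code in this PR), the check can be sketched in plain Python over a dict of settings. The function name check_group_settings, the dict layout and the group name "data-team" are assumptions made up for this example. In a real jupyterhub_config.py the equivalent is setting manage_groups to True alongside options such as allowed_groups.

```python
# Option names taken from the list above; everything else is hypothetical.
GROUP_OPTIONS = (
    "allowed_groups",
    "admin_groups",
    "claim_groups_key",
    "auth_state_groups_key",
)


def check_group_settings(settings):
    """Return the group options in use, raising if manage_groups is not True."""
    used = [name for name in GROUP_OPTIONS if settings.get(name)]
    if used and settings.get("manage_groups") is not True:
        raise ValueError(f"{', '.join(used)} set, but manage_groups is not True")
    return used


# Accepted: a group option together with manage_groups = True.
check_group_settings({"manage_groups": True, "allowed_groups": {"data-team"}})

# Rejected: the same option without manage_groups.
try:
    check_group_settings({"allowed_groups": {"data-team"}})
except ValueError as err:
    print(f"rejected: {err}")
```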
TODO
auth_state_groups_key
Fixes https://github.com/jupyterhub/oauthenticator/issues/709
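For readers who want a concrete picture of the "callable or dotted specification" idea described under "Approach this PR takes", here is a minimal sketch, not the implementation in this PR. The function resolve_groups, the shape of example_state and the team slugs are all assumptions made up for illustration. A missing key path is treated as "no groups", which is a design choice of this sketch only.

```python
def resolve_groups(auth_state, groups_key):
    """Pick a set of group names out of auth_state.

    groups_key is either a callable that receives auth_state, or a dotted
    string such as "oauth_user.groups" naming a nested value.
    """
    if callable(groups_key):
        return set(groups_key(auth_state))
    value = auth_state
    for part in groups_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return set()  # key path missing: treat as "no groups"
        value = value[part]
    return set(value)


# Made-up auth_state; real contents depend on the authenticator in use.
example_state = {
    "oauth_user": {"groups": ["staff", "students"]},
    "teams": [{"slug": "core"}, {"slug": "docs"}],
}

# Dotted specification.
print(resolve_groups(example_state, "oauth_user.groups"))

# Callable, for values that need reshaping first.
print(resolve_groups(example_state, lambda state: [t["slug"] for t in state["teams"]]))
```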