Closed dany74q closed 3 years ago
Hi @dany74q, is there any reason that you would rather keep using the legacy AAD?
Hey @weinong !
As discussed in our short email exchange - this also happens in the managed AAD integration for SPN-s w/ group overage; In the flow, the SPN oid is still passed to the graph API - it 404s (as it's not a user-entity) and that fails the request.
I think it would be great to migrate to the new (now GA) API that can list group memberships for both SPNs and users: https://docs.microsoft.com/en-us/graph/api/directoryobject-getmembergroups?view=graph-rest-1.0&tabs=http
Two questions remain and I'd really appreciate your help on those:
According to the docs, the directory objects API requires the Directory.Read.All graph permission; in order for this to work - should this permission be added to the managed AKS AAD Server application ?
I couldn't find the AKS service that exchanges the token; would I be able to mimic this with regular OBO instead ? or instead, could I get access to the first party AKS obo service ?
Thank you for your time !
User.ReadBasic.All and GroupMember.Read.All. I believe what we configured on AKS server app should suffice.
Hey @weinong - Thanks again !
I've tested a user-initiated OBO flow according to the spec (and later w/ your awesome script) - the current permissions only suffice to read the memberships of users and groups via the new API, but not for applications / spns;
In order for it to work w/ service principals, I had to also grant the Application.Read.All (c79f8feb-a9db-4090-85f9-90d820caa0eb) permission (which requires admin consent); at that point it worked great for querying both users and service principals via the new API endpoint.
Two concerns that came up while I was tinkering with it were:
The actual flow in AKS, though, would have an spn issuing a client_credentials token for AKS' server app - so there is no user interaction on that front; I assume the AKS first party service OBO does something behind the scenes to make the actual exchange succeed (maybe it saves a user's refresh token from the time of cluster creation, or something similar - not sure).
Either way, I can't be fully convinced it would work with the AKS obo service w/o you - would really appreciate your help on that front.
Thing is, I couldn't really figure out where the current graph permissions are granted - I've looked at the AKS server app service principal in my tenant, and I don't see any delegated graph permissions in the response, nor do I see it had been user/admin consented in any way.
Given that in a normal OBO flow, the server app is the one which needs the actual graph permissions, I assume that the AKS server app does have such permissions, but they are only viewable from the directory in which the app is registered - that, or the AKS obo service uses another application, or has some other logic I don't fully comprehend.
Either way - do you think it would be possible to grant such additional permissions in one centralized place and it would just-work for both existing and new tenants, or would the app need to be re-registered for existing tenants ?
Thanks yet again !
@weinong - I've opened a PR that migrates to the new endpoint.
Two things that are worth noting:
The new endpoint does not support retrieval via UPNs, but only object ids; I've read in the docs that, potentially, there could be cases where oid is blank and there's no guarantee it's actually passed-in; as such - I've used the new endpoint only when there's an oid claim present
UPNs of external users may contain characters in need of escaping (e.g. "#") - I've addressed that in the PR as well.
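The endpoint choice and the escaping described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `member_groups_path` and the exact claim names (`oid`, `upn`) are assumptions, and the sample UPN is made up.

```python
from urllib.parse import quote


def member_groups_path(claims: dict) -> str:
    """Pick the Graph path used to look up group memberships (hypothetical helper).

    The directoryObjects endpoint only accepts object ids, so it is used only
    when the token carries an oid claim; otherwise fall back to the users
    endpoint keyed by UPN.
    """
    oid = claims.get("oid")
    if oid:
        return f"/directoryObjects/{quote(oid, safe='')}/getMemberGroups"

    upn = claims.get("upn")
    if not upn:
        raise ValueError("token carries neither an oid nor a upn claim")
    # External-user UPNs can contain '#', which would otherwise start a URL
    # fragment, so every reserved character is percent-encoded.
    return f"/users/{quote(upn, safe='')}/getMemberGroups"
```

For example, an external UPN such as `jane_contoso.com#EXT#@fabrikam.onmicrosoft.com` would have its `#` characters encoded as `%23` before being placed in the path.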
I've tested it via the obo flow per the spec, but would love if you'd be able to spin it up and test it against the AKS managed obo service (the comment above about the extra permission still holds, I've updated the docs accordingly).
hmm. how did you test it if obo flow is not supported for app?
I've tested it with both the client credentials and regular OBO flows that guard supports; The first consisted of a guard app that has the Application.Read.All, GroupMember.Read.All and User.Read.All graph API roles.
The latter consisted of a guard app that has the Application.Read.All, GroupMember.Read.All and User.ReadBasic.All graph API delegated scopes. In this flow, the first token (the one whose audience is the OBO guard app) must be a user-initiated token for the exchange to work - that's what I did (created a client app, got a user-initiated token via the device code flow, passed it on to guard).
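For readers less familiar with the two flows, here is a rough sketch of the token requests involved. The form parameter names follow the Microsoft identity platform v2.0 token endpoint; that guard uses the v2.0 endpoint and the `.default` Graph scope is an assumption of this sketch, and the helper names and sample credentials are hypothetical.

```python
from urllib.parse import urlencode

GRAPH_DEFAULT_SCOPE = "https://graph.microsoft.com/.default"


def token_endpoint(tenant_id: str) -> str:
    """v2.0 token endpoint for a tenant (assumed; v1.0 uses a different path)."""
    return f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"


def client_credentials_form(client_id: str, client_secret: str) -> str:
    """Form body for an app-only token: Graph application roles apply."""
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": GRAPH_DEFAULT_SCOPE,
    })


def obo_form(client_id: str, client_secret: str, user_assertion: str) -> str:
    """Form body for an on-behalf-of exchange: Graph delegated scopes apply.

    user_assertion is the incoming token whose audience is the guard app;
    per the thread, it has to be a user-initiated token.
    """
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "client_id": client_id,
        "client_secret": client_secret,
        "assertion": user_assertion,
        "scope": GRAPH_DEFAULT_SCOPE,
        "requested_token_use": "on_behalf_of",
    })
```

Either body would be POSTed to `token_endpoint(tenant_id)` with the `application/x-www-form-urlencoded` content type.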
The AKS-obo first party service is unique in the way it seems to support exchange via tokens issued from SPNs via a client_credentials flow - and from all the docs I've read, I can't seem to mimic such a behavior.
Does the above sound reasonable ? Thanks @weinong !
> The AKS-obo first party service is unique in the way it seems to support exchange via tokens issued from SPNs via a client_credentials flow - and from all the docs I've read, I can't seem to mimic such a behavior.
How are you sure about this?
... This is expected: OBO for service principals is not a supported scenario yet (by Azure AD) in 3rd party applications.
I think that this is the culprit - the AKS obo service uses a 1st party app behind the scenes, thus SPN obo is supported in that flow.
I can't test it myself (don't have access to a 1st party app) - using a 3rd party app, AAD throws the AADSTS7000114 error (<app> is not allowed to make application on-behalf-of calls).
I see. That's a misunderstanding! It's supported in managed AAD with a caveat: it doesn't support more than 250 groups. The reason is simple. Guard will only do the obo flow when there is an overage claim! So when an SP has more than 250 groups, it will be broken as well, like you saw in the third party app. So implementing the directory getMemberGroups is kind of moot.
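The "only on overage" behaviour can be pictured with a small sketch. Azure AD signals overage by omitting the `groups` claim and adding a `_claim_names` entry that points into `_claim_sources`; how guard actually checks for this is not shown in the thread, so the function names below and the `fetch_groups` callback are hypothetical.

```python
from typing import Callable, List


def has_group_overage(claims: dict) -> bool:
    """True when the token defers the groups claim to an external source."""
    claim_names = claims.get("_claim_names")
    return isinstance(claim_names, dict) and "groups" in claim_names


def groups_from_token(claims: dict, fetch_groups: Callable[[str], List[str]]) -> List[str]:
    """Use the in-token groups claim unless overage forces a Graph lookup.

    fetch_groups stands in for the Graph call (the step that needs OBO or
    client credentials); it is only invoked in the overage case.
    """
    if has_group_overage(claims):
        return fetch_groups(claims["oid"])
    return list(claims.get("groups", []))
```

Under this reading, a service principal below the overage threshold never reaches the Graph call at all, which is why the failure only shows up for SPs assigned to many groups.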
I misread your comment. I guess I will have to test it out using 1P app first!
Hey @weinong, hope you had an awesome weekend !
Do you think you'll be able to take it for a spin in the following couple of weeks ?
Appreciate your time !
I'm planning to test it this week
I tested it and also asked the internal team to confirm my finding: unfortunately, the obo flow for SPs is not supported. The suggestion is to use an application permission to query an SP's group membership. For your issue, I'd suggest opening an issue on https://github.com/Azure/AKS/issues
I see, thanks a ton for testing it out @weinong.
So just to clarify - given a group overage - guard will only work for user-initiated tokens; Meaning sp-only overage isn't supported in either the manual or the aks-managed OBO flow - Does that sound right ?
Secondly - I wonder, would it not still in fact be beneficial to migrate to the new API ? This gives one the ability to manually set up guard with the new permissions to make it work for both users and service principals (in a client credentials setup, for instance);
Also, it is backwards compatible with the previous API - users may safely upgrade and add the new permissions later on to enable this use case.
Thanks yet again !
> So just to clarify - given a group overage - guard will only work for user-initiated tokens; Meaning sp-only overage isn't supported in either the manual or the aks-managed OBO flow - Does that sound right ?
correct
> Secondly - I wonder, would it not still in fact be beneficial to migrate to the new API ? This gives one the ability to manually set up guard with the new permissions to make it work for both users and service principals (in a client credentials setup, for instance);
From a non-AKS perspective, yes, I see the benefit, but it will not work with AKS at all. So it comes down to whether there is any real need or not in non-AKS environments.
Thanks for the clarification !
@tamalsaha - Do you think this could prove beneficial for self deployed guard installations ?
To summarize, this gives one the ability to connect to the cluster via a service principal vs user-initiated tokens in a client credentials flow - which is beneficial for backend services connecting to the cluster, sans the management of id & refresh tokens.
@dany74q I have never heard such a request in the Guard community, albeit there are very few people using it in an unmanaged fashion. I work closely with the ARC team, who use Guard in a self-managed fashion, and I haven't received any complaints from them (or their customers). Yes, they are very well aware of this limitation, too. So I feel there is no need to address this for unmanaged Guard until someone speaks up.
just my 2 cents.
Hey hey !
In the past few days, I've had a chance to look over the AKS-AAD integrations (which utilize guard, under the hood); That was during some research I'm doing for implementing non interactive logins for AAD enabled clusters (both legacy & managed).
Specifically, I'm looking at auth via service principals, passing a client_credentials-flow token, which holds a service principal's object id claim and no UPN claim.
Managed AAD clusters have it mostly solved (a-la kubelogin) - one only needs to issue a token for the multi tenant AKS server app, and any directory entity could be used in this flow - the groups JWT claim is considered along with any overage data.
However, in case of overage - ms graph is consulted, and the request fails for SPNs, as msgraph 404-s fetching group memberships for service principals; This is because the graph API currently used only supports retrieving memberships for users (and not service principals).
As for legacy clusters, the groups JWT claim isn't considered at all - ms graph is always consulted in fetching group memberships for the given object id; Specifically, the "/users/id/getMemberGroups" endpoint is used.
When an spn oid is passed - the API above 404-s in the same manner and it fails the request altogether.
I believe that ms graph had no API for retrieving groups for any given entity back in the day, but now one does exist - "/directoryObjects/id/getMemberGroups".
I was wondering if it would be a good idea to migrate to the new endpoint - this would enable non interactive login flows to legacy clusters and fix the flow for managed integrations for SPNs assigned to many groups.
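To make the difference between the two endpoints concrete, here is a minimal sketch that builds the request for either one. The paths come from the Graph v1.0 API; the helper name `member_groups_request` and the choice of `securityEnabledOnly: false` are assumptions of this sketch rather than what guard necessarily sends.

```python
import json
from urllib.parse import quote

GRAPH_BASE = "https://graph.microsoft.com/v1.0"  # public Microsoft Graph v1.0 root


def member_groups_request(object_id: str, any_directory_object: bool = True):
    """Return (url, json_body) for a getMemberGroups POST (hypothetical helper).

    any_directory_object=True targets /directoryObjects/{id}, which resolves
    users and service principals alike; False targets the users-only
    endpoint, which 404s for a service principal's object id.
    """
    collection = "directoryObjects" if any_directory_object else "users"
    url = f"{GRAPH_BASE}/{collection}/{quote(object_id, safe='')}/getMemberGroups"
    # securityEnabledOnly=False asks for all groups, not only security groups.
    body = json.dumps({"securityEnabledOnly": False})
    return url, body
```

The request body is the same for both endpoints, so a migration along these lines would mostly be a change of path plus whatever extra Graph permission the new endpoint needs for service principals.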
Otherwise, it might make sense to have the ms graph call be best effort - returning blank groups in cases of error; It's a bit unfortunate that not being able to retrieve groups, fails the auth attempt altogether - when reaching that code path, we have a verified JWT at hand with some object id, it might have made sense to pass it onward and check for any direct k8s role mapping.
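The best-effort idea could look something like the sketch below: wrap the group lookup so that a failure yields an empty group list instead of rejecting the request. This only illustrates the suggestion; `groups_best_effort` and the `fetch_groups` callback are hypothetical names, not guard's actual API.

```python
import logging
from typing import Callable, List

log = logging.getLogger("guard-sketch")


def groups_best_effort(fetch_groups: Callable[[str], List[str]], object_id: str) -> List[str]:
    """Return the caller's groups, or an empty list if the lookup fails.

    The token has already been verified by this point, so the object id can
    still be passed on and matched against direct role bindings.
    """
    try:
        return fetch_groups(object_id)
    except Exception as err:  # e.g. the 404 Graph returns for a service principal
        log.warning("group lookup failed for %s, continuing without groups: %s", object_id, err)
        return []
```

One trade-off with this design is that a transient Graph outage would silently strip group-based access rather than fail loudly, so logging the failure matters.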
Another suggestion might be to flip the flag which considers the given groups claim (on AKS side) for legacy clusters - closing the disparity between the two integrations.
Would love to hear your two cents on this @weinong
Thanks !