WLCG-AuthZ-WG / common-jwt-profile

A repo for the WLCG Common JWT profile document
3 stars 8 forks source link

Add wlcg.credkey ID token claim #12

Closed DrDaveD closed 3 years ago

DrDaveD commented 3 years ago

This is a proposed solution for a new ID token claim needed for controlling access to storing refresh tokens in vault, as discussed in yesterday's working group meeting.

andreaceccanti commented 3 years ago

Hi @DrDaveD ,

(reporting here what I wrote on the list)

I would call the claim 'kerberos_principal' or something similar.

And it would be returned by the token issuer when the 'profile' scope is included, or, if preferred, when the 'kerberos_principal' scope is included.

I would avoid the parametric scope, as it introduces complexity that should be moved on the client side, being client-specific.

DrDaveD commented 3 years ago

Andrea,

You are forgetting the whole point for this claim. Vault is an open source software package that we do not own and we can't expect them to support a special-purpose edit. It provides a mechanism to choose any id token claim as credential key, and no mechanism to edit it. So the exact key has to be provided by the token issuer.

Given that, the key that is needed is not exactly a kerberos principal, it is an edited kerberos principal. If only a kerberos principal was needed, I would use the standard claim "eppn". I think credkey is exactly descriptive, and a good name. It is general enough that it could be used for other purposes besides kerberos, and by other clients that need a similar key. The parameter on the key is necessary because the only mechanism that Vault provides to send custom configured information from client to token issuer is via scopes.

For Indigo IAM we can get by for now without implementing this claim, if instead you support the "eppn" ID token claim. If later we find that people want to use kerberos robot credentials, then we will need it.

msalle commented 3 years ago

But why would wlcg.credkey be more generic than kerberos_principal ? Also I think credkey is actually quite a confusing name. To me it would imply something relating to a (private) key. It basically needs to be a claim providing (yet another) persistent and unique identifier, right? Finally, I agree with Andrea that the parametric scope makes this very complex and very specific to certain use-cases.

I probably still don't have fully clear why we need such a new identifier. I think the argument was that sub could change for a user, but then the question is how would you know that you're still talking about the same user (and we've had quite some discussions on that in different projects). It then would be more practical to either agree not to change sub (which we could if the number of issuers is low and under our control) or to agree to provide a persistent and unique identifier for the user, for example eduPersonUniqueId (ePUId) or subject-id.

DrDaveD commented 3 years ago

But why would wlcg.credkey be more generic than kerberos_principal ?

It is not the kerberos principal, it is the key used for storage. It's currently only the kerberos principal for one domain and otherwise it is a unique value. The same name could conceivably be used for other storage keys in the future as well, returning a different value for a different token issuer or different client.

Also I think credkey is actually quite a confusing name. To me it would imply something relating to a (private) key.

There's more than one use of the word key in computer science. The meaning in this case is as in key-value, not as in encryption.

It basically needs to be a claim providing (yet another) persistent and unique identifier, right?

Yes but a specific one that maps to kerberos ids in one domain.

Finally, I agree with Andrea that the parametric scope makes this very complex and very specific to certain use-cases.

Ok, but don't you agree it is a good goal to use an off the shelf open source product and not require it to have customizations? It's still a simple thing for token issuers to implement, which are already custom products, so I think it's a fair compromise.

I probably still don't have fully clear why we need such a new identifier. I think the argument was that sub could change for a user, but then the question is how would you know that you're still talking about the same user (and we've had quite some discussions on that in different projects). It then would be more practical to either agree not to change sub (which we could if the number of issuers is low and under our control) or to agree to provide a persistent and unique identifier for the user, for example eduPersonUniqueId (ePUId) or subject-id.

Whatever we define for the sub claim will also go into the access token and this value doesn't make sense there. This is only for ID tokens, and it has a special definition so I think it makes complete sense for it to be a new name.

msalle commented 3 years ago

But why would wlcg.credkey be more generic than kerberos_principal ?

It is not the kerberos principal, it is the key used for storage. It's currently only the kerberos principal for one domain and otherwise it is a unique value. The same name could conceivably be used for other storage keys in the future as well, returning a different value for a different token issuer or different client.

I think I need to have the use-case much more clear. I really hope we can (and think we should be able to) come up with a more generic way than adding an ad hoc new claim.

Also I think credkey is actually quite a confusing name. To me it would imply something relating to a (private) key.

There's more than one use of the word key in computer science. The meaning in this case is as in key-value, not as in encryption.

I know what you mean with it but in OIDC and OAuth2 there are several keys which typically have to do with private/pubkeys. I think naming it something-key would be very confusing for people.

It basically needs to be a claim providing (yet another) persistent and unique identifier, right?

Yes but a specific one that maps to kerberos ids in one domain.

But why can't that be imposed on the issuer side? That issuer knows that? So for example kerberos_principal + saml scope in ePPN for those issuers, and 'normal' ePPN for others? And if you don't want to (re/ab)use ePPN for that, why can't you just add a kerberos_principal claim for issuers that have it and simply not produce the claim for other issuers?

Finally, I agree with Andrea that the parametric scope makes this very complex and very specific to certain use-cases.

Ok, but don't you agree it is a good goal to use an off the shelf open source product and not require it to have customizations? It's still a simple thing for token issuers to implement, which are already custom products, so I think it's a fair compromise.

I agree that no customizations is good, but you do need a customization now too, namely a custom claim with a rather specific semantics? Vault will need to implement that?

I probably still don't have fully clear why we need such a new identifier. I think the argument was that sub could change for a user, but then the question is how would you know that you're still talking about the same user (and we've had quite some discussions on that in different projects). It then would be more practical to either agree not to change sub (which we could if the number of issuers is low and under our control) or to agree to provide a persistent and unique identifier for the user, for example eduPersonUniqueId (ePUId) or subject-id.

Whatever we define for the sub claim will also go into the access token and this value doesn't make sense there. This is only for ID tokens, and it has a special definition so I think it makes complete sense for it to be a new name.

I really need to understand the issue a bit better.

hannahshort commented 3 years ago

Hiya,

I agree that credKey seems a little confusing/misleading. I get that Vault needs to be generic, but that doesn't mean that our usage of it needs to be. The proposal sounds overly generic somehow. We are specifically mapping between an OAuth token and a kerberos token, in a way that the OAuth token must contain the kerberos principal, right? Maybe I'm misunderstanding something important

DrDaveD commented 3 years ago

I think I need to have the use-case much more clear.

I don't know how to explain it any more clearly. It has to be a claim fully coming from the token issuer, and it has to be able to map to a storage location in vault that we can also map from users in at least one kerberos domain.

I really hope we can (and think we should be able to) come up with a more generic way than adding an ad hoc new claim.

I don't see how. It's a special case. I also don't see why it is a big deal. We have defined several other claim names in this profile, and this one is only to communicate between the client and token issuer, never seen by users.

Also I think credkey is actually quite a confusing name. To me it would imply something relating to a (private) key.

There's more than one use of the word key in computer science. The meaning in this case is as in key-value, not as in encryption.

I know what you mean with it but in OIDC and OAuth2 there are several keys which typically have to do with private/pubkeys. I think naming it something-key would be very confusing for people.

The only other computer science term that I can think of is index. I don't think it as widely used nor as descriptive. In any case people will need the specification to fully understand.

It basically needs to be a claim providing (yet another) persistent and unique identifier, right?

Yes but a specific one that maps to kerberos ids in one domain.

But why can't that be imposed on the issuer side? That issuer knows that? So for example kerberos_principal + saml scope in ePPN for those issuers, and 'normal' ePPN for others?

I assume here you're talking about avoiding passing the parameter. That's what I thought at first, but there very likely will be cases where a different kerberos domain will be supported by different vault clients. Especially for a VO like DUNE, we at Fermilab at least have often talked about accepting job submissions from people that have accounts in cern.ch or fnal.gov. I expect each would have its own vault server supporting its kerberos domain. This could be extended to other domains as well. I thought it would be easier on the token issuer to not have to have to support a configuration item per oauth client which is the domain to strip out, although that is certainly an alternative if the token issuer implementers prefer that.

And if you don't want to (re/ab)use ePPN for that, why can't you just add a kerberos_principal claim for issuers that have it and simply not produce the claim for other issuers?

Wait, are you saying kerberos_principal is a standard oauth2-defined claim? I haven't been able to find it anywhere. Also keep in mind that we care about token issuers that support federated identity, which is not something that oauth typically thinks about.

Finally, I agree with Andrea that the parametric scope makes this very complex and very specific to certain use-cases.

Ok, but don't you agree it is a good goal to use an off the shelf open source product and not require it to have customizations? It's still a simple thing for token issuers to implement, which are already custom products, so I think it's a fair compromise.

I agree that no customizations is good, but you do need a customization now too, namely a custom claim with a rather specific semantics? Vault will need to implement that?

No, that's the whole point of defining this claim! Vault already has a configuration item that it calls "user_claim" which is the name of the claim from the issuer that can then be mapped by a policy rule to a storage path. No edits are allowed on the contents of the claim, but other components of a storage path before and after that claim can be added in the policy rule, including a * to match anything.

Hannah wrote:

I agree that credKey seems a little confusing/misleading. I get that Vault needs to be generic, but that doesn't mean that our usage of it needs to be. The proposal sounds overly generic somehow. We are specifically mapping between an OAuth token and a kerberos token, in a way that the OAuth token must contain the kerberos principal, right? Maybe I'm misunderstanding something important

Hopefully my responses to Mischa helped you understand.

andreaceccanti commented 3 years ago

Hi Dave,

it seems to me that we are trying to push a vault-specific glue claim in the profile. Also, I still do not understand why why sub cannot be used as the key, if it's just a key.

Whatever we define for the sub claim will also go into the access token and this value doesn't make sense there. This is only for ID tokens, and it has a special definition so I think it makes complete sense for it to be a new name.

What's the problem with the sub being also in ID tokens? The sub doesn't change for a user in a VO

Ok, but don't you agree it is a good goal to use an off the shelf open source product and not require it to have customizations?

Sure, but I still do not understand why using sub would require modifications on the vault side. Also, we can push for meaningful changes to open source software if these changes are oriented to support standard integration practices.

It's probably better to discuss this in more detail at one of the next authz calls. Having a description of how htgettoken integrates with vault (with details!) can help in supporting the definition of a claim that may help you.

DrDaveD commented 3 years ago

it seems to me that we are trying to push a vault-specific glue claim in the profile.

It could be viewed that way, but is that a problem? I was originally thinking that we would do this as just an agreement between the implementers of the two token issuers and not in the profile, but why not write it down in the profile? The original proposal to the working group was to use the word "vault" in the claim name but there were objections that that wasn't generic enough. It may very well be for example that the planned oidc-agent "my-token" server will be able to make use of the same thing because its architecture as it was described is looking to be quite similar to the vault architecture.

Also, I still do not understand why sub cannot be used as the key, if it's just a key.

Whatever we define for the sub claim will also go into the access token and this value doesn't make sense there. This is only for ID tokens, and it has a special definition so I think it makes complete sense for it to be a new name.

What's the problem with the sub being also in ID tokens? The sub doesn't change for a user in a VO

It's not just a key, it's a specific key. I'm sure you don't want to change the sub in the access token to be the kerberos user id for one IdP, and something else unique for everybody else.

Ok, but don't you agree it is a good goal to use an off the shelf open source product and not require it to have customizations?

Sure, but I still do not understand why using sub would require modifications on the vault side.

I don't know how many different ways I can say the same thing so it will finally be understood. In order to avoid custom vault modifications we need a string that is unique per person that we can also map to a kerberos principal of the same person, using vault configuration options already available.

Also, we can push for meaningful changes to open source software if these changes are oriented to support standard integration practices.

I can't imagine that the owners of this very widely used product will be interested in accepting such a use-case specific modification. They have already rejected another contribution I tried to make that was more generic. On the other hand, our token issuers have far fewer use cases and are custom built for our application. I think the answer is clear that the modification should be on the token issuer side.

It's probably better to discuss this in more detail at one of the next authz calls. Having a description of how htgettoken integrates with vault (with details!) can help in supporting the definition of a claim that may help you.

I was hoping we could make more progress before the next time I can be there in 3-1/2 weeks but I guess it is not to be.

andreaceccanti commented 3 years ago

I don't know how many different ways I can say the same thing so it will finally be understood.

Probably we need a clear, referenceable, written, understandable description of exactly how the htgettoken-vault integration works.

DrDaveD commented 3 years ago

I had a separate discussion with Andrea and Mischa and they believe this is not something that should be part of the WLCG profile, so I am closing this PR. They think it is something that should be solved on the client side, but even if it is on the token issuer side it is too specific to the vault client to be in the profile.

We also did discover in looking at the vault plugin api spec something else that might be helpful, that there are two claims that can be configured and mapped into policies, called the user_claim and the group_claim. An alternative, then, to using a single claim giving different results for a preferred domain would be for the token issuer to return the username and domain in two different claims. The defined SAML name for that is schacHomeOrganization which could be used for the domain name, but there isn't one for the corresponding user name. The best combined field would name be the eduPersonUniqueId or epuid because unlike eppn it may not be reassigned to another person.