Add options to limit scopes or audience of access token

puppetlabs / vault-plugin-secrets-oauthapp

OAuth 2.0 secrets plugin for HashiCorp Vault supporting a variety of grant types

Apache License 2.0

94 stars 10 forks source link

Add options to limit scopes or audience of access token #18

Closed DrDaveD closed 3 years ago

DrDaveD commented 4 years ago

This adds "scopes" and "audience" options to the creds read. It works by doing an RFC8693 token exchange using the regular access token.

Since this touches some of the same code as pr #16, it includes those changes and should be merged after that one. To see just the changes for this pr, see my pr #1.

DrDaveD commented 3 years ago

I force-pushed to put the commit messages in ESLint format

DrDaveD commented 3 years ago

I force-pushed after rebasing on current #16

DrDaveD commented 3 years ago

I force-pushed after rebasing on master, after the merge of #16.

I plan to add tests for these two new options soon.

DrDaveD commented 3 years ago

Ok I finally got the tests added. It was much more involved than I expected. I first went far down a path of trying to use a MockExchangeFunc with an additional "opts" parameter, but ran into a roadblock of the oauth2.AuthCodeOption having only private (beginning with lower case) accessors. I threw that all out and instead borrowed a test approach from vault-plugin-auth-jwt using an httptest server. This has a pleasant side effect of testing additional plugin code, including the refresh_token option. A lot more code could also get exercised using this approach. For instance it does not yet test the auth_code_url path, or discovery_url option.

impl commented 3 years ago

Hey @DrDaveD,

I've been dragging my feet on addressing this PR for a while, but I do really like the changes here and want to integrate them. The main problem I've been struggling with is the feeling that the token exchange operation feels more like a write/update operation and not like something that should be a side effect of a read operation. I also see that the spec itself allows such exchanges to themselves return tokens that can be refreshed (although I also understand that's a relatively uncommon mode in reality).

Initially, it seemed like what you'd want to do is to create a new credential that references an existing credential. (For example, vault write creds/impersonated-cred impersonate=root-cred audiences=... scopes=..., where creds/root-cred already exists.) But this breaks Vault's security model because it would potentially allow Vault users to derive new credentials from authentication tokens they don't even have access to.

I also don't see any way to implement delegated token exchange, where an actor_token is specified, under our current design. Open to suggestions if this is something you think is valuable.

All of that said, I've come up with an alternative that I hope should make everyone happy.

I'm going to go ahead and merge this into a branch on our side and do some incremental updates to it to make the API look like this:

PUT creds/<cred-name>/tokens: Perform a one-time token exchange and simply return the resulting access token to the user.
PUT creds/<cred-name>/tokens/<token-name>: Perform a token exchange and store the resulting access token and (potentially) refresh token. If a refresh token is specified, the impersonated credential will continue to refresh, even if the root credential expires.

As I was working through this design, I also considered two other endpoints that seem like a good idea to implement:

PUT creds: Perform a one-time authorization code exchange and simply return the resulting access token. This is nice if you really only need to use something once and not reference it again, as is the usual case with STS.
PUT creds/<cred-name>/refresh, PUT creds/<cred-name>/tokens/<token-name>/refresh: Force a refresh if the credential supports it.

This is going to take me a little bit to work through, so let me know if you have any thoughts on these changes. I'll be working on it on the features/token-exchange branch.

On a personal note, I'd like to sincerely thank you for your numerous contributions to this project. We really appreciate the initiative and we're very happy to see this plugin mature over time. Happy holidays!

DrDaveD commented 3 years ago

I've been dragging my feet on addressing this PR for a while, but I do really like the changes here and want to integrate them.

That's great to hear.

The main problem I've been struggling with is the feeling that the token exchange operation feels more like a write/update operation and not like something that should be a side effect of a read operation.

Making it into a write operation on a subpath is fine with me, especially if the new access token is cached. I don't know about also returning a value at the same time, that feels a little strange as a side effect of write. Currently when the refresh_token is written during the authorization step it is just a write, and then the client needs to turn around and read it again. It's an unfortunate extra step, but it is clean. On the other hand, I do see that config/auth_code_url for example is a PUT that retrieves something, so I guess having writes retrieve something is part of the model.

I also see that the spec itself allows such exchanges to themselves return tokens that can be refreshed (although I also understand that's a relatively uncommon mode in reality).

As a matter of fact I did recently think of a potential use case for getting a new refresh token, with reduced scopes, using token exchange. After the initial OIDC authentication we plan to renew our vault credentials weekly with kerberos. In addition to the user's regular kerberos credentials, we have the ability to create "robot" kerberos credentials of the form <user>/cron/<machinename> which are to be used for automated operations via cron. We can translate a credential of that form to a subpath in vault under creds using a vault policy we apply to the auth-kerberos plugin (if I also use a small PR to this secrets-oauthapp plugin that accepts '/' as part of the <cred-name> that I haven't submitted yet and I hope you'll accept). It might be useful to restrict the scopes that that robot has access to in its refresh token instead of storing an exact copy of the user's full refresh token. My current plan is to define a new role in the auth-jwt plugin that asks for those reduced scopes if we need it, but doing a token exchange instead would be able to make it entirely under the user's control and not require the vault administrator to define a new role, which would be an advantage.

Initially, it seemed like what you'd want to do is to create a new credential that references an existing credential. (For example, vault write creds/impersonated-cred impersonate=root-cred audiences=... scopes=..., where creds/root-cred already exists.) But this breaks Vault's security model because it would potentially allow Vault users to derive new credentials from authentication tokens they don't even have access to.

Agreed.

I also don't see any way to implement delegated token exchange, where an actor_token is specified, under our current design. Open to suggestions if this is something you think is valuable.

I hadn't even noticed that feature. I currently can't think of any use cases that we might run across.

All of that said, I've come up with an alternative that I hope should make everyone happy.

I'm going to go ahead and merge this into a branch on our side and do some incremental updates to it to make the API look like this:
* `PUT creds/<cred-name>/tokens`: Perform a one-time token exchange and simply return the resulting access token to the user.

* `PUT creds/<cred-name>/tokens/<token-name>`: Perform a token exchange and store the resulting access token and (potentially) refresh token. If a refresh token is specified, the impersonated credential will continue to refresh, _even if_ the root credential expires.

Ok, those look like they would work. I assume the second one will also return the access token. Are you expecting the refresh_token to be passed in as a parameter, as is now supported? That would actually work for my use case because I'm getting it from the auth-jwt plugin but people doing authentication through this plugin don't get it. Oh, but in the latter case they would instead have a 'code' that results in getting a refresh_token from the issuer, so you could use that. However, I'm probably thinking of a design slightly different than what you were thinking, because a consequence of doing it this way is that the root credential would never be stored. That's good for my use case because I only want the more restricted refresh token accessible to the robot.

As I was working through this design, I also considered two other endpoints that seem like a good idea to implement:
* `PUT creds`: Perform a one-time authorization code exchange and simply return the resulting access token. This is nice if you really only need to use something once and not reference it again, as is the usual case with STS.

Oh but where would the original access token come from in this case? Are you talking about passing in access_token as an input parameter?

* `PUT creds/<cred-name>/refresh`, `PUT creds/<cred-name>/tokens/<token-name>/refresh`: Force a refresh if the credential supports it.

Forcing a refresh can already be done with the GET minimum_seconds option so I don't think this should be necessary.

This is going to take me a little bit to work through, so let me know if you have any thoughts on these changes. I'll be working on it on the features/token-exchange branch.

On a personal note, I'd like to sincerely thank you for your numerous contributions to this project. We really appreciate the initiative and we're very happy to see this plugin mature over time. Happy holidays!

I appreciate your willingness to accept them. It's very nice to be able to share a general purpose plugin instead of having to fork and maintain our own.

impl commented 3 years ago

Making it into a write operation on a subpath is fine with me, especially if the new access token is cached. I don't know about also returning a value at the same time, that feels a little strange as a side effect of write. Currently when the refresh_token is written during the authorization step it is just a write, and then the client needs to turn around and read it again. It's an unfortunate extra step, but it is clean. On the other hand, I do see that config/auth_code_url for example is a PUT that retrieves something, so I guess having writes retrieve something is part of the model.

It would only be a "write-with-data" in the case where you don't want to store the credential... more below. I think we're on the same page though.

As a matter of fact I did recently think of a potential use case for getting a new refresh token, with reduced scopes, using token exchange. After the initial OIDC authentication we plan to renew our vault credentials weekly with kerberos. In addition to the user's regular kerberos credentials, we have the ability to create "robot" kerberos credentials of the form <user>/cron/<machinename> which are to be used for automated operations via cron. We can translate a credential of that form to a subpath in vault under creds using a vault policy we apply to the auth-kerberos plugin (if I also use a small PR to this secrets-oauthapp plugin that accepts '/' as part of the <cred-name> that I haven't submitted yet and I hope you'll accept). It might be useful to restrict the scopes that that robot has access to in its refresh token instead of storing an exact copy of the user's full refresh token. My current plan is to define a new role in the auth-jwt plugin that asks for those reduced scopes if we need it, but doing a token exchange instead would be able to make it entirely under the user's control and not require the vault administrator to define a new role, which would be an advantage.

Makes sense! I am OK with that change to the acceptable path tokens as long as the intention is to escape the / -- e.g., would need to be creds/foo%2fbar and not creds/foo/bar.

Ok, those look like they would work. I assume the second one will also return the access token. Are you expecting the refresh_token to be passed in as a parameter, as is now supported? That would actually work for my use case because I'm getting it from the auth-jwt plugin but people doing authentication through this plugin don't get it. Oh, but in the latter case they would instead have a 'code' that results in getting a refresh_token from the issuer, so you could use that. However, I'm probably thinking of a design slightly different than what you were thinking, because a consequence of doing it this way is that the root credential would never be stored. That's good for my use case because I only want the more restricted refresh token accessible to the robot.

I should have been more clear here! Once you write using PUT /creds/foo/tokens/bar, you'd get the same semantics as a regular credential endpoint. E.g. you could then GET /creds/foo/tokens/bar?minimum_seconds=600 or whatever you need. Only the PUT /creds/foo/tokens (without a name) would return the access token associated with the token exchange.

For managing access in your case, you could do something like this, I think?

For a root or administrative token (not robots), grant access to read/write to creds/+ to allow creation of tokens.
For the robot tokens, you have some options. If you'd like them to be able to reuse a single token (by name or robot ID), you could give them read/write to creds/parent/tokens/my-robot-id. Or you could give them the ability to generate a one-time code by allowing only write to creds/parent/tokens. Neither of these options would allow them to have access to any value of the parent token (not current access token or refresh token). If you need to control the scope of the impersonation at the parent level, you could allow them only read access to creds/parents/tokens/my-robot-id and create that path in advance using the administrative token.
I believe you can even use the Vault identity entities stuff here: https://learn.hashicorp.com/tutorials/vault/policy-templating, so something like creds/parent/tokens/{{ identity.entity.name }} in the Vault policy would let you uniquely allow access to certain child token paths depending on the auth results.

PUT creds: Perform a one-time authorization code exchange and simply return the resulting access token. This is nice if you really only need to use something once and not reference it again, as is the usual case with STS.

Oh but where would the original access token come from in this case? Are you talking about passing in access_token as an input parameter?

The query parameters for this would be the same as for creds/foo (so either refresh token or code + redirect URL), it just wouldn't store the resulting credential. You would get the access token back. This isn't related to this PR, per se, just something that popped into my mind when I was thinking about implementing the child endpoint to do the token exchange.

Forcing a refresh can already be done with the GET minimum_seconds option so I don't think this should be necessary.

Hmm, true, but you'd have to pull the existing token first and find a value outside the existing expiry to request as the minimum_seconds. I suppose this isn't too much work though.

DrDaveD commented 3 years ago

As a matter of fact I did recently think of a potential use case for getting a new refresh token, with reduced scopes, using token exchange. After the initial OIDC authentication we plan to renew our vault credentials weekly with kerberos. In addition to the user's regular kerberos credentials, we have the ability to create "robot" kerberos credentials of the form <user>/cron/<machinename> which are to be used for automated operations via cron. We can translate a credential of that form to a subpath in vault under creds using a vault policy we apply to the auth-kerberos plugin (if I also use a small PR to this secrets-oauthapp plugin that accepts '/' as part of the <cred-name> that I haven't submitted yet and I hope you'll accept). It might be useful to restrict the scopes that that robot has access to in its refresh token instead of storing an exact copy of the user's full refresh token. My current plan is to define a new role in the auth-jwt plugin that asks for those reduced scopes if we need it, but doing a token exchange instead would be able to make it entirely under the user's control and not require the vault administrator to define a new role, which would be an advantage.

Makes sense! I am OK with that change to the acceptable path tokens as long as the intention is to escape the / -- e.g., would need to be creds/foo%2fbar and not creds/foo/bar.

Let's move this separate discussion to pr #26.

Ok, those look like they would work. I assume the second one will also return the access token. Are you expecting the refresh_token to be passed in as a parameter, as is now supported? That would actually work for my use case because I'm getting it from the auth-jwt plugin but people doing authentication through this plugin don't get it. Oh, but in the latter case they would instead have a 'code' that results in getting a refresh_token from the issuer, so you could use that. However, I'm probably thinking of a design slightly different than what you were thinking, because a consequence of doing it this way is that the root credential would never be stored. That's good for my use case because I only want the more restricted refresh token accessible to the robot.

I should have been more clear here! Once you write using PUT /creds/foo/tokens/bar, you'd get the same semantics as a regular credential endpoint. E.g. you could then GET /creds/foo/tokens/bar?minimum_seconds=600 or whatever you need. Only the PUT /creds/foo/tokens (without a name) would return the access token associated with the token exchange.

Ok, that would be fine if the PUT with a tokens/name does not return an access token. Would the PUT itself include the same semantics as the regular credential endpoint, accepting refresh_token or code options in addition to the scope options to use for token exchange? If not, where would the refresh_token come from?

For managing access in your case, you could do something like this, I think?

* For a root or administrative token (_not_ robots), grant access to read/write to `creds/+` to allow creation of tokens.

* For the robot tokens, you have some options. If you'd like them to be able to reuse a single token (by name or robot ID), you could give them read/write to `creds/parent/tokens/my-robot-id`. Or you could give them the ability to generate a one-time code by allowing only write to `creds/parent/tokens`. Neither of these options would allow them to have access to any value of the parent token (not current access token or refresh token). If you need to control the scope of the impersonation at the parent level, you could allow them only read access to `creds/parents/tokens/my-robot-id` and create that path in advance using the administrative token.

* I believe you can even use the Vault identity entities stuff here: https://learn.hashicorp.com/tutorials/vault/policy-templating, so something like `creds/parent/tokens/{{ identity.entity.name }}` in the Vault policy would let you uniquely allow access to certain child token paths depending on the auth results.

Yes that makes sense. In fact, I think I would use the creds/parent/tokens/{{identity.entity.name}} format which is a potential way out of a dilemma mentioned in #26. I will follow up there.

PUT creds: Perform a one-time authorization code exchange and simply return the resulting access token. This is nice if you really only need to use something once and not reference it again, as is the usual case with STS.

Oh but where would the original access token come from in this case? Are you talking about passing in access_token as an input parameter?

The query parameters for this would be the same as for creds/foo (so either refresh token or code + redirect URL), it just wouldn't store the resulting credential. You would get the access token back. This isn't related to this PR, per se, just something that popped into my mind when I was thinking about implementing the child endpoint to do the token exchange.

Oh, ok.

Forcing a refresh can already be done with the GET minimum_seconds option so I don't think this should be necessary.

Hmm, true, but you'd have to pull the existing token first and find a value outside the existing expiry to request as the minimum_seconds. I suppose this isn't too much work though.

Yes and applications likely will already know the minimum time they need and won't always need to force a refresh, they'll just want to refresh if the time left is less than their minimum.

impl commented 3 years ago

Ok, that would be fine if the PUT with a tokens/name does not return an access token. Would the PUT itself include the same semantics as the regular credential endpoint, accepting refresh_token or code options in addition to the scope options to use for token exchange? If not, where would the refresh_token come from?

The access token for the RFC 8693 token exchange would come from the parent token. But the token exchange itself could also return a refresh token (similar to the initial authorization code exchange), which we would store securely and not expose to the user. This lets the exchanged token refresh independently from the parent token.

I wouldn't anticipate adding refresh_token, code, etc. to this particular endpoint (PUT creds/parent/tokens/foo), I don't think. To wit: if you already have a refresh token externally and you want to do a token exchange with it to get a new access token, what value does this plugin provide (I think of value being Vault's secure storage, primarily)?

Hmm, true, but you'd have to pull the existing token first and find a value outside the existing expiry to request as the minimum_seconds. I suppose this isn't too much work though.

Yes and applications likely will already know the minimum time they need and won't always need to force a refresh, they'll just want to refresh if the time left is less than their minimum.

The two scenarios I was thinking of were (1) for OpenID Connect, to force a refresh if you suspect the ID token/user info data has changed, and (2) because you think the user may no longer be valid from the IdP's perspective, so you want to check to make sure a refresh will succeed.

DrDaveD commented 3 years ago

The access token for the RFC 8693 token exchange would come from the parent token. But the token exchange itself could also return a refresh token (similar to the initial authorization code exchange), which we would store securely and not expose to the user. This lets the exchanged token refresh independently from the parent token.

Ok that's good, although then I think there will need to be an api option to say what the secrets-oauthapp plugin will put into the requested_token_type parameter. It could default to access_token but we'll need an option to choose refresh_token.

I wouldn't anticipate adding refresh_token, code, etc. to this particular endpoint (PUT creds/parent/tokens/foo), I don't think. To wit: if you already have a refresh token externally and you want to do a token exchange with it to get a new access token, what value does this plugin provide (I think of value being Vault's secure storage, primarily)?

Refresh tokens are assigned to particular client ids. Only the plugin has the client secret to go with the client id; the refresh token isn't of any value without them. That's why I thought it was fine for the auth-jwt plugin to return it to the client which passes it to the secrets-oauthapp plugin. My first PR to supply that functionality to auth-jwt did have it directly copy the refresh token to the secrets path in vault, but the plugin owners rejected that so it has to go through the client. The RFC doesn't explicitly require token exchange to be authenticated but it does discourage unauthenticated exchanges: "Note that omitting client authentication allows for a compromised token to be leveraged via an STS into other tokens by anyone possessing the compromised token." That's especially important for refresh tokens.

impl commented 3 years ago

The access token for the RFC 8693 token exchange would come from the parent token. But the token exchange itself could also return a refresh token (similar to the initial authorization code exchange), which we would store securely and not expose to the user. This lets the exchanged token refresh independently from the parent token.

Ok that's good, although then I think there will need to be an api option to say what the secrets-oauthapp plugin will put into the requested_token_type parameter. It could default to access_token but we'll need an option to choose refresh_token.

In the RFC it looks like you can always request an access token and still optionally return a refresh token alongside that requested access token, but this is probably a fine option to expose. It would just have one more round-trip to then exchange the refresh token for another access token, if that's the way the server is designed.

I wouldn't anticipate adding refresh_token, code, etc. to this particular endpoint (PUT creds/parent/tokens/foo), I don't think. To wit: if you already have a refresh token externally and you want to do a token exchange with it to get a new access token, what value does this plugin provide (I think of value being Vault's secure storage, primarily)?

Refresh tokens are assigned to particular client ids. Only the plugin has the client secret to go with the client id; the refresh token isn't of any value without them. That's why I thought it was fine for the auth-jwt plugin to return it to the client which passes it to the secrets-oauthapp plugin. My first PR to supply that functionality to auth-jwt did have it directly copy the refresh token to the secrets path in vault, but the plugin owners rejected that so it has to go through the client. The RFC doesn't explicitly require token exchange to be authenticated but it does discourage unauthenticated exchanges: "Note that omitting client authentication allows for a compromised token to be leveraged via an STS into other tokens by anyone possessing the compromised token." That's especially important for refresh tokens.

Ahh, duh, makes sense! So I guess the question is, given a refresh token or authorization code, can we do a one-shot authorization code exchange (if needed) + token exchange? I think the answer is probably yes -- will give some thought to how that might look structurally.

DrDaveD commented 3 years ago

In the RFC it looks like you can always request an access token and still optionally return a refresh token alongside that requested access token

Yes but I think a refresh token isn't always wanted. One token issuer I was testing with was sending a refresh token at first with a token exchange even though I hadn't requested it, but at the same time it invalidated the original refresh token because that's what it usually does when a new access token is requested (in order to keep the refresh token refreshed). I wasn't saving the refresh token with a token exchange, because I normally didn't want a reduced-scope refresh token, but then every other time I requested an access token I had to start over from scratch to request a new refresh token. So now the token issuer doesn't send refresh tokens by default anymore with a token exchange. As I think about it now though, I guess it would have been OK for it to always send it, as long as it didn't invalidate the old one. In any case if I want to take advantage of this new feature you're working on, I'll have to ask them to send a refresh token without invalidating the old one. Still I think it's probably better for it to be optional, because token issuers have to keep track of all outstanding refresh tokens so they have the ability to revoke them.

So I guess the question is, given a refresh token or authorization code, can we do a one-shot authorization code exchange (if needed) + token exchange? I think the answer is probably yes -- will give some thought to how that might look structurally.

That might be a nice shortcut but I don't think it's necessary so if it's complicated it could be skipped. I guess it would have to be a PUT to a subpath but have a side effect of writing the refresh token to the parent path before proceeding to the token exchange.