ory / hydra

The most scalable and customizable OpenID Certified™ OpenID Connect and OAuth Provider on the market. Become an OpenID Connect and OAuth2 Provider over night. Broad support for related RFCs. Written in Go, cloud native, headless, API-first. Available as a service on Ory Network and for self-hosters.
https://www.ory.sh/?utm_source=github&utm_medium=banner&utm_campaign=hydra
Apache License 2.0

Token claims customization with Jsonnet #1748

Closed redbaron closed 1 year ago

redbaron commented 4 years ago

Is your feature request related to a problem? Please describe.

Migrating from Keycloak requires flexible token customization capabilities

Describe the solution you'd like

To keep it flexible, yet lightweight, I am thinking of having a jsonnet snippet as part of a client description. It will be given some rich enough context as an input and would produce additional fields to be added to a token.

Example of jsonnet snippet configured for the client:

{
  scopes: ctx.accessRequest.grantedScopes, // scopes granted for this access request
  roles: ctx.metadata.roles // client's metadata object is injected as part of context
}

At the start, the snippet won't be able to alter any existing properties, only introduce new ones.
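
The add-only rule could be sketched in Go roughly as follows. This is an illustration only, not Hydra code: `mergeNewClaims` and the claim values are hypothetical, and the snippet output is assumed to have already been evaluated into a map.

```go
package main

import "fmt"

// mergeNewClaims returns a copy of base extended with those entries of extra
// whose keys are not already present in base. Existing claims always win, so
// a snippet can only introduce new properties. (Hypothetical helper.)
func mergeNewClaims(base, extra map[string]any) map[string]any {
	out := make(map[string]any, len(base)+len(extra))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range extra {
		if _, exists := out[k]; !exists {
			out[k] = v
		}
	}
	return out
}

func main() {
	base := map[string]any{"sub": "my-client", "iss": "https://hydra.example"}
	extra := map[string]any{"roles": []string{"reader"}, "sub": "someone-else"}

	merged := mergeNewClaims(base, extra)
	fmt.Println(merged["sub"], merged["roles"]) // my-client [reader]
}
```

Silently dropping a conflicting key is one possible policy; rejecting the whole snippet output would be the stricter alternative.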

Describe alternatives you've considered

-

Additional context

I am mainly focusing on client_credentials grant for issuing tokens for service to service communication.

How does it fit into token introspect for non-JWT tokens?

I'll be able to work on this feature provided we agree on design details upfront

redbaron commented 4 years ago

@tacurran , before starting, I'd like to get some general thumbs up on the idea. Then there are implementation details worth discussing:

  1. Given that Jsonnet snippet is a user defined program running on a server side, it is worth discussing ABI, more specifically "context" object exposed to it as it can't be changed freely without breaking said programs: new fields can be added, but existing can't be removed/changed type or semantic

  2. What does the UX look like in case of an error? A Jsonnet script can fail to produce a result. Do we fail token generation, or do we generate the token without the customizations the script intended to apply? If we fail, how much of a stacktrace/error do we give back to the user in the response?

  3. Jsonnet snippet results will be integrated into JWT , what about token introspection? Can it be used there somehow and should it?

  4. What is going to happen to sessions on snippet update? A snippet by definition alters the access token, so does Hydra start producing newly shaped tokens for existing sessions? In other words, should the Jsonnet snippet version be stored in the session itself? How does it work now, where, let's say, the list of allowed scopes or audiences can change and then a session refreshes a token whose set of scopes violates the new client configuration?

aeneasr commented 4 years ago

I'm not sure if this is the best approach to take here. If you have access to the token, you have access to the subject (in the case of client_credentials that would be the client_id), and thus have access to any additional information you might need for look up.

It sounds to me like you want to use tokens as a substitute for sessions, which is the wrong approach to take. I understand that some limited information might make sense to mint into the token, but the scope should be really narrow.

Adding something that executes arbitrary code on a per-client basis (which is a possibly public-facing API for OIDC Dynamic Client Registration) is pretty dangerous. It would also (potentially) allow attackers to mint tokens with data that can lead to privilege escalation.

We do have an open issue (I think) which wants to set extra token data similar to what we do with the authorize_code grant for client_credentials but it is kind of tricky to implement in a way that doesn't leave a potential gaping security hole wide open.

tacurran commented 4 years ago

@redbaron the consensus today is that implementing the changes you request in this issue would potentially weaken token and hydra security. While we admittedly have done limited research into the keycloak "flexible token" approach you mention in the issue, we would rather keep to today's token structure, while keeping options open for the future. You wanted to compare keycloak to hydra, I think. Perhaps you can find other developers that have a similar requirement? This would be one way to get something on the future roadmap. Therefore, we will leave this issue open. Should it become stale, we would then close it. While we really value your input and ideas, we hope you can accept this approach. @aeneasr please chime in here.

redbaron commented 4 years ago

I probably started with too many details, while the idea of token customization should have been discussed first. I don't want to divert the discussion into why such customization is needed, as that is not the point: it exists in other products and is used by many companies, and if the ORY stack wants to be competitive in this field it should offer a similar feature.

All big players in this market allow tokens customization

Here are few:

These customizations are expressed either using convoluted UI (Keycloak, WSO2) or unsafe scripting, thus requiring sandboxing (Auth0), declarative rules "DSL" (OKTA), or set as a static values (Firebase)

All these products check that customization doesn't touch "protected" claims, therefore do not compromise security of issued tokens.

@tacurran , @aeneasr , you gave conflicting responses on whether even minimal form of token customization is something you'd be happy to see in Hydra.

Do you think, that case presented is strong enough to entertain idea that hydra should support token customization?

UX

If the answer to the above is yes, then the next point of discussion is how to express them. It is important to reiterate that once the customization engine produces the desired state of the token, it is then verified to not contain any protected claims, thus making customizations safe.
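
A post-evaluation check of that kind might look like the sketch below. The set of protected claims is an assumption (the registered claims from RFC 7519 plus `client_id` and `scope`), and `checkCustomClaims` is a hypothetical helper, not something Hydra provides.

```go
package main

import (
	"fmt"
	"sort"
)

// protectedClaims lists claims that a customization must never set. The exact
// set is an assumption: the registered JWT claims plus client_id and scope.
var protectedClaims = map[string]bool{
	"iss": true, "sub": true, "aud": true, "exp": true,
	"nbf": true, "iat": true, "jti": true,
	"client_id": true, "scope": true,
}

// checkCustomClaims returns an error naming every protected claim that the
// customization output tries to set, or nil if the output is safe to merge.
func checkCustomClaims(custom map[string]any) error {
	var bad []string
	for k := range custom {
		if protectedClaims[k] {
			bad = append(bad, k)
		}
	}
	if len(bad) == 0 {
		return nil
	}
	sort.Strings(bad) // keep the error message deterministic
	return fmt.Errorf("customization sets protected claims: %v", bad)
}

func main() {
	fmt.Println(checkCustomClaims(map[string]any{"roles": []string{"admin"}}))
	fmt.Println(checkCustomClaims(map[string]any{"sub": "root", "exp": 0}))
}
```

Because the check runs on the final output, it is independent of how the claims were produced (static object, rules, or a DSL).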

Existing products on the market picked various approaches.

Static set of claims

Simple object ({ claim1: "..." , claim2: [000] }) is configured per client and added to all tokens issued for that client. It is simplest to implement and understand, but it is too coarse for many use-cases as it can't express any decisions to be made at token issue time.

UI

Out of the question for Hydra, so no point discussing pros and cons

Scripting language

Customizations might be expressed as a program in one of scripting languages (Lua, Javascript, anything else). For safe operation it requires either Go interpreter or external process with sandboxing. Feels too heavyweight for this task, but definitely most expressive and easy to adopt by users.

Rules DSL

Some kind of rules engine, where step by step changes to token accumulated and then final result produced. IMHO it is least ergonomic option. Example of this approach is OKTA.

Declarative DSL producing JSON

There are many DSLs well suited to producing JSON as their output. Think of jq, but on the server side. Some examples:

IMHO it is a sweet spot for this task. These DSLs are geared for generating JSON, they are not imperative languages, so no infinite loops, they are pure and as a result safe to run on a server side.

They are expressive enough to cover complicated cases your users might have.

Out of these I personally would vote for Jsonnet, but I'd take any of them over any other option.

aeneasr commented 4 years ago

Hydra supports token customization for all grants except client_credentials already. Are you looking specifically for customization of tokens issued by the client_credentials grant?

redbaron commented 4 years ago

Hydra supports token customization for all grants except client_credentials already

Yes and no. There is a way to customize tokens, yes, but the customization is not configured on the authorization server side, and is therefore "optional" and not enforced. In the same way that Hydra is already capable of limiting audience and scope per client, there are use cases where custom, user-defined claims should also be controlled by Hydra, not the client.

Hydra supports token customization for all grants except client_credentials already

This is what prompted me to open this issue, yes, but I don't think this feature should be limited to the client_credentials grant only.

aeneasr commented 4 years ago

Yes and no. There is a way to customize tokens, yes, but the customization is not configured on the authorization server side, and is therefore "optional" and not enforced.

Sorry, I don't understand what you mean. Did you read the docs? This is of course enforced on your login app. You have absolute control over the logic and thus over the flow and thus over the token payloads.

And yes, this works per client, per user, per day, per IP, per user agent, per whatever - you have absolute control. Please consult the documentation again.

This is what prompted me to open this issue, yes, but I don't think this feature should be limited to the client_credentials grant only.

As stated previously, OAuth2 Clients are actually throwaway things. They are generated by third parties not under your control - sometimes in automated ways - with payloads you don't control. While some providers like GitHub have some type of server-side validation going on, others that implement OIDC Dynamic Client Registration (we do) do not.

This leaves a huge attack surface open for developers that do not know about this feature or the effect that it can have, and that leave the /clients endpoint open to the public in order to implement OIDC DCR. Unless we put some effort into the DCR spec such as creating the same endpoint with some safeguards on the public port, there is no way this endpoint will be able to execute any type of logic that modifies tokens.

The only thing we have come up with so far would be an RPC call for the client_credentials grant, but that's just terrible from a developer experience perspective.

I don't have exhaustive experience with WSO2, Okta, and so on, but as far as I know and have observed, these "custom claims" are actually for the ID Tokens and Access Tokens issued as part of user-facing flows (authorization_code, implicit, ...). I know that Auth0 offers this capability (as you mentioned) by executing JavaScript in their WebWorker VMs (I think that's what they call it), but that's basically equivalent to an RPC call.

aeneasr commented 4 years ago

Am I correct in assuming that this can be merged with #1383 ?

redbaron commented 4 years ago

Did you read the docs? This is of course enforced on your login app. Please consult the documentation again.

Yes I did, even before opening this question. I'd appreciate if you give benefit of a doubt to community members offering help both in providing feedback and writing code.

This issue is about token customization applied/enforced by authorization server. Unless I am mistaken you are talking about session property in /oauth2/auth/requests/consent request body submitted by consent app. This feature leaves control over token customizations to application side, not authorization server. These are two entities, often developed, provisioned and audited by different teams if not companies.

They are generated by third parties not under your control - sometimes in automated ways - with payloads you don't control. ... and that leave the /clients endpoint open to the public in order to implement OIDC DCR.

As with any authorization server, there are multiple deployment strategies, some of which may not allow dynamic client registration, plus as per docs /clients shouldn't be left public anyway. Given that Hydra is already capable of enforcing audience and scopes client can have in their tokens, self-registration/self-update by the clients themselves, where they can update these constraints, is already a dubious practice.

Unless we put some effort into the DCR spec such as creating the same endpoint with some safeguards on the public port, there is no way this endpoint will be able to execute any type of logic that modifies tokens.

All other products on the market allow this customization and don't see it as a security threat. Could you elaborate on your concern with allowing a client to specify customizations applied to its own tokens, provided that claims can only be added, not modified?

these "custom claims" are actually for the ID Tokens and Access Tokens issued as part of user-facing flows (authorization_code, implicit, ...).

At least Keycloak and OKTA 100% can apply these customizations to tokens issued for client_credentials grants.

aeneasr commented 4 years ago

Yes I did, even before opening this question. I'd appreciate if you give benefit of a doubt to community members offering help both in providing feedback and writing code.

Revisiting my comment, I chose an overly harsh tone, which was not my intention, and I am sorry about that (please keep in mind that I’m not a native speaker and Germans are known for their direct language ;) ). Your comments are thoughtful and well researched, and that is greatly appreciated, even if there is disagreement!

This issue is about token customization applied/enforced by authorization server. Unless I am mistaken you are talking about session property in /oauth2/auth/requests/consent request body submitted by consent app. This feature leaves control over token customizations to application side, not authorization server. These are two entities, often developed, provisioned and audited by different teams if not companies.

The assumption is not correct from our viewpoint. The login (authn) consent (authz) app/endpoint is part of the authorization server. It controls who, scope, audiences, and more and should of course be controlled by the same team and audited using the same process. Therefore my previous comments / misunderstanding.

As with any authorization server, there are multiple deployment strategies, some of which may not allow dynamic client registration, plus as per docs /clients shouldn't be left public anyway. Given that Hydra is already capable of enforcing audience and scopes client can have in their tokens, self-registration/self-update by the clients themselves, where they can update these constraints, is already a dubious practice.

Audience and scope are enforced by the consent endpoint and can be locked down. One problem you uncover here though is that many developers use scope as permissions, which they are not. The token scope could say „admin privileges have been granted“ even if you’re not an admin user in the system.

Given this situation and also that we don’t issue client access tokens for modifications on that client we should probably have a public endpoint that specifically does OIDC DCR. This would also allow the admin endpoint to have the ability to specify claim customization although I’d really like to avoid scripting languages - maybe though via go templates?

All other products on the market allow this customization and don't see it as a security threat. Could you elaborate on your concern with allowing a client to specify customizations applied to its own tokens, provided that claims can only be added, not modified?

If misconfiguration allows attackers to customize claim tokens, we are in deep shit as you would be if session cookies were just unencrypted/unsigned json strings. Other products do not allow this on their public facing APIs but as part of an admin flow.

What do you think?

redbaron commented 4 years ago

I think it is important to clarify that I view the Authorization server as an internal component of a bigger company infrastructure, where there are multiple first-party services using its tokens to make calls to each other. The Authorization server applies coarse control by limiting allowed audiences per client, so that individual teams developing services cannot silently grant themselves elevated access other than by issuing a PR to reconfigure Hydra.

I realise it is a different model from a "Github-like" setup, where there is a main application and multiple, possibly dynamically generated, clients all use tokens to call that "main" app and never call each other. This can be a source of misunderstanding as we see role of Hydra differently.

The login (authn) consent (authz) app/endpoint is part of the authorization server.

The consent app is a unique feature of Hydra; it allows the consent screen to be integrated into a client application itself. IMHO integrating consent doesn't make the app an inherent part of the authorization server, because moving where the consent UI is rendered shouldn't move trust. An app implementing the consent flow remains a separate entity and should be treated as such; for instance, it should still be subject to any imposed limits on what tokens it can issue. From this POV, the unrestricted session in the consent flow can be seen as a misfeature, but I am not arguing against it in this issue.

What I'd like to get instead is to be able to create customized tokens, while leaving Hydra to be an ultimate authority.

Audience and scope are enforced by the consent endpoint and can be locked down.

Audience and scope limits are configured during client registration. Can consent endpoint override them, or must use a subset of client's allowed audience and scopes? My understanding, is that complete unrestricted override is not possible, which means that Hydra remains in full control and trust was not shifted to application.

Given this situation and also that we don’t issue client access tokens for modifications on that client we should probably have a public endpoint that specifically does OIDC DCR. If misconfiguration allows attackers to customize claim tokens, we are in deep shit as you would be if session cookies were just unencrypted/unsigned json strings.

OK, let's scope this change to clients configured via the admin flow only.

This would also allow the admin endpoint to have the ability to specify claim customization although I’d really like to avoid scripting languages - maybe though via go templates?

I'd like to avoid scripts as well, that's why I suggested to use Jsonnet or any other DSL well suited to generate structured data. Great thing about CUE and Jsonnet is that simple JSON is a valid "program", therefore users can start using this feature without much effort.

Go templates operate on text, and writing templates with them is very error-prone. It is also very cumbersome (or plain impossible without introducing helper functions via template.FuncMap) to express things like list/object comprehensions; basically it fails anywhere outside of { "claim1": "{{ .Value }}"}. I am aiming at expressiveness comparable to OKTA's, while still being minimalistic and ergonomic, so Go templates are not a good fit IMHO, "full" scripts are overkill/unsafe, and Jsonnet/CUE DSLs are the best fit.
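
To illustrate the text-based nature of Go templates, here is a small sketch using only the standard library (`render` is a hypothetical helper). The simple case produces valid JSON, but a value containing a double quote is pasted in verbatim and breaks the output unless an escaping helper is added.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"text/template"
)

// render executes a text template that is meant to produce a JSON object and
// reports whether the rendered output is valid JSON.
func render(tmpl string, data any) (string, bool) {
	t := template.Must(template.New("claims").Parse(tmpl))
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return "", false
	}
	return buf.String(), json.Valid(buf.Bytes())
}

func main() {
	const tmpl = `{ "claim1": "{{ .Value }}" }`

	out, ok := render(tmpl, map[string]string{"Value": "reader"})
	fmt.Println(out, ok) // { "claim1": "reader" } true

	// text/template does no JSON escaping, so the quotes end up unescaped.
	out, ok = render(tmpl, map[string]string{"Value": `say "hi"`})
	fmt.Println(out, ok) // { "claim1": "say "hi"" } false
}
```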

aeneasr commented 4 years ago

I think it is important to clarify that I view the Authorization server as an internal component of a bigger company infrastructure, where there are multiple first-party services using its tokens to make calls to each other. The Authorization server applies coarse control by limiting allowed audiences per client, so that individual teams developing services cannot silently grant themselves elevated access other than by issuing a PR to reconfigure Hydra.

I realise it is a different model from a "Github-like" setup, where there is a main application and multiple, possibly dynamically generated, clients all use tokens to call that "main" app and never call each other. This can be a source of misunderstanding as we see role of Hydra differently.

That may be true for your understanding of an Authorization Server, but it is not true for OAuth2. The scope of OAuth2 is very clearly defined in the abstract of RFC 6749:

The OAuth 2.0 authorization framework enables a third-party application to obtain limited access to an HTTP service, either on behalf of a resource owner by orchestrating an approval interaction between the resource owner and the HTTP service, or by allowing the third-party application to obtain access on its own behalf. This specification replaces and obsoletes the OAuth 1.0 protocol described in RFC 5849.

The GitHub model is OAuth2; your understanding is not this protocol, nor is it OpenID Connect. I think it is very important to clarify this because there are many charlatans out there that try to sell you OAuth2 as the cure for everything, but let me be as clear as I can: It is not.

Google, Microsoft, GitHub, Netflix, whoever has serious security engineers in place does not use OAuth2 for first-party authn nor authz. It is a protocol for third parties and that’s all it is really good for.

The consent app is a unique feature of Hydra; it allows the consent screen to be integrated into a client application itself. IMHO integrating consent doesn't make the app an inherent part of the authorization server, because moving where the consent UI is rendered shouldn't move trust. An app implementing the consent flow remains a separate entity and should be treated as such; for instance, it should still be subject to any imposed limits on what tokens it can issue. From this POV, the unrestricted session in the consent flow can be seen as a misfeature, but I am not arguing against it in this issue.

I understand your point but that’s not the design we chose and I believe this is clarified in the docs and from the privileged rights the login/consent app have (you can literally impersonate someone if you have access to the login app).

What I'd like to get instead is to be able to create customized tokens, while leaving Hydra to be an ultimate authority.

I want to clarify that Hydra is an OAuth2 server and not a „token for session“ service. You may be better advised to use AWS KMS or something similar if you want to create/mint tokens from a central authority for internal use.

You may be disappointed by this reply but a guiding principle of our ecosystem is that we solve problems well within a clear defined boundary. If you need OAuth2, Hydra is a pretty good choice but if you need mTLS protected IPC with granular and tight employee control over a large network, you need to pick another solution. For example something like SAML for interoperability and flexibility.

Hydra can and will not replace an internal „enterprise“ zero trust network because OAuth2 is the wrong protocol to achieve that and if you follow the RFCs and real-world OAuth2 use you will probably come to the same conclusion.

Having said that, custom claims for client_credentials are still a good idea as is offering an additional endpoint for OIDC DCR!

redbaron commented 4 years ago

You may be better advised to use AWS KMS or something similar if you want to create/mint tokens from a central authority for internal use.

I intentionally tried to avoid reasons of these customizations. Clearly they are in demand by clients and clearly their use can be seen as misuse by some (or all) experts in the field. Nevertheless, it is an opportunity for Hydra not only match feature, but also get ahead of the game by making it much more ergonomic.

Having said that, custom claims for client_credentials are still a good idea as is offering an additional endpoint for OIDC DCR!

Let's discuss implementation details then.

Public OIDC DCR

I suggest using the existing admin registration code as is, but rejecting certain fields from the client definition when invoked via public DCR. Initially only the token customization configuration is to be rejected.
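
As a rough sketch of that suggestion: the public endpoint could inspect the raw payload and refuse any admin-only field before handing it to the shared registration code. The field name `token_customization` is a placeholder, and `validatePublicRegistration` is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// adminOnlyFields are client properties that may only be set through the
// admin API. "token_customization" is a placeholder name for whatever the
// token customization setting ends up being called.
var adminOnlyFields = []string{"token_customization"}

// validatePublicRegistration rejects a Dynamic Client Registration payload
// that contains any admin-only field, even if its value is empty.
func validatePublicRegistration(body []byte) error {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(body, &fields); err != nil {
		return fmt.Errorf("invalid client definition: %w", err)
	}
	for _, name := range adminOnlyFields {
		if _, present := fields[name]; present {
			return fmt.Errorf("field %q cannot be set via public registration", name)
		}
	}
	return nil
}

func main() {
	fmt.Println(validatePublicRegistration([]byte(`{"client_name":"app"}`)))
	fmt.Println(validatePublicRegistration([]byte(`{"client_name":"app","token_customization":{}}`)))
}
```

Checking for the presence of the key, not its value, avoids ambiguity about whether an empty customization counts as "set".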

Token customization configuration

What's your stance on how the configuration might look? Did you give any thought to the proposed DSLs for this task?

other flows

Do I understand it right that you'd prefer the client's token customization to be applied only in the client_credentials flow and ignored in any other flows?

aeneasr commented 4 years ago

Sorry for the late reply!

I suggest using the existing admin registration code as is, but rejecting certain fields from the client definition when invoked via public DCR. Initially only the token customization configuration is to be rejected.

Makes sense!

What's your stance on how the configuration might look? Did you give any thought to the proposed DSLs for this task?

I would like to keep it as simple as possible. Maybe we just include the metadata in the token or add another client field called custom_access_token_claims (or something similar).

Do I understand it right that you'd prefer the client's token customization to be applied only in the client_credentials flow and ignored in any other flows?

Yes, because we already have custom claims for all other flows, as explained in previous comments, and adding another way to modify the same behavior would be confusing.
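
The simple variant discussed here (a static client field, applied only to client_credentials) might be sketched like this. The `Client` struct is a trimmed-down stand-in, not Hydra's client model, and `custom_access_token_claims` is only the name floated above, not an existing field.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Client is a hypothetical, trimmed-down client record. The JSON field name
// custom_access_token_claims is the one suggested in the thread.
type Client struct {
	ID           string         `json:"client_id"`
	CustomClaims map[string]any `json:"custom_access_token_claims"`
}

// extraClaims returns the static claims to add for a token request. They are
// applied only to the client_credentials grant; other grants already get
// their custom claims from the consent flow.
func extraClaims(c Client, grantType string) map[string]any {
	if grantType != "client_credentials" {
		return nil
	}
	return c.CustomClaims
}

func main() {
	raw := `{"client_id":"billing","custom_access_token_claims":{"tier":"internal"}}`
	var c Client
	if err := json.Unmarshal([]byte(raw), &c); err != nil {
		panic(err)
	}
	fmt.Println(extraClaims(c, "client_credentials")) // map[tier:internal]
	fmt.Println(extraClaims(c, "authorization_code")) // map[]
}
```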

redbaron commented 4 years ago

I suggest use existing admin registration code as is, but reject certain fields from client definition when invoked via public DCR. Initially only token customization configuration is to be rejected.

Makes sense!

On second thought, to register a client with DCR one already has to have client credentials. Maybe make it a property of the client instead? Something like can_register_token_claims. The admin and public client registration endpoints would then remain the same.

or add another client field called custom_access_token_claims (or something similar).

This covers my use-case, but does not allow applying any logic where claims are added conditionally.

aeneasr commented 4 years ago

On another thought, to register client with DCR one already have to have a client credentials. Maybe make it a property of the client instead? Something like can_register_token_claims. Admin and public client registration endpoints remain the same then.

No, typically registration is open and a bearer token (not an access token, I think it's called registration token) is needed to make modifications to the client. Alternatively, the client id and secret could be used to make these modifications.

Additionally, having valid client credentials does not mean that you should have any authority whatsoever over anything related to the claims of the issued access tokens.

This covers my use-case, but does not allow applying any logic, where claims added conditionally.

I know, but we usually do these things in incremental changes. If there is indeed logic required and the benefits outweigh the downsides of e.g. implementing Jsonnet, then we'll consider the best path to add that feature!

overshareware commented 3 years ago

Hi, I noticed this ticket is listed on the roadmap for 1.11; can you confirm if you're going to be implementing this as described?

I've been evaluating Hydra for use in some of our infrastructure and have a very similar scenario to what @redbaron described. I'm interested in the timeline for being able to manage claims like this.

aeneasr commented 3 years ago

While I understand that a date of delivery is helpful when planning, we stopped giving out due dates for features or milestones. It puts maintainers in an unfair spot as

  1. there are many things going on in the community which need time (reviewing PRs, addressing issues, answering questions) and demand is not easily foreseeable;
  2. maintainers do a lot of things in their free time and cannot commit to delivering something;
  3. internal prioritization can shift at any point in time.

When we do give out due dates, we often encounter people who are pissed off because something was promised and not delivered "on time" (as free software ;) ). Unfortunately that is the sad truth of many interactions in this context which is why we stopped doing it altogether.

Having said that, you are in the privileged position of being part in a global open source community! If you think that you can contribute towards the particular feature, maintainers will do their best to give you the right pointers and review PRs.

If that's not an option, you can consider becoming an Ory Sponsor on Open Collective or Patreon. This helps us to employ more maintainers and technical staff - increasing velocity in development, community, interaction, and other areas! All collected money goes directly into this.

overshareware commented 3 years ago

Sorry, I wasn't trying to grill you on a date; I understand your point and appreciate all the work you are doing!

I was mostly just interested in an update on where you were at mentally with a solution to this problem, and what to do about the client_credentials claim. I really like Hydra and it fits our needs much, much better than Keycloak, etc. The one thing I'm wrestling with is the service accounts scenario, similar to what the OP described.

raman-nbg commented 2 years ago

What is the current state of this issue? This feature is mission critical for me to use ORY hydra. If you want I could try to implement this - but I will need guidance and assistance here because I'm new to go lang.

aeneasr commented 2 years ago

For anyone looking, a good place to start for the client credentials flow is here:

https://github.com/ory/hydra/blob/0a73d8be3639372fe9830a65df1334842888814b/oauth2/handler.go#L590-L627

As you can see we are setting some values in this block:

https://github.com/ory/hydra/blob/0a73d8be3639372fe9830a65df1334842888814b/oauth2/handler.go#L604-L605

And basically running that through JsonNet

raman-nbg commented 2 years ago

I have some questions regarding the feature specification:

  1. Where should someone specify the jsonnet 'program'? I think the best approach would be to define this as a deployment specific configuration. Or is there any need to define this somehow dynamically (per client, ...)? Any other ideas?
  2. In my case I want to set claims based on the client's metadata. Therefore, I need to pass the client object as a context argument to the jsonnet VM. Is this a good approach? If yes, is there also something else that we also need/want to provide as a context argument?
  3. Should we add jsonnet as a "default" dependency to hydra? Can this have any unintentional side-effects?

aeneasr commented 2 years ago

Where should someone specify the jsonnet 'program'? I think the best approach would be to define this as a deployment specific configuration. Or is there any need to define this somehow dynamically (per client, ...)? Any other ideas?

We usually allow these files to be loaded from the filesystem (file://...), from remote (http(s)://), or from base64 inline (base64://)
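
A loader for those three location styles could look roughly like the sketch below. `loadSnippet` is a hypothetical helper written against the standard library only; it is not Ory's actual loader, and the `file://` handling is simplified to stripping the prefix.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
)

// loadSnippet fetches snippet source from a file://, http(s)://, or base64://
// location. (Hypothetical helper; the scheme handling is deliberately naive.)
func loadSnippet(location string) ([]byte, error) {
	switch {
	case strings.HasPrefix(location, "base64://"):
		return base64.StdEncoding.DecodeString(strings.TrimPrefix(location, "base64://"))
	case strings.HasPrefix(location, "file://"):
		return os.ReadFile(strings.TrimPrefix(location, "file://"))
	case strings.HasPrefix(location, "http://"), strings.HasPrefix(location, "https://"):
		resp, err := http.Get(location)
		if err != nil {
			return nil, err
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return nil, fmt.Errorf("fetching %s: unexpected status %s", location, resp.Status)
		}
		return io.ReadAll(resp.Body)
	default:
		return nil, fmt.Errorf("unsupported location %q", location)
	}
}

func main() {
	// Encode a tiny snippet inline and load it back through the base64 scheme.
	encoded := base64.StdEncoding.EncodeToString([]byte(`{ roles: ["reader"] }`))
	src, err := loadSnippet("base64://" + encoded)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(src)) // { roles: ["reader"] }
}
```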

In my case I want to set claims based on the client's metadata. Therefore, I need to pass the client object as a context argument to the jsonnet VM. Is this a good approach? If yes, is there also something else that we also need/want to provide as a context argument?

Absolutely! But keep in mind that the metadata might be changed by the client themselves and it does not represent privileged information.
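
One way to picture the context argument is as a small JSON document built from the client and the request, which would then be handed to the Jsonnet VM. The shape below (`clientId`, `metadata`, `accessRequest.grantedScopes`) is an assumption modeled on the example at the top of this issue; none of these names are an agreed interface.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// accessRequest and snippetContext describe a hypothetical context object.
// Once snippets depend on it, fields can be added but not removed or retyped.
type accessRequest struct {
	GrantedScopes []string `json:"grantedScopes"`
}

type snippetContext struct {
	ClientID      string         `json:"clientId"`
	Metadata      map[string]any `json:"metadata"` // client-controlled, not privileged
	AccessRequest accessRequest  `json:"accessRequest"`
}

// buildContext serializes the context to JSON so it can be passed to the
// snippet evaluator as a single input value.
func buildContext(clientID string, metadata map[string]any, scopes []string) (string, error) {
	b, err := json.Marshal(snippetContext{
		ClientID:      clientID,
		Metadata:      metadata,
		AccessRequest: accessRequest{GrantedScopes: scopes},
	})
	return string(b), err
}

func main() {
	ctx, err := buildContext("billing", map[string]any{"roles": []string{"reader"}}, []string{"read"})
	if err != nil {
		panic(err)
	}
	fmt.Println(ctx)
}
```

Since the metadata can be changed by the client, claims derived from it should be treated as self-asserted by whoever consumes the token.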

Should we add jsonnet as a "default" dependency to hydra? Can this have any unintentional side-effects?

We can add it as a default dependency :)

github-actions[bot] commented 1 year ago

Hello contributors!

I am marking this issue as stale as it has not received any engagement from the community or maintainers for a year. That does not imply that the issue has no merit! If you feel strongly about this issue

Throughout its lifetime, Ory has received over 10.000 issues and PRs. To sustain that growth, we need to prioritize and focus on issues that are important to the community. A good indication of importance, and thus priority, is activity on a topic.

Unfortunately, burnout has become a topic of concern amongst open-source projects.

It can lead to severe personal and health issues as well as opening catastrophic attack vectors.

The motivation for this automation is to help prioritize issues in the backlog and not ignore, reject, or belittle anyone.

If this issue was marked as stale erroneously you can exempt it by adding the backlog label, assigning someone, or setting a milestone for it.

Thank you for your understanding and to anyone who participated in the conversation! And as written above, please do participate in the conversation if this topic is important to you!

Thank you 🙏✌️