oauth2_introspection - Githubissues

llorllale commented 3 years ago

Describe the bug

The oauth2_introspection authenticator expects the oauth2 access_token to be in plain form in the Authorization header. In contrast, OAuth2 Bearer Token Usage specifies the token should be base64-encoded in the header (similar to the HTTP basic authorization scheme).

Reproducing the bug

Steps to reproduce the behavior:

Enable the oauth2_introspection authenticator in the global config:

authenticators:
  oauth2_introspection:
    enabled: true
    config:
      introspection_url: https://hydra.example.org:4445/oauth2/introspect

Enable the authenticator in an access rule.
Invoke the HTTP endpoint protected by the access rule. Construct the Authorization header as per RFC6750.

Expected behavior

Given a valid access_token, the request should be authorized.

Environment

Version: v0.38.4-alpine
Environment: alpine

Additional context

The issue can be worked around by sending the access_token in plain form.
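To make the two header forms concrete, here is a minimal Python sketch. The token value is only a sample (an assumption, not a real credential), and the "server" side is reduced to reading the header value as-is, which is the behavior described in this report.

```python
import base64

# Sample token value (assumption for illustration only)
access_token = "mF_9.B5f-4.1JqM"

# Form described in the report: the client base64-encodes the token first
encoded_header = "Bearer " + base64.b64encode(access_token.encode("ascii")).decode("ascii")

# Workaround: the client sends the token exactly as it was issued
plain_header = "Bearer " + access_token

# A server that reads the header value as-is sees two different tokens
received_encoded = encoded_header.split(" ", 1)[1]
received_plain = plain_header.split(" ", 1)[1]

print(received_plain == access_token)    # True
print(received_encoded == access_token)  # False
```

Only the plain form matches the token the authorization server issued, so only that form can be introspected successfully by a server that does no decoding.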

aeneasr commented 3 years ago

Thank you for investigating this. However, I do not think that this is a correct reading of the spec. The token characters need to be base64 vocabulary, but there is no encoding involved. ORY Oathkeeper is behaving correctly here.
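As a rough illustration of "base64 vocabulary, no encoding", the sketch below checks a token against the b64token grammar from RFC 6750 Section 2.1 and then places it in the header unchanged. The function names are hypothetical and this is not Oathkeeper code.

```python
import re

# b64token grammar from RFC 6750 Section 2.1:
#   b64token = 1*( ALPHA / DIGIT / "-" / "." / "_" / "~" / "+" / "/" ) *"="
B64TOKEN_RE = re.compile(r"[A-Za-z0-9\-._~+/]+=*")


def is_b64token(token: str) -> bool:
    """Return True if the token only uses characters allowed by b64token."""
    return B64TOKEN_RE.fullmatch(token) is not None


def bearer_header(token: str) -> str:
    """Build the header value with the token sent as-is (no encoding step)."""
    if not is_b64token(token):
        raise ValueError("token contains characters outside b64token")
    return f"Bearer {token}"


print(bearer_header("mF_9.B5f-4.1JqM"))  # Bearer mF_9.B5f-4.1JqM
print(is_b64token("abc{def}"))           # False
```

Under this reading the grammar constrains which characters may appear in the token; it does not require the client to transform the token.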

llorllale commented 3 years ago

Hi @aeneasr

On the topic of the specs:

RFC6750 says:

The syntax of the "Authorization" header field for this scheme follows the usage of the Basic scheme defined in Section 2 of [RFC2617].

From section 2 of RFC2617:

To receive authorization, the client sends the userid and password, separated by a single colon (":") character, within a base64 [7] encoded string in the credentials.
 basic-credentials = base64-user-pass
 base64-user-pass  = <base64 [4] encoding of user-pass,
                     except not limited to 76 char/line>
 user-pass   = userid ":" password
 userid      = *<TEXT excluding ":">
 password    = *TEXT
Userids might be case sensitive.

If the user agent wishes to send the userid "Aladdin" and password "open sesame", it would use the following header field:
 Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

Note how Aladdin:open sesame is base64-encoded to the value QWxhZGRpbjpvcGVuIHNlc2FtZQ==.
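The Basic scheme example quoted above can be reproduced with Python's standard library. This sketch covers only the RFC2617 Basic case, where the encoding step is explicit in the grammar.

```python
import base64

# userid and password from the RFC2617 example, joined by a single colon
user_pass = "Aladdin:open sesame"

encoded = base64.b64encode(user_pass.encode("ascii")).decode("ascii")
header = f"Authorization: Basic {encoded}"

print(header)  # Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```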

llorllale commented 3 years ago

If you look at the access_token syntax (RFC6749):

A.12. "access_token" Syntax

The "access_token" element is defined in Sections 4.2.2 and 5.1:
 access-token = 1*VSCHAR

Where VSCHAR is defined at the start of appendix A as:

VSCHAR = %x20-7E

Which according to this ASCII table includes characters outside the base64 encoding space such as curly braces, vertical pipe, angled brackets, etc.
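The character-set comparison can be checked with a short Python sketch. It assumes "base64 encoding space" means the standard base64 alphabet (A-Z, a-z, 0-9, "+", "/") plus the "=" padding character.

```python
import string

# VSCHAR = %x20-7E, as quoted above from RFC6749 Appendix A
VSCHAR = {chr(code) for code in range(0x20, 0x7F)}

# Standard base64 alphabet plus "=" padding (assumption stated in the lead-in)
BASE64_ALPHABET = set(string.ascii_letters + string.digits + "+/=")

# Characters an access_token may contain that base64 output never contains
outside = sorted(VSCHAR - BASE64_ALPHABET)

print(len(VSCHAR))   # 95
print(len(outside))  # 30
print("".join(outside))
```

The resulting set includes the curly braces, vertical pipe, and angle brackets mentioned above, along with the space character and other punctuation.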

aeneasr commented 3 years ago

I am 100% confident that the client does not need to do any base64 encoding before sending the access token in the Authorization: Bearer <token> header. No API and no certified client is doing that anywhere. It is possible that the server returns a base64 encoded token to the client, and the client uses that token - without caring about its format. But the client is not responsible for en- or decoding this.

If we were to change this, we would break any and all existing integrations. I am sure that this is a misunderstanding of the sections in the various RFCs. These RFCs are often vague and lack detail and one has to look very deep into all of the linked documents and language. I can only tell you what we have been doing for 5+ years, and what others in this field are doing - which is what is currently implemented.

llorllale commented 3 years ago

I will attempt to provide a bit more of a balanced answer here.

TLDR: RFC6750 specifies the OAuth token needs to be base64-encoded (RFC2045), however most/all resource servers that support the OAuth2 Bearer Token usage process the token in plain form in the Authorization header. See for example Microsoft Identity Platform or GitHub's Authorizing OAuth Apps.
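A minimal sketch of what "processing the token in plain form" can look like on the resource server side: the token is taken from the header unchanged and forwarded in an RFC 7662 introspection request. The function names are hypothetical, the URL is the one from the config in this issue, and the request is only built, never sent. This is not Oathkeeper's actual implementation.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Introspection endpoint from the config shown in this issue
INTROSPECTION_URL = "https://hydra.example.org:4445/oauth2/introspect"


def extract_bearer_token(authorization: str) -> str:
    """Return the token from an Authorization header value, unmodified."""
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        raise ValueError("not a Bearer Authorization header")
    return token.strip()


def build_introspection_request(authorization: str) -> Request:
    """Build (but do not send) an RFC 7662 token introspection request."""
    token = extract_bearer_token(authorization)
    # RFC 7662 sends the token as a form-encoded "token" parameter
    body = urlencode({"token": token}).encode("ascii")
    return Request(
        INTROSPECTION_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_introspection_request("Bearer mF_9.B5f-4.1JqM")
print(request.data)  # b'token=mF_9.B5f-4.1JqM'
```

With this approach there is no decoding step between the header and the introspection call, so a client that base64-encodes the token first would have an unknown token introspected.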

RFC6750: OAuth 2.0 Bearer Token Usage

This is the normative source of truth for how to use OAuth 2.0 Bearer tokens. As I've said in my previous comments above, Section 2.1, as written, indicates that the Authorization header's syntax follows the usage of the Basic scheme defined in RFC2617 Section 2. The syntax for base64-user-pass in RFC2617 references RFC 2045 - Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. Working our way backwards:

RFC2045 defines a mapping for base64-encoding content
RFC2617 references this mapping when defining the encoding of the Basic HTTP authentication scheme
RFC6750 in turn references RFC2617, declaring its "syntax follows the usage of the Basic scheme"

It should not be a surprise to anyone that a developer reading these RFCs may conclude that the token must be base64-encoded using the mapping defined in RFC2045 and referenced by RFC2617.

The fact that RFC6749 defines a syntax for the access_token with a wider character space than that defined in RFC2617 reinforces the understanding that the token must be base64-encoded (i.e. "projected" into the character space defined in RFC6750) in the Authorization header.

The reader may furthermore ask: why would an RFC conflate the concept of a "Bearer" token with its format? From the terminology of RFC6750 (again, the normative spec for OAuth 2.0 Bearer Token Usage):

A security token with the property that any party in possession of the token (a "bearer") can use the token in any way that any other party in possession of it can. Using a bearer token does not require a bearer to prove possession of cryptographic key material (proof-of-possession).

Yes, that is the definition and common understanding of what a "bearer" token is. This definition is completely divorced from - and does not mention at all - the token's format.

Why do implementations accept the token as-is?

I can only speculate with the limited information at my disposal.

I suspect part of the reason is one of performance and use of resources: don't have the client (potentially resource-constrained) base64-encode a token that may already be encoded by the AS, thus sending an even bigger object in the header.

I suspect another reason is the lack of consistency and precision in some areas of the text. The normative text says one thing, yet the examples say another.

I think RFC6750 could have been written in clearer and more precise language. It would not have been difficult to copy some of the language used in RFC6749 Section 7.1 over into RFC6750. There could have been separate sections for implementors of client systems and of authorization servers. I suspect the editors tried to cover both sides with a single terse section. The result is misinterpretations like this.

RFC 6749 - The OAuth 2.0 Authorization Framework Section 7.1 provides an example of how "bearer" tokens are used, and it says the token is simply included in the request (i.e. not base64-encoded). Note that this is not the normative spec for OAuth 2.0 Bearer Token usage. Even the example in Section 2 of RFC6750 is not base64-encoded.

aeneasr commented 3 years ago

Thank you for the great write-up! The problem with unclear messaging and specs in IETF RFCs is pervasive, especially around OAuth2, the OpenID specs, and the JWT spec.

ory / oathkeeper

oauth2_introspection #597