haproxytech / haproxy-lua-oauth

JWT Validation implementation for HAProxy Lua host
Apache License 2.0
105 stars 50 forks source link

Add Audience(s) to Variable #18

Closed pfzetto closed 3 years ago

pfzetto commented 3 years ago

When using HaProxy as an API Gateway it might be required to have differnet Audiences for different backends. This Commit added a Variable called txn.audience. This can be used to validate the Audience by an ACL Rule.

Bsp.: http-request deny unless { var(txn.audience) -m sub " audienceA " }

The leading and trailing whitespaces are there to prevent false flags by the substring method. Bsp.: http-request deny unless { var(txn.audience) -m sub "audienceA" }
audienceA -> pass
fooaudienceA -> pass
audienceAbar-> pass
fooaudienceAbar -> pass
http-request deny unless { var(txn.audience) -m sub " audienceA " }
audienceA -> pass
fooaudienceA -> deny
audienceAbar-> deny
fooaudienceAbar -> deny

This change only works when no Audience is set as the Env Variable.

NickMRamirez commented 3 years ago

Thanks @pfz4 I will take a look at your changes.

NickMRamirez commented 3 years ago

Multiple audiences seems like something that would be good to support, but I am not sure that this is the best approach because we already have a function named audienceIsValid is the Lua code, and it seems convoluted to:

  1. Set no audience environment variable to have no value
  2. Perform the validation outside of the Lua code as an ACL, when there's already the audienceIsValid function built in.

Instead, I propose:

  1. We should change the audienceIsValid function so that it can accept one or multiple audience values. The easiest way to do that is probably to allow the audience environment variable to be a comma-delimited list of values, and then in the Lua code split the string apart into a table with the core.tokenize function. Then, search the table to see if the user-supplied audience is in the list. In the haproxy-lua-cors library, I do exactly that in the get_allowed_origin function. That function also calls a function named contains to search the table for a value (although it uses a regex, which I don't think we'd need to do in this case).

  2. We can also publish the audience as a variable so that it's available to ACLs for additional processing, but I think that the audienceIsValid function should do the primary testing of the value.

NickMRamirez commented 3 years ago

As for validating a substring...if we do this, then it would need to be clear that that is going to happen, such as by having the user put an asterisk onto the end or beginning of the string. In the past, I have not been in favor of adding stuff like that because it's not part of the spec, and I would prefer to keep the library simple.

pfzetto commented 3 years ago

Ok, I will change the audienceIsValid Function an add a custom HaProxy Function to Validadte the Audience (https://www.haproxy.com/de/blog/5-ways-to-extend-haproxy-with-lua/):

core.register_fetches("validateAudience", function(txn, aud)
    -- Verify Code
    return false
end)

http-request deny if { lua.validateAudience("audienceA") -m bool }

This way we don't have the substring matching problems

pfzetto commented 3 years ago

Implemented Multiple Audiences (seperated by , in config) Only one of the audiences in the token has to be in the config setenv OAUTH_AUDIENCE audience1,audience2,audience3

Added an validateAudience Fetch The validateAudienceFetch is used like that in HaProxy: http-request deny unless { lua.validateAudience("audience") -m bool }

pfzetto commented 3 years ago

@NickMRamirez Implemented the requested changes

NickMRamirez commented 3 years ago

Hmm, the changes seem more complex than I thought they would be. I'll see if I can condense this down. Also, I want to see if I can expose all of the claims in a token as variables, since there's value in being able to see, for example, the client ID. We already store the "scope" claim, so the framework is there, and we shouldn't need to add a new fetch to get them.

pfzetto commented 3 years ago

I haven't found a way to Match for a Array Variable in Haproxy other than Substring Matching, which can lead to false positives. The Code is so complex to Support multiple Input capes: Token Aud Array, validation Input array Token Aud Array, validation Input string Token Aud String, validation Input Array Token Aud String, validation Input string

Maybe I can look into converting the string in size 1 Arrays before checking

NickMRamirez commented 3 years ago

@pfz4 I tested this code with the following combinations:

It converts' aud' and OAUTH_AUDIENCE to tables, even when there is only one value. Then it loops through those tables to compare if they contain a match.

I think that's all of the possible combinations.

local function contains(items, test_str)
    for _,item in pairs(items) do

      -- strip whitespace
      item = item:gsub("%s+", "")
      test_str = test_str:gsub("%s+", "")

      if item == test_str then
        return true
      end
    end

    return false
end

local function audienceIsValid(token, expectedAudienceParam)

    -- Convert OAUTH_AUDIENCE environment variable to a table,
    -- even if it contains only one value
    local expectedAudiences = expectedAudienceParam
    if type(expectedAudiences) == "string" then
      expectedAudiences = core.tokenize(expectedAudienceParam, ",")
    end

    -- Convert 'aud' claim to a table, even if it contains only one value
    local receivedAudiences = token.payloaddecoded.aud
    if type(token.payloaddecoded.aud) == "string" then
      receivedAudiences = core.tokenize(token.payloaddecoded.aud, ",")
    end

    for _, receivedAudience in ipairs(receivedAudiences) do
      if contains(expectedAudiences, receivedAudience) then
        return true
      end
    end

    return false
end

Next I will investigate returning all of the token claims as variables that you can access in the haproxy.cfg.

pfzetto commented 3 years ago

@NickMRamirez looks good! But I would change receivedAudiences = core.tokenize(token.payloaddecoded.aud, ",") to Match specification:

From: https://openid.net/specs/openid-connect-core-1_0.html

aud REQUIRED. Audience(s) that this ID Token is intended for. It MUST contain the OAuth 2.0 client_id of the Relying Party as an audience value. It MAY also contain identifiers for other audiences. In the general case, the aud value is an array of case sensitive strings. In the common special case when there is one audience, the aud value MAY be a single case sensitive string.

Will you keep the validateAudience Fetch?

NickMRamirez commented 3 years ago

Do you mean, change the code to match the specification so that it adds the client_id to the audience?

It must be left up to the auth token providers to send a valid token. Since the token is signed, I won't be able to alter it, such as to add the client ID to the audience.The token will be passed to the backend servers as the Authorization header as-is. But, we can add code to parse the information into variables so that HAProxy can read it.

Interestingly, I don't think that all of the auth providers send the Client ID in the audience. For example, Keycloak does that, but you can remove it. Auth0 doesn't, as far as I remember.

We won't need the ValidateAudience fetch, since the validation logic will all be kept contained in the audienceIsValid function.

NickMRamirez commented 3 years ago

This function seems to work for setting the variables.

local function setVariablesFromPayload(txn, decodedPayload)
    for key, value in pairs(decodedPayload) do
        txn:set_var("txn.oauth." .. key, dump(value))
    end
end

Then you can access / log the variables like this in the haproxy.cfg: var(txn.oauth.aud)

For my token, it captures these fields:

iss:   http://localhost/auth/realms/weather-services
iat:   1629516895
azp:   acme-corp
scope: bronze
jti:   3ef984ac-4daf-4756-b2cb-1a3d65f3a20a
clientAddress:   172.25.0.11
clientId:   acme-corp
clientHost:   172.25.0.11
aud:   { [1] = http://localhost/api/payment-services,[2] = http://localhost/api/weather-services,} 
acr:   1
exp:   1629517195
sub:   c2d10329-416b-4bd9-9c5c-c84f305ccb58

If this works for you, I will commit it.

pfzetto commented 3 years ago

@NickMRamirez

Do you mean, change the code to match the specification so that it adds the client_id to the audience?

Sorry for not being clear about what i meant. The aud Claim is either a case sensitive Array or an case sensitive String. Technically a Audience like foo,bar could be possible. So I would change receivedAudiences = core.tokenize(token.payloaddecoded.aud, ",") to

receivedAudiences ={}
receivedAudiences[0] = token.payloaddecoded.aud

This way the code matches the specification of the AUD Claim.

Then you can access / log the variables like this in the haproxy.cfg: var(txn.oauth.aud)

How would you use that variable in an acl? I don't know a acl method like for arrays like contains. Because of that I exposed the validateAudience Function to HaProxy ACLs with the Fetch.

NickMRamirez commented 3 years ago

Bear with me as I think this through...

If I understand you correctly, you are asking whether it is incorrect to treat a comma as a delimiter when it appears in a single string audience value, since, theoretically, a token could send an 'aud' that is a string, contains a comma, but is meant to be one value. The specification that you linked to is for OpenID Connect, but this Lua library is for OAuth 2.0 only (so far). So, I looked at this OAuth 2.0 document to learn what it says about how to treat audiences.

It says that the audience needs to be an absolute URI that refers to the resource server.

The question is, can an absolute URI contain a comma, and still be considered a single value? According to RFC3986, the answer is yes. A comma is a reserved character that can be used as a delimiter.

So, I think you are correct. I should not tokenize a comma-delimited string, but instead should leave it as a single string. Your way seems better:

receivedAudiences ={}
receivedAudiences[0] = token.payloaddecoded.aud

I will update my code.

As for your second question...

How would you use that variable in an acl?

Help me to understand what you want to do in the ACL, because it seems like the audienceIsValid function handles all validation. What is left to do in an ACL?

Also, if you need to parse the 'aud' variable, you would need to treat it as a string. So, you can use any of the string matching operators in HAProxy. The string will look like this:

{ [1] = http://localhost/api/payment-services,[2] = http://localhost/api/weather-services,} 

Do you prefer a different format for a table that is converted to a string?

pfzetto commented 3 years ago

Help me to understand what you want to do in the ACL, because it seems like the audienceIsValid function handles all validation. What is left to do in an ACL?

I use one HaProy HA Cluster to handle all my API Routing. So it think it would be nice if it would be possible to restrict api access as far as possible prior to the api endpoints. My Config would look something like this (using fetch):

acl api1 hdr(host) -i api1.example.com
acl api1 lua.validateAudience("api1.example.com") -m bool
acl api1 var(txn.oauth_scopes) -m sub api1.read
use_backend api1 if api1

acl api2 hdr(host) -i api2.example.com
acl api2 lua.validateAudience("api2.example.com") -m bool
acl api2 var(txn.oauth_scopes) -m sub api2.read
use_backend api2 if api2

...

I liked the Fetch function because it validates the absolute values and leaves no error for false positives. For example in a public auth envoirement. But I guess if the URLs are surrounded by a character that can't be used in urls it would also work. ~https://api1.example.com~https://api2.example.com~ Personally I like the Fetch more but both would work.

NickMRamirez commented 3 years ago

You can use the variable I will create.

use_backend weatherservice if { var(txn.oauth.aud) -m sub http://localhost/api/weather-services ) }

I have also read that some people use scopes alone to restrict access. Then you could make 'aud' generic: http://localhost/api/.

NickMRamirez commented 3 years ago

Hmmm....maybe that won't work because if the received audience has two audiences, then the first ACL will always win (it will always go to weatherservice). Then again the fetch would have the same problem. You would need to check the host header, like in your example, to be sure where they are trying to go.

In the end, I think that the host header would tell you where they are trying to go, but the 'aud' doesn't add much value as a way to restrict. The token either contains an allowed audience or it doesn't. But it isn't exclusive if there are multiple values. It doesn't tell you any more by calling a fetch or adding it as a variable check. The only way to be exclusive about an audience and restrict access by it would be to issue two different tokens. That is, if the token contains multiple audiences.

Scopes would further refine the access level though.

pfzetto commented 3 years ago

I would See the scope and audience validation as a way to preauthenticate requests.

My only concern with the Variable solution is that there are false positives like var(txn.oauth.aud) -m http://example.com, that would also be true for http://example.com.malicious.example

Using aud in ACL would be perfect when having multiple apis over the same haproxy instance. So you would add the multiple audiences to the global filter and the filter again for the different backends.

NickMRamirez commented 3 years ago

I don't think there's much chance of a malicious audience, since the token will be coming from your trusted token issuer server, which will be verified by checking the cryptographic signature on the token. The signature is the linchpin in determining "do we trust this token?". After that, we trust all the claims and are just verifying that it's the correct token for the service they're calling.

But tampering with the token is discovered early on by the signature check.

NickMRamirez commented 3 years ago

Here are my changes, if you want to try them to see if they will work for your use case.

https://github.com/haproxytech/haproxy-lua-oauth/pull/20

NickMRamirez commented 3 years ago

Closing because the changes were done in https://github.com/haproxytech/haproxy-lua-oauth/pull/20.