aerogear / keycloak-metrics-spi

Adds a Metrics Endpoint to Keycloak
Apache License 2.0
549 stars 156 forks source link

Add option to require bearer token authentication on the endpoint #138

Open sirkrypt0 opened 2 years ago

sirkrypt0 commented 2 years ago

Motivation

The issues #39 and #119 regard authentication on the metrics endpoint. While the configuration using the path-prefix WildFly configurations worked okayish, the new Quarkus distribution does not provide such a feature (as far as I know) leading to unprotected endpoints.

What

Since Prometheus supports authentication using bearer token obtained via the OAuth 2 client credentials grant (see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#oauth2), this PR adds exactly this protection option.

By enabling the bearerEnabled setting, authentication on the metrics endpoint using valid Bearer tokens can now be enforced. A client requesting the metrics endpoint must set the Authorization: Bearer header with a valid token obtained from Keycloak. The token must originate from the realm configured by the realm setting (defaults to master) and must have the role configured in the role setting (defaults to prometheus-metrics).

Furthermore, the second commit enforces that any of the authentication options is enabled, or, even though not recommended, authentication is explicitly disabled. Note that this is likely a breaking change.

Why

The new Quarkus distribution is not an application server anymore and hence does not support protecting routes using the WildFly configuration. Moreover, while this option worked okayish (some configurations still allowed access using workarounds such as URL encoding the word metrics), actual authentication using bearer tokens provides a more robust, integrated protection.

Furthermore, an authentication setting is enforced by default, as this should prevent users from accidentally exposing the metrics endpoint to the public.

How

I added 4 new configuration options: disableAuthentication, bearerEnabled, realm, and role.

If bearer authentication is enabled, we use the integrated AuthResult from Keycloak and validate whether the requesting user (which can also be a client service account) is part of the required role.

Verification Steps

I described the necessary configuration steps in the README in the Bearer Authentication section. If those instructions are unclear or do not suffice, I'm happy to extend the guide.

You can also use a Bash script like the following to test the interaction manually without a running Prometheus instance (note that is requires jq to be installed).

#!/bin/bash

BASE="https://localhost:8443"
REALM="master"
CLIENT_ID="prometheus"
CLIENT_SECRET="<client-secret>"

token=$(curl --insecure -q -X POST "$BASE/auth/realms/$REALM/protocol/openid-connect/token" \
 -d 'grant_type=client_credentials' \
 -d "client_id=$CLIENT_ID" -d "client_secret=$CLIENT_SECRET" \
 | jq -r .access_token)

curl --insecure -vv -H "Authorization: Bearer $token" \
  "$BASE/auth/realms/$REALM/metrics"

Checklist:

sirkrypt0 commented 1 year ago

@pb82 I kindly wanted to ask if this is of interest and whether I could do something to get this merged :)

pb82 commented 1 year ago

@sirkrypt0 Hey, sorry for the late reply. This looks really good and I believe is an important feature to add. I'll try to follow your verification steps. Have you tested this against the latest release of Keycloak?

sirkrypt0 commented 1 year ago

@pb82 Thanks for taking the time :) This runs with the latest release (20.0.2) on my instance and I did not notice any issues.

francescjp commented 1 year ago

Hello, we're interested in this option too.

sirkrypt0 commented 1 year ago

I rebased to get the latest changes from the 3.0.0 release and tested it with Keycloak 21.1.1

sirkrypt0 commented 1 year ago

I just rebased to get the changes from the 4.0.0 release. Also tested it with Keycloak 22.0.1.

pb82 commented 1 year ago

@sirkrypt0 I see that authenticationDisable is set to false by default. Will this change the default behaviour after we merge this?

sirkrypt0 commented 1 year ago

@sirkrypt0 I see that authenticationDisable is set to false by default. Will this change the default behaviour after we merge this?

Yes, this would likely be a breaking change, but would avoid that users expose their metrics endpoint accidentally (which I have seen a couple in the past).

However, if you think that this is a bad idea, I can also change it such that the current default behaviour is kept and no breaking change is introduced :)

sirkrypt0 commented 1 year ago

Additionally, if we introduce this as a breaking change, we could also think about renaming the DISABLE_EXTERNAL_ACCESS variable (I currently kept it as is to be at least compatible with that setting). I think that the current name is misleading/too broad, as it doesn't really disable external access per se, but rather checks that a certain HTTP header is not present. Hence, a name like CHECK_X_FORWARDED_HEADER or some other name might be more appropriate.

pb82 commented 1 year ago

@sirkrypt0 I would prefer to not introduce a breaking change, but what you said also makes sense. Maybe let's do the following:

  1. Update this PR to not introduce a breaking change and merge it
  2. Create a new issue about the insecure default and get a few more opinions and then eventually update the defaults.

What do you think?

sirkrypt0 commented 1 year ago

Sounds reasonable to me. I'll update the PR.