strimzi / strimzi-kafka-oauth

OAuth2 support for Apache Kafka® to work with many OAuth2 authorization servers
Apache License 2.0

Unable to run example on authorization using KeyCloak #100

Open ttben opened 3 years ago

ttben commented 3 years ago

Hi there!

First, thank you for your detailed and documented examples, that helped me a lot to understand the different topics (pun intended).

I want to both authenticate my client and handle authorization using Keycloak. I followed your great examples but failed to get the second one running.

I was able to reproduce, locally, this example presenting authentication.

But I cannot reproduce this example presenting authorization.

First, running this:

Let's start up all the containers with authorization configured, and we'll then perform any manual step, and explain how everything works.

docker-compose -f compose.yml -f keycloak/compose.yml -f keycloak-import/compose.yml \
  -f kafka-oauth-strimzi/compose-authz.yml up --build

leads to:

kafka | [2021-03-24 17:00:23,262] ERROR [KafkaServer id=1] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka | org.apache.kafka.common.KafkaException: org.apache.kafka.common.config.ConfigException: Invalid value javax.net.ssl.SSLHandshakeException: PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed for configuration A client SSLEngine created with the provided settings can't connect to a server SSLEngine created with those settings.

and eventually to kafka exited with code 1.

However, I managed to start everything using different terminals. Moving on!

At the step where we produce a message to a topic, expecting it to fail because the client is not allowed to write to this topic, I got the following in a never-ending loop:

[2021-03-24 17:08:22,559] WARN [Producer clientId=console-producer] Bootstrap broker kafka:9092 (id: -1 rack: null) disconnected (org.apache.kafka.clients.NetworkClient)
[2021-03-24 17:08:22,987] ERROR [Producer clientId=console-producer] Connection to node -1 (kafka/172.18.0.4:9092) failed authentication due to: Authentication failed due to an invalid token: io.strimzi.kafka.oauth.validator.TokenValidationException: Token validation failed: Unknown signing key (kid:62f9mblAMSKdfhV8P8x7K1A71zNVrcbfY0y1kNTKx9A) (org.apache.kafka.clients.NetworkClient)

Which is not the failure we expected... right?

I also tried to start everything with SSL enabled, but was not sure how to proceed from there.

Can you help me with that? :)

Thanks!

scholzj commented 3 years ago

@mstruk Can you help?

mstruk commented 3 years ago

Looks like the example is broken. It is an issue with the REPLICATION listener's truststore certificates. Probably the certificates have expired. I'll look into it and find a fix.
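The "validity check failed" part of the PKIX error usually means a certificate in the chain is outside its notBefore/notAfter window, which fits the expired-certificate theory. Below is a minimal Python sketch of that check, not the broker's actual validation code. The certificate dates are hypothetical, and the startup time is taken from the log line above under the assumption that it is in GMT.

```python
import ssl


def is_within_validity(not_before: str, not_after: str, now: float) -> bool:
    """Return True if `now` (epoch seconds) lies inside the certificate's validity window.

    The date strings use the textual format found in certificate dumps,
    e.g. "Jan  1 00:00:00 2020 GMT".
    """
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    return start <= now <= end


# Hypothetical validity window for the example's truststore certificate.
not_before = "Jan  1 00:00:00 2020 GMT"
not_after = "Jan  1 00:00:00 2021 GMT"

# Broker startup time from the log above, assumed to be GMT.
startup = ssl.cert_time_to_seconds("Mar 24 17:00:23 2021 GMT")

print(is_within_validity(not_before, not_after, startup))
```

With these assumed dates the check fails because the startup time is after notAfter. The real dates would have to be read from the certificates shipped with the example.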

mstruk commented 3 years ago

[2021-03-24 17:08:22,559] WARN [Producer clientId=console-producer] Bootstrap broker kafka:9092 (id: -1 rack: null) disconnected (org.apache.kafka.clients.NetworkClient)
[2021-03-24 17:08:22,987] ERROR [Producer clientId=console-producer] Connection to node -1 (kafka/172.18.0.4:9092) failed authentication due to: Authentication failed due to an invalid token: io.strimzi.kafka.oauth.validator.TokenValidationException: Token validation failed: Unknown signing key (kid:62f9mblAMSKdfhV8P8x7K1A71zNVrcbfY0y1kNTKx9A) (org.apache.kafka.clients.NetworkClient)

This happens if you restart Keycloak, which runs in transient mode here: on restart it generates a fresh set of JWT signing keys and exposes them through the JWKS endpoint.

The client may have obtained a new access token while the Kafka broker has not yet refreshed the public keys from the JWKS endpoint, resulting in a mismatch. The Kafka broker automatically refreshes the JWT keys if it encounters an unknown kid, so in this case the problem self-corrects; you may just need to repeat your request a few times.

It can also happen the other way around. Your existing client may still use the refresh token or the access token issued by the previous Keycloak instance while the Kafka broker has already refreshed the keys from the JWKS endpoint, resulting in a mismatch between the private key used by Keycloak to sign the token and the published public keys (JWKS endpoint). Since the problem is on the client, you may need to configure your client with a newly obtained refresh token or access token. If you configure your client with clientId and secret, it should auto-correct itself; you just need to restart it.
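To make the mismatch concrete, here is a minimal Python sketch of the lookup behind "Unknown signing key": read the kid from the token header and check whether it appears in a JWKS document. It is an illustration only, not the Strimzi validator's code. The helper names, the fake token, and the JWKS contents are all hypothetical, and no signature is verified.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def token_kid(jwt: str) -> str:
    """Return the `kid` from the (unverified) JWT header."""
    header_segment = jwt.split(".")[0]
    return json.loads(b64url_decode(header_segment))["kid"]


def kid_is_published(jwt: str, jwks: dict) -> bool:
    """Return True if the token's key id appears in the given JWKS document."""
    published = {key["kid"] for key in jwks.get("keys", [])}
    return token_kid(jwt) in published


def make_unsigned_token(kid: str) -> str:
    """Build a fake token for illustration only (it carries no real signature)."""
    header = json.dumps({"alg": "RS256", "kid": kid}).encode()
    encoded = base64.urlsafe_b64encode(header).rstrip(b"=").decode()
    return f"{encoded}.e30.signature"


# Hypothetical JWKS as a broker might have cached it before Keycloak restarted.
stale_jwks = {"keys": [{"kid": "old-key", "kty": "RSA"}]}

# Hypothetical token issued after the restart, signed with a newly generated key.
token = make_unsigned_token("new-key")

print(kid_is_published(token, stale_jwks))  # prints False
```

A False result corresponds to the first case above (the key set is stale). The same check against a fresh JWKS with an old token corresponds to the second case.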

ttben commented 3 years ago

Thank you for your answer and your explanations!

There's something I don't get, though. When running these commands, which come from the example:

docker run -ti --rm --name kafka-cli --network docker_default strimzi/example-kafka /bin/sh

cat > ~/team-a-client.properties << EOF
security.protocol=SASL_PLAINTEXT
sasl.mechanism=OAUTHBEARER
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  oauth.client.id="team-a-client" \
  oauth.client.secret="team-a-client-secret" \
  oauth.token.endpoint.uri="http://keycloak:8080/auth/realms/kafka-authz/protocol/openid-connect/token" ;
sasl.login.callback.handler.class=io.strimzi.kafka.oauth.client.JaasClientOauthLoginCallbackHandler
EOF

export CLASSPATH=/opt/kafka/libs/strimzi/*:$CLASSPATH

bin/kafka-console-producer.sh --broker-list kafka:9092 --topic my-topic \
  --producer.config=$HOME/team-a-client.properties
First message

Kafka logs the following:

kafka        | [2021-03-29 06:44:34,407] TRACE Response body for GET http://keycloak:8080/auth/realms/demo/protocol/openid-connect/certs: {"keys":[{"kid":"ghCxjjufxYpYtw4kJgaPRhYsDrFwDZGAvlgyQ4zgvEY","kty":"RSA","alg":"RS256","use": (.....)

Note that the path points to the demo realm, whereas the producer used the following config:

  oauth.token.endpoint.uri="http://keycloak:8080/auth/realms/kafka-authz/protocol/openid-connect/token" ;

(Note that the realm is different)

How can a client specify that the realm to authenticate against is kafka-authz while the Kafka broker checks against demo?

As the signing key exists in kafka-authz and not in demo (or the other way around?), I think this causes the issue.

I tried to specify the realm to use (export REALM=kafka-authz) before starting the Kafka broker, but this resulted in kafka exiting with an error code.


Regarding your kind answer now:

This happens if you restart Keycloak, which runs in transient mode here: on restart it generates a fresh set of JWT signing keys and exposes them through the JWKS endpoint.

Hm, I understand this, but I never restarted KC... I literally just ran this example and encountered this issue.

The client may have obtained a new access token while the Kafka broker has not yet refreshed the public keys from the JWKS endpoint, resulting in a mismatch. The Kafka broker automatically refreshes the JWT keys if it encounters an unknown kid, so in this case the problem self-corrects; you may just need to repeat your request a few times.

This refresh will target a specific realm, right? So if it is using the demo realm JWKS endpoint to fetch the public keys, to check a key from a client that was supposed to be in another realm, this will fail, right?

If you configure your client with clientId and secret, it should auto-correct itself; you just need to restart it.

As I described, I used the Kafka command-line client specified in your example, which, as far as I understand, uses only the secret/id, and the problem does not seem to auto-correct itself :/

Thanks again!

mstruk commented 3 years ago

This refresh will target a specific realm, right? So if it is using the demo realm JWKS endpoint to fetch the public keys, to check a key from a client that was supposed to be in another realm, this will fail, right?

Correct.

The log that you mention is the result of a background job periodically refreshing the JWT signing keys from the JWKS endpoint configured in the listener configuration. The fact that it goes to the demo realm means you're not really using the compose-authz.yml file to start up your server or, if you are, you may have the REALM env var set to demo in the Terminal window that you use to run docker-compose. Make sure that docker-compose really starts a clean new instance of the Kafka broker rather than resuming a saved previous instance.

To clean the previous one: docker rm kafka zookeeper keycloak

With regard to the Unknown signing key error, you need to be careful when switching between examples, because they use different Keycloak realms. A different realm is like using a different authorization server: a different endpoint, as you already pointed out.

If your client uses the kafka-authz realm, then the access token it obtains cannot be used to authenticate to the demo realm. Maybe you tried the compose.yml example to run the Kafka broker, then continued with the existing client configuration to connect to the Kafka broker started using the compose-authz.yml example. That won't work. You have to change your client config to set oauth.token.endpoint.uri to http://keycloak:8080/auth/realms/kafka-authz/protocol/openid-connect/token. You have to restart your client, and if you use the oauth.access.token or oauth.refresh.token config, you have to obtain and set a new token.

On the server side, check how the JWKS endpoint is configured in your examples/docker/kafka-oauth-strimzi/compose-*.yml file.
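As a quick sanity check, the realm can be pulled out of both URIs and compared. The Python sketch below uses the two URIs that appear in this thread (the client's token endpoint and the JWKS URL from the broker's TRACE log). The `realm_of` helper is hypothetical and assumes the Keycloak-style `/realms/<name>/` path layout seen in those URIs.

```python
from urllib.parse import urlparse


def realm_of(endpoint_uri: str) -> str:
    """Extract the realm name from a Keycloak-style .../realms/<realm>/... URI.

    Raises ValueError if the path has no "realms" segment.
    """
    parts = urlparse(endpoint_uri).path.strip("/").split("/")
    return parts[parts.index("realms") + 1]


# Token endpoint from the client's properties file in this thread.
client_token_uri = "http://keycloak:8080/auth/realms/kafka-authz/protocol/openid-connect/token"

# JWKS endpoint the broker was seen fetching in the TRACE log in this thread.
broker_jwks_uri = "http://keycloak:8080/auth/realms/demo/protocol/openid-connect/certs"

client_realm = realm_of(client_token_uri)
broker_realm = realm_of(broker_jwks_uri)

if client_realm != broker_realm:
    print(f"Realm mismatch: client uses {client_realm!r}, broker validates against {broker_realm!r}")
```

If the two realms differ, tokens issued to the client are signed with keys the broker never fetches, which is consistent with the Unknown signing key error.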

I made some fixes and docs improvements in #102.

ttben commented 3 years ago

Maybe you tried the compose.yml example to run the Kafka broker, then continued with the existing client configuration to connect to the Kafka broker started using the compose-authz.yml example.

Indeed, mixed with

you may have the REALM env var set to demo in the Terminal window

My bad!

Thank you for your help and quick answers! :)


EDIT:

One last quick question: as pointed out in my initial comment, any idea why this log sometimes appears? Is it a configuration issue?

kafka | [2021-03-24 17:00:23,262] ERROR [KafkaServer id=1] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka | org.apache.kafka.common.KafkaException: org.apache.kafka.common.config.ConfigException: Invalid value javax.net.ssl.SSLHandshakeException: PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed for configuration A client SSLEngine created with the provided settings can't connect to a server SSLEngine created with those settings.

mstruk commented 3 years ago

Could you provide more of the stack trace where this happens? It looks like a truststore issue, as if the connection to the REPLICATION listener fails due to an untrusted certificate.

It should either succeed all the time or fail all the time. I don't see why or how it would sometimes work, and sometimes not without you changing the configuration in some way.

abdessamadrajad commented 3 years ago

I would like to thank you for the information provided in this issue. I am about to start this example and, like you @ttben, I want to set up authentication and authorization using Keycloak.