TOSIT-IO / tdp-collection

Ansible collection to deploy the components of TDP
Apache License 2.0

Integrate Kafka with Ranger #340

Closed Nuttymoon closed 2 years ago

Nuttymoon commented 2 years ago

The Kafka role does not yet include the deployment of the Ranger-Kafka plugin. The integration of Kafka 3.x with TDP's Ranger presents multiple challenges.

Challenges

Ranger with Kafka 3.x

The Ranger-Kafka plugin implements RangerKafkaAuthorizer on top of kafka.security.auth.Authorizer. The Scala package kafka.security.auth was deprecated in favor of the Java package org.apache.kafka.server.authorizer in Kafka 2.5 (see KIP-504 and Notable changes in 2.5.0) and removed entirely in Kafka 3.0 (see Notable changes in 3.0.0).

The removal of this package makes Ranger 2.0.1 (used by TDP) and even Ranger 2.2.0 (latest official release) incompatible with Kafka 3.x.
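
To make the version reasoning above concrete, here is a minimal sketch that encodes only the thresholds stated in this issue (deprecated from Kafka 2.5, removed in 3.0). The function names are hypothetical helpers written for illustration; they are not part of tdp-collection, Ranger, or Kafka.

```python
# Hypothetical helpers for illustration only; the thresholds come from
# this issue (package deprecated in Kafka 2.5, removed in Kafka 3.0).

def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '2.8.1' into (2, 8, 1)."""
    return tuple(int(part) for part in version.split("."))


def legacy_authorizer_status(kafka_version: str) -> str:
    """Status of the Scala package kafka.security.auth for a Kafka version."""
    parsed = parse_version(kafka_version)
    if parsed >= (3, 0):
        return "removed"
    if parsed >= (2, 5):
        return "deprecated"
    return "available"


def legacy_plugin_can_load(kafka_version: str) -> bool:
    """A plugin built on kafka.security.auth.Authorizer needs the package to exist."""
    return legacy_authorizer_status(kafka_version) != "removed"


for version in ("2.4.1", "2.8.1", "3.0.0"):
    print(version, legacy_authorizer_status(version), legacy_plugin_can_load(version))
```

Under these assumptions, a plugin written against the old Scala API can still load on any Kafka 2.x broker, but not on Kafka 3.x.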

Ranger plugin patching

My first approach was to backport the Kafka plugin from the master branch of Ranger to TDP's Ranger, since on that branch RangerKafkaAuthorizer uses the new org.apache.kafka.server.authorizer Java package.

I did not manage to identify all the needed commits or to end up with clean code for the plugin.

Ranger-Kafka 3.0 plugin build

My second approach was to build the Kafka plugin from the latest commit on the master branch and deploy this plugin on the Kafka cluster.

I managed to build the plugin, run the tests without error, and deploy it. I will publish a branch ranger-3.0-TDP to TDP's Ranger repo with the code used and a branch on this repo with the Ansible playbooks.

Note: Looking back, I think that we should probably not go for this solution as it is based on the master branch and not on a release branch, and Ranger has not yet published any release in 3.x.

Functional Kafka with the Ranger 3.0 plugin

Even though I managed to build and deploy version 3.0 of the Ranger-Kafka plugin, requests to the Kafka cluster fail with a timeout once the plugin is activated.

Possible solutions

I see only 2 solutions to this:

Nuttymoon commented 2 years ago

@Edouard-R @leopaul36 @rpignolet @gboutry @mehdibn @nschung what are your thoughts on this?

Nuttymoon commented 2 years ago

Branches with my progress of Kafka-Ranger integration:

rpignolet commented 2 years ago

I think we should stay with Ranger 2.X for the first version of TDP as long as there is no official release of Ranger 3.X. If Kafka 3.X cannot work with Ranger 2.X, we either downgrade Kafka to 2.X, or we leave Kafka out of the first version of TDP and wait for a future version.

Edouard-R commented 2 years ago

I agree with Romain, let's try to downgrade Kafka to 2.8.1.

leopaul36 commented 2 years ago

I also agree with downgrading Kafka for now as Ranger is a cornerstone of the security configurations of TDP. We'll probably bump the Kafka version once Ranger releases a clean version of the plugin supporting Kafka 3.x.

Nuttymoon commented 2 years ago

Closing this issue as Kafka is moved to tdp-collection-extras.