uber / uReplicator

Improvement of Apache Kafka Mirrormaker
Apache License 2.0

Autowhitelist in Federated mode does not start automatically #339

Open disserakt opened 3 years ago

disserakt commented 3 years ago

Hi all! I run the Manager, Controller, and Worker from the master branch in Federation mode in Docker. They start up and replicate data, but there are a few things I do not understand. I set the controller config with -enableAutoWhitelist=true and -enableAutoTopicExpansion=true.

Point 1: When I run uReplicator Federated for the first time, the controller does not add topics that have the same name in the src and dst clusters to the auto-whitelist, and data is not replicated automatically. But when I add one topic to the whitelist with a POST to the REST API (curl -X POST "http://manager:8100/topics/ureplicator-fed?src=cluster1&dst=cluster2"), all topics with the same names in the src and dst clusters start replicating. By the way, after I removed this topic ureplicator-fed from the whitelist, the auto-whitelist kept working fine: new topics are added to it automatically and their data is replicated.

How can I make the auto-whitelist start working and replicating data right after uReplicator Federated starts, without this workaround?

Point 2: When I delete a topic in the dst cluster (I use the standard command: ./kafka-topics --zookeeper federation-zk01:2181 --delete --topic ureplicator-fed), the topic is deleted in that cluster. But when I delete the same topic in the src cluster, it is recreated, because that Kafka cluster has auto.create.topics.enable=true; perhaps the consumers recreate it. When I list the auto-whitelisted topics in the controller through the REST API (curl http://controller:9000/topics), the deleted topic is still in the auto-whitelist. By the way, the topic list in the manager (curl http://manager:8100/topics) was empty at the same time.

How can I delete a topic from this auto-whitelist, so that I can delete topics from both Kafka clusters, src and dst, without them being recreated?
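The mismatch described in Point 2 can be captured in one place by querying both endpoints together. This is a minimal sketch, not part of uReplicator: the function names are hypothetical, the host names and ports are the ones used in this thread, and the response bodies are printed as-is because their exact format is not shown here.

```shell
# Sketch only: print the /topics view of the controller and of the manager.
# Host names and ports are taken from this thread; adjust them as needed.
CONTROLLER_URL="${CONTROLLER_URL:-http://controller:9000}"
MANAGER_URL="${MANAGER_URL:-http://manager:8100}"

# Fetch /topics from one base URL ($1). The body is passed through unmodified.
fetch_topics() {
  curl -fsS "$1/topics"
}

# Print both views one after the other so they can be compared by eye.
dump_topic_views() {
  for base in "$CONTROLLER_URL" "$MANAGER_URL"; do
    echo "== ${base}/topics =="
    fetch_topics "$base" || echo "(request failed)"
    echo
  done
}
```

After sourcing the snippet, calling dump_topic_views right after deleting a topic shows whether the controller still lists it while the manager does not.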

Point 3: In the manager logs I see the following error: ERROR Validate WRONG: hostInfo: controller:9000, InstanceId: controller-fed, route: [{topic: @cluster1@cluster2, partition: 0, pipeline: null}], topic only in controller: [ureplicator-fed-psi, ureplicator-fed-dev] (com.uber.stream.kafka.mirrormaker.manager.core.ControllerHelixManager)

How can I fix it, if it needs fixing at all? Replication works fine while this error occurs.

Manager container start args:

docker run --name manager --restart unless-stopped --network ureplicator-net \
-d -it -p 8100:8100 \
ryuananev/ureplicator manager \
-config config/fed-staging-clusters.properties \
-srcClusters cluster1 \
-destClusters cluster2 \
-deployment ureplicator-fed-staging \
-env staging.ureplicator-fed-staging \
-enableRebalance true \
-zookeeper federation-zk-staging:2181/ureplicator-fed-staging \
-managerPort 8100 \
-instanceId manager-fed \
-graphiteHost graphite-test \
-graphitePort 2003 \
-metricsPrefix ureplicator-manager

Manager java runtime args:

java -Dlog4j.configuration=file:/opt/ureplicator/config/tools-log4j.properties -Xms512m -Xmx512m -server -cp /opt/ureplicator/ureplicator-manager/target/uReplicator-Manager-2.0.1-SNAPSHOT-jar-with-dependencies.jar:/opt/ureplicator/libs/* com.uber.stream.kafka.mirrormaker.manager.ManagerStarter -config /opt/ureplicator/config/fed-staging-clusters.properties -srcClusters cluster1 -destClusters cluster2 -deployment ureplicator-fed-staging -env staging.ureplicator-fed-staging -enableRebalance true -zookeeper federation-zk-staging:2181/ureplicator-fed-staging -managerPort 8100 -instanceId manager-fed -graphiteHost graphite-test -graphitePort 2003 -metricsPrefix ureplicator-manager

Controller container start args:

docker run --name controller --restart unless-stopped --network ureplicator-net \
-d -it -p 9000:9000 \
ryuananev/ureplicator controller \
-config config/fed-staging-clusters.properties \
-enableFederated true \
-deploymentName ureplicator-fed-staging \
-srcClusters cluster1 \
-destClusters cluster2 \
-mode customized \
-zookeeper federation-zk-staging:2181/ureplicator-fed-staging \
-port 9000 \
-hostname controller \
-helixClusterName ureplicator-fed-staging \
-env staging.ureplicator-fed-staging \
-groupId ureplicator-fed-staging \
-offsetRefreshIntervalInSec 5 \
-autoRebalancePeriodInSeconds 10 \
-refreshTimeInSeconds 10 \
-enableAutoWhitelist true \
-enableAutoTopicExpansion true \
-patternToExcludeTopics '^__.*' \
-graphiteHost graphite-test \
-graphitePort 2003 \
-metricsPrefix ureplicator-controller \
-instanceId controller-fed

Controller java runtime args:

java -Dlog4j.configuration=file:/opt/ureplicator/config/tools-log4j.properties -Xms512m -Xmx512m -server -cp /opt/ureplicator/ureplicator-controller/target/uReplicator-Controller-2.0.1-SNAPSHOT-jar-with-dependencies.jar:/opt/ureplicator/libs/* com.uber.stream.kafka.mirrormaker.controller.ControllerStarter -config /opt/ureplicator/config/fed-staging-clusters.properties -enableFederated true -deploymentName ureplicator-fed-staging -srcClusters cluster1 -destClusters cluster2 -mode customized -zookeeper federation-zk-staging:2181/ureplicator-fed-staging -port 9000 -hostname controller -helixClusterName ureplicator-fed-staging -env staging.ureplicator-databusfed-staging -groupId ureplicator-fed-staging -offsetRefreshIntervalInSec 5 -autoRebalancePeriodInSeconds 10 -refreshTimeInSeconds 10 -enableAutoWhitelist true -enableAutoTopicExpansion true -patternToExcludeTopics ^__.* -graphiteHost graphite-test -graphitePort 2003 -metricsPrefix ureplicator-controller -instanceId controller-fed

I would be glad to hear any advice =)

yangy0000 commented 3 years ago

Hello, sorry for the late response, I overlooked this question.

No1/ Yes, you always need to whitelist at least one topic to enable auto whitelist.

No2/ No, with uReplicator in auto whitelist mode you can't delete the topic. But if the topic doesn't exist in the destination cluster, uReplicator will not replicate its data.

No3/ Apparently this is a false alarm. I will fix this.
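Since No1 means one topic always has to be whitelisted by hand, that step can be scripted to run once after the manager comes up. The following is a minimal sketch under stated assumptions: it reuses the POST request quoted earlier in this thread, the function names and environment variables are hypothetical, and it assumes the seed topic exists in both clusters, as ureplicator-fed did in the original report.

```shell
# Sketch only: seed the whitelist with one topic so that auto whitelist starts.
# Host, port, topic, and cluster names come from this thread; adjust them.
MANAGER_URL="${MANAGER_URL:-http://manager:8100}"
SEED_TOPIC="${SEED_TOPIC:-ureplicator-fed}"
SRC_CLUSTER="${SRC_CLUSTER:-cluster1}"
DST_CLUSTER="${DST_CLUSTER:-cluster2}"

# Build the same URL as the curl command quoted earlier in the thread.
seed_url() {
  printf '%s/topics/%s?src=%s&dst=%s\n' \
    "$MANAGER_URL" "$SEED_TOPIC" "$SRC_CLUSTER" "$DST_CLUSTER"
}

# POST the seed topic, retrying while the manager is still starting up.
# $1 = maximum attempts (default 30); RETRY_SLEEP = seconds between attempts.
seed_whitelist() {
  attempts="${1:-30}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if curl -fsS -X POST "$(seed_url)" >/dev/null; then
      echo "whitelisted ${SEED_TOPIC} after ${i} attempt(s)"
      return 0
    fi
    sleep "${RETRY_SLEEP:-2}"
    i=$((i + 1))
  done
  echo "manager did not accept ${SEED_TOPIC} after ${attempts} attempts" >&2
  return 1
}
```

After sourcing the snippet, run seed_whitelist once the manager and controller containers are started. The retry loop is there only because the manager may not be listening yet right after docker run returns.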

disserakt commented 3 years ago

@yangy0000 - thank you very much for such a detailed answer =) I have one more question: can uReplicator connect to ZooKeeper over SSL and authenticate to it with SASL? That is, I am interested in uReplicator itself authenticating against ZooKeeper.

uReplicator can connect to Kafka via SSL and SASL, but to work it also needs to connect to the ZooKeeper that the Kafka cluster uses (this is controlled by the parameter kafka.cluster.zkStr.cluster1=zk1,zk2/cluster1 in the cluster list config file used by the manager and controller). If this ZooKeeper works with the Kafka cluster over SSL and SASL, can uReplicator also connect to it over SSL and SASL? Unfortunately, I did not find such a configuration example.

As I understand it, this cannot be done at the moment. Is it possible to add this capability to uReplicator in some way?