uber / uReplicator

Improvement of Apache Kafka Mirrormaker
Apache License 2.0

ERROR Failed to get topics from kafka zk: {} (com.uber.stream.kafka.mirrormaker.common.core.KafkaBrokerTopicObserver) #348

Open amohammed3 opened 2 years ago

amohammed3 commented 2 years ago

Hi there,

I am doing a small PoC to learn more about uReplicator and how it replicates topics and messages.

I am new to Kafka and Uber's uReplicator, and I am trying to deploy federated uReplicator. I guess I have not configured it properly, and a few things do not make sense to me:

1) I have cloned all the git repos (master, 0.1.0, 1.0), but after running mvn clean package, the jar files it creates have different versions than the ones in the quick start guide commands, especially the start commands for federated mode.
2) If I am correct, in federated mode we can have more than one src cluster and more than one dest cluster, right?
3) I have learned that one uReplicator can replicate between one src and one dest cluster, but that is for non-federated mode. In federated mode we can have multiple src clusters and multiple target clusters. Is that correct?
4) If I am correct, uReplicator does not replicate topic configurations, unlike MM2 from Kafka, and it does not replicate topic offsets?

In my case, I want to start uReplicator in federated mode so that I can add more clusters to it later. I want all topics to replicate from the source machine (single-node Kafka and ZooKeeper) to the target machine (single-node Kafka and ZooKeeper). I am starting the manager, controller and worker all on the target machine, so the machine running federated uReplicator is also the replication destination. However, I see this error when I start the manager: image

The manager keeps running and does not stop, but it throws this error. When I create a topic and produce some messages on the source machine, I cannot consume those messages from the target machine.

Here are my logs:

manager: image

controller: image

Worker: image

Here are my configuration files: clusters.properties image

consumer.properties: image

producer.properties: image

helix.properties: image

my start commands:

manager: java -Dlog4j.configuration=file:config/tools-log4j.properties -Xms3g -Xmx3g -Xmn512m -XX:NewSize=512m -XX:MaxNewSize=512m -XX:+AlwaysPreTouch -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+DisableExplicitGC -XX:+PrintCommandLineFlags -XX:CMSInitiatingOccupancyFraction=80 -XX:SurvivorRatio=2 -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCTimeStamps -Xloggc:./logs/gc-ureplicator-manager.log -server -cp uReplicator-Manager/target/uReplicator-Manager-2.0.2-SNAPSHOT-jar-with-dependencies.jar com.uber.stream.kafka.mirrormaker.manager.ManagerStarter -config config/clusters.properties -srcClusters cluster1 -destClusters cluster3 -enableRebalance false -zookeeper localhost:2181 -managerPort 9000 -deployment c1-c2 -env dc1.c1-c2 -instanceId 100 -graphiteHost 127.0.0.1 -graphitePort 4756 -workloadRefreshPeriodInSeconds 300 -initMaxNumPartitionsPerRoute 1500 -maxNumPartitionsPerRoute 2000 -initMaxNumWorkersPerRoute 10 -maxNumWorkersPerRoute 80

controller: java -Dlog4j.configuration=file:config/tools-log4j.properties -Xms3g -Xmx3g -Xmn512m -XX:NewSize=512m -XX:MaxNewSize=512m -XX:+AlwaysPreTouch -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+DisableExplicitGC -XX:+PrintCommandLineFlags -XX:CMSInitiatingOccupancyFraction=80 -XX:SurvivorRatio=2 -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCTimeStamps -Xloggc:./logs/gc-ureplicator-controller.log -server -cp uReplicator-Controller/target/uReplicator-Controller-2.0.2-SNAPSHOT-jar-with-dependencies.jar com.uber.stream.kafka.mirrormaker.controller.ControllerStarter -config config/clusters.properties -srcClusters cluster1 -destClusters cluster3 -enableFederated true -deploymentName c1-c2 -env dc1.c1-c2 -mode customized -zookeeper localhost:2181 -port 9100 -instanceId 1 -hostname swarm1 -enableAutoWhitelist true -enableAutoTopicExpansion true -autoRebalanceDelayInSeconds 120 -autoRebalancePeriodInSeconds 120 -autoRebalanceMinIntervalInSeconds 600 -autoRebalanceMinLagTimeInSeconds 900 -autoRebalanceMinLagOffset 100000 -autoRebalanceMaxOffsetInfoValidInSeconds 1800 -autoRebalanceWorkloadRatioThreshold 1.5 -maxDedicatedLaggingInstancesRatio 0.2 -maxStuckPartitionMovements 3 -moveStuckPartitionAfterMinutes 20 -workloadRefreshPeriodInSeconds 300 -patternToExcludeTopics ^__.* -enableSrcKafkaValidation true -consumerCommitZkPath "" -maxWorkingInstances 0 -autoRebalanceDelayInSeconds 120 -refreshTimeInSeconds 600 -initWaitTimeInSeconds 120 -numOffsetThread 10 -blockingQueueSize 30000 -offsetRefreshIntervalInSec 300 -backUpToGit false -localBackupFilePath ./logs/ureplicator-controller -localGitRepoClonePath ./logs/ureplicator-controller-bkp

worker: java -Dlog4j.configuration=file:config/test-log4j.properties -XX:MaxGCPauseMillis=100 -XX:InitiatingHeapOccupancyPercent=45 -verbose:gc -Xmx1g -Xms1g -XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:gc-ureplicator-worker.log -server -cp uReplicator-Worker/target/uReplicator-Worker-2.0.2-SNAPSHOT-jar-with-dependencies.jar com.uber.stream.ureplicator.worker.WorkerStarter -federated_enabled true -cluster_config config/clusters.properties -consumer_config config/consumer.properties -producer_config config/producer.properties -helix_config config/helix.properties
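Two small details in the controller command above can be checked mechanically: `-patternToExcludeTopics ^__.*` is passed unquoted, and `-autoRebalanceDelayInSeconds` appears twice. The Python sketch below is only an illustration. It assumes the exclusion pattern is applied as a regular expression against topic names (not confirmed in this thread), and the topic names and the shortened command string are made up for the example.

```python
import re
import shlex
from collections import Counter


def excluded_topics(pattern, topics):
    """Return the topics whose names match the exclusion regex."""
    rx = re.compile(pattern)
    return [t for t in topics if rx.match(t)]


def duplicate_flags(command):
    """Return the flags (tokens starting with '-') that occur more than once."""
    tokens = shlex.split(command)
    counts = Counter(t for t in tokens if t.startswith("-"))
    return sorted(flag for flag, n in counts.items() if n > 1)


# Hypothetical topic names; only names starting with two underscores match.
sample_topics = ["__consumer_offsets", "orders", "_schemas", "payments"]

# Shortened, illustrative excerpt of a controller command line.
sample_command = (
    '-autoRebalanceDelayInSeconds 120 -patternToExcludeTopics "^__.*" '
    "-maxWorkingInstances 0 -autoRebalanceDelayInSeconds 120"
)

if __name__ == "__main__":
    print(excluded_topics(r"^__.*", sample_topics))
    print(duplicate_flags(sample_command))
```

Quoting the pattern (for example `"^__.*"`) keeps the shell from attempting glob expansion on it. The repeated flag carries the same value both times, so it is probably harmless, but removing one copy makes the command easier to read.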

I am really sorry for my novice questions, but I would really appreciate it if someone could explain what is wrong with my setup.

regards, Abdul Hai Mohammed

yangy0000 commented 2 years ago

The configuration looks correct to me. Can you verify whether config/clusters.properties contains the zkStr/bootstrap server config for both the src and dst clusters? Also, can you try pulling the latest master and using the examples in the user guide to start uReplicator?
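The check suggested above can be scripted. The following Python sketch parses a properties file and reports which cluster entries are missing. The key prefixes `kafka.cluster.zkStr.` and `kafka.cluster.servers.` are an assumption about how clusters.properties is laid out, and the host names are placeholders, so compare both against the user guide and your own file before relying on it.

```python
def load_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_cluster_keys(
    props,
    clusters,
    # Assumed key prefixes; adjust to match your clusters.properties.
    prefixes=("kafka.cluster.zkStr.", "kafka.cluster.servers."),
):
    """Return every expected key that is absent or empty."""
    return [p + c for c in clusters for p in prefixes if not props.get(p + c)]


# Placeholder content: cluster3 deliberately lacks its servers entry.
SAMPLE = """
# source cluster
kafka.cluster.zkStr.cluster1=src-host:2181
kafka.cluster.servers.cluster1=src-host:9092
# destination cluster
kafka.cluster.zkStr.cluster3=localhost:2181
"""

if __name__ == "__main__":
    props = load_properties(SAMPLE)
    print(missing_cluster_keys(props, ["cluster1", "cluster3"]))
```

The cluster names passed in should be the same ones given to `-srcClusters` and `-destClusters` (cluster1 and cluster3 in the commands above).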

amohammed3 commented 2 years ago

Thank you @yangy0000 for replying. Yes, the config/clusters.properties file has the zkStr server config for both the src and dst clusters. My dst is localhost and my src is another machine that has Kafka running. Do I need to start uReplicator on the src too?

I have been using the master branch. The only place where the example command does not work is starting the worker. First, the jar files generated from the master branch have different versions (such as 2.0.2) than what is shown in the example command (1.0.0). Even when I use the jar generated on my side, it throws an error saying something like class not found, and when I searched for that class, it was not in the master branch.

After a while, I tried downloading the worker from another branch and starting it with a different command that I found in one of the issues; that is the command I posted above. I am able to run the worker now, but topics and messages are still not replicated, and I still get the manager-side error that I posted above.
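One way to sidestep the version mismatch between the guide and a local build is to look up the built jar instead of hard-coding its version. This is a minimal Python sketch that assumes the `<module>/target/<module>-<version>-jar-with-dependencies.jar` naming seen in the start commands earlier in this thread; the helper name `find_fat_jar` is made up for illustration.

```python
from pathlib import Path


def find_fat_jar(module_dir):
    """Return the jar-with-dependencies built under <module_dir>/target.

    If several versions are present, the lexicographically last file name is
    returned, which is not guaranteed to be the newest build.
    """
    module = Path(module_dir)
    pattern = f"{module.name}-*-jar-with-dependencies.jar"
    jars = sorted((module / "target").glob(pattern))
    if not jars:
        raise FileNotFoundError(f"no jar matching {pattern} in {module / 'target'}")
    return jars[-1]


if __name__ == "__main__":
    # Run from the repository root after `mvn clean package`.
    for name in ("uReplicator-Manager", "uReplicator-Controller", "uReplicator-Worker"):
        try:
            print(find_fat_jar(name))
        except FileNotFoundError as exc:
            print(exc)
```

The printed path can then be passed to `java -cp` in place of the version-specific file name from the guide. This only addresses the jar name; it does not help with the class-not-found error, which depends on the main class that exists in the branch that was built.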

yangy0000 commented 2 years ago

got it, let me try out the master branch examples first.

yangy0000 commented 2 years ago

I tried the same manager startup command you provided (only changing cluster3 to cluster2) and I can't reproduce the "failed to get topics from zk" error. A few questions: 1) Are you using secured ZooKeeper clusters? 2) What is your broker/ZK version?
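Since the source cluster in this setup lives on a different machine, it may also be worth ruling out plain network reachability from the host that runs the manager. The Python sketch below only tests whether a TCP connection can be opened to each host:port in a ZooKeeper-style connection string. It says nothing about authentication or versions, and the connection string in the usage section is a placeholder.

```python
import socket


def parse_endpoints(conn_str):
    """Split 'host1:2181,host2:2181/chroot' into [(host, port), ...]."""
    hosts_part = conn_str.split("/", 1)[0]
    endpoints = []
    for item in hosts_part.split(","):
        host, _, port = item.strip().rpartition(":")
        endpoints.append((host, int(port)))
    return endpoints


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder: replace with the zkStr value of the source cluster.
    for host, port in parse_endpoints("localhost:2181"):
        state = "reachable" if can_connect(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```

Running it on the target machine against the source cluster's ZooKeeper and broker addresses shows quickly whether a firewall or a wrong host name is involved before looking at security settings.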

Sorry, I missed your questions:

1) "i have cloned all the git repos (master, 0.1.0, 1.0 ) but after i use mvn clean package, the jar files its creating have some different versions than i see in the quick start guide commands, especially the start commands for federated one."
Sorry for that, I will fix it.

2) "if i am correct, in the federated mode we can have more than one src clusters and more than one dest clusters right?"
Yes.

3) "i have learned that one uReplicator can replicate between one src and one dest cluster. But this is for non-federated. In the federated mode we can have multiple src clusters and multiple target clusters. Is that correct?"
Yes.

4) "If i am correct, uReplicator does not replicate topic configurations unlike mm2 from kafka, it does not replicate topic offsets?"
Yes.