uber / uReplicator

Improvement of Apache Kafka Mirrormaker
Apache License 2.0

federation mode: Failed add new topic (No available worker!) #320

Closed soraka-hu closed 3 years ago

soraka-hu commented 4 years ago

Hi, I am using federation mode. I copied the first topic successfully (cluster2 --> dest), but adding the second topic (cluster1 --> dest) fails with an error. The request was:

`curl -X POST "10.114.25.XX:9015/topics/topic2?src=cluster1&dst=dest"`

The error is:

`{"message":"Failed add new topic: topic2 from: cluster1 to: dest due to exception: java.lang.Exception: No available worker!","status":"500"}`
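For reference, the same request URL can be assembled in code so that the `&` separator between `src` and `dst` cannot be mistyped. This is a minimal sketch that assumes only the endpoint shape visible in the curl command above (`POST /topics/<topic>?src=<cluster>&dst=<cluster>`); the class name `AddTopicRequestSketch` and the method `addTopicUri` are made up for illustration and are not part of uReplicator.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;

// Illustrative helper only: builds the manager URL used to add a topic to a route.
class AddTopicRequestSketch {

    // managerHostPort is a placeholder such as "localhost:9015".
    static URI addTopicUri(String managerHostPort, String topic, String src, String dst) {
        // Query parameters are joined with '&'; each value is form-encoded.
        String query = "src=" + encode(src) + "&dst=" + encode(dst);
        // Kafka topic names only use ASCII letters, digits, '.', '_' and '-',
        // so the topic is placed in the path unchanged.
        return URI.create("http://" + managerHostPort + "/topics/" + topic + "?" + query);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://localhost:9015/topics/topic2?src=cluster1&dst=dest
        System.out.println(addTopicUri("localhost:9015", "topic2", "cluster1", "dest"));
    }
}
```

When the URL is typed by hand in a shell, keeping it inside quotes (as in the curl command) stops the shell from treating `&` as a background operator.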

My settings are as follows.

Manager: java -Dlog4j.configuration=file:config/tools-log4j.properties -Xms3g -Xmx3g -Xmn512m -XX:NewSize=512m -XX:MaxNewSize=512m -XX:+AlwaysPreTouch -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+DisableExplicitGC -XX:+PrintCommandLineFlags -XX:CMSInitiatingOccupancyFraction=80 -XX:SurvivorRatio=2 -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCTimeStamps -Xloggc:/tmp/ureplicator-manager/gc-ureplicator-manager.log -server -cp uReplicator-Manager/target/uReplicator-Manager-1.0.0-SNAPSHOT-jar-with-dependencies.jar com.uber.stream.kafka.mirrormaker.manager.ManagerStarter -config config/clusters.properties -srcClusters cluster1,cluster2 -destClusters cluster3 -enableRebalance false -zookeeper zk1,zk2,zk3/ureplicator/testing-dc1 -managerPort <port> -deployment testing-dc1 -env dc1.testing-dc1 -instanceId <id> -controllerPort <port> -graphiteHost 127.0.0.1 -graphitePort 4756 -metricsPrefix ureplicator-manager -workloadRefreshPeriodInSeconds 300 -initMaxNumPartitionsPerRoute 1500 -maxNumPartitionsPerRoute 2000 -initMaxNumWorkersPerRoute 10 -maxNumWorkersPerRoute 80

Controller: java -Dlog4j.configuration=file:config/tools-log4j.properties -Xms3g -Xmx3g -Xmn512m -XX:NewSize=512m -XX:MaxNewSize=512m -XX:+AlwaysPreTouch -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+DisableExplicitGC -XX:+PrintCommandLineFlags -XX:CMSInitiatingOccupancyFraction=80 -XX:SurvivorRatio=2 -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCTimeStamps -Xloggc:/tmp/ureplicator-controller/gc-ureplicator-controller.log -server -cp uReplicator-Controller/target/uReplicator-Controller-1.0.0-SNAPSHOT-jar-with-dependencies.jar com.uber.stream.kafka.mirrormaker.controller.ControllerStarter -config config/clusters.properties -srcClusters cluster1,cluster2 -destClusters cluster3 -enableFederated true -deploymentName testing-dc1 -mode customized -zookeeper zk1,zk2,zk3/ureplicator/testing-dc1 -port <port> -env dc1.testing-dc1 -instanceId <id> -hostname <hostname> -graphiteHost 127.0.0.1 -graphitePort 4756 -metricsPrefix ureplicator-controller -enableAutoWhitelist false -enableAutoTopicExpansion true -autoRebalanceDelayInSeconds 120 -autoRebalancePeriodInSeconds 120 -autoRebalanceMinIntervalInSeconds 600 -autoRebalanceMinLagTimeInSeconds 900 -autoRebalanceMinLagOffset 100000 -autoRebalanceMaxOffsetInfoValidInSeconds 1800 -autoRebalanceWorkloadRatioThreshold 1.5 -maxDedicatedLaggingInstancesRatio 0.2 -maxStuckPartitionMovements 3 -moveStuckPartitionAfterMinutes 20 -workloadRefreshPeriodInSeconds 300 enableSrcKafkaValidation true -maxWorkingInstances 0 -autoRebalanceDelayInSeconds 120 -refreshTimeInSeconds 600 -initWaitTimeInSeconds 120 -numOffsetThread 10 -blockingQueueSize 30000 -offsetRefreshIntervalInSec 300 -backUpToGit false -localBackupFilePath /tmp/ureplicator-controller -localGitRepoClonePath /ureplicator-controller-bkp

Worker: java -Dlog4j.configuration=file:config/tools-log4j.properties -XX:MaxGCPauseMillis=100 -XX:InitiatingHeapOccupancyPercent=45 -verbose:gc -Xmx5g -Xms5g -XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -server -javaagent:./bin/libs/jmxtrans-agent-1.2.4.jar=config/jmxtrans.xml -cp uReplicator-Worker/target/uReplicator-Worker-1.0.0-SNAPSHOT-jar-with-dependencies.jar kafka.mirrormaker.MirrorMakerWorker --cluster.config config/clusters.properties --consumer.config config/consumer.properties --producer.config config/producer.properties --helix.config config/helix.properties --dstzk.config config/dstzk.properties

@yangy0000 @xhl1988 I look forward to your reply

xhl1988 commented 4 years ago

@soraka-hu how many workers do you have in the worker deployment?

The route for the first topic is cluster2 --> dest; the route for the second topic is cluster1 --> dest, which is a new route and requires extra workers. Each route has a minimum worker requirement. You can add more workers and try again.

soraka-hu commented 4 years ago

Thank you for your reply.

I have 3 workers in my deployment. You said "Each route will have a min worker requirement". Is there a parameter that controls it? If so, which one?

xhl1988 commented 4 years ago

Sorry, it's hard-coded: https://github.com/uber/uReplicator/blob/master/uReplicator-Manager/src/main/java/com/uber/stream/kafka/mirrormaker/manager/core/ControllerHelixManager.java#L115

Feel free to make it configurable if you have time.

soraka-hu commented 4 years ago

Thank you again for your reply. Is there no function to allocate workers automatically? I think this method is rigid and unreasonable.

xhl1988 commented 4 years ago

Could you please elaborate on your question a bit more? We hard-coded the batch size to 5 because it makes sense in a production environment. Making it configurable would definitely be more flexible.
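To make the failure mode in this thread concrete, here is a toy model in Java. It is not the uReplicator implementation: it assumes that a new route claims up to one batch of idle workers (5, the batch size quoted above) and that the request fails when no idle worker is left. Whether the real manager accepts a partial batch is not confirmed in this thread; that assumption is chosen only because it reproduces the reported behavior with 3 workers. The class `RouteWorkerSketch` and its members are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of per-route worker assignment; NOT the real ControllerHelixManager logic.
class RouteWorkerSketch {
    // Batch size quoted in the thread; in uReplicator it is a hard-coded constant.
    static final int WORKER_BATCH_SIZE = 5;

    private final Deque<String> idleWorkers = new ArrayDeque<>();
    private final Map<String, List<String>> workersByRoute = new HashMap<>();

    RouteWorkerSketch(List<String> workers) {
        idleWorkers.addAll(workers);
    }

    // Returns the workers serving the route, claiming a batch if the route is new.
    List<String> addTopic(String src, String dst) {
        String route = src + " --> " + dst;
        List<String> assigned = workersByRoute.get(route);
        if (assigned != null) {
            return assigned; // existing route: reuse its workers
        }
        if (idleWorkers.isEmpty()) {
            throw new IllegalStateException("No available worker!");
        }
        // Assumption: a new route takes up to one batch, or fewer if the pool is smaller.
        assigned = new ArrayList<>();
        while (assigned.size() < WORKER_BATCH_SIZE && !idleWorkers.isEmpty()) {
            assigned.add(idleWorkers.poll());
        }
        workersByRoute.put(route, assigned);
        return assigned;
    }

    public static void main(String[] args) {
        RouteWorkerSketch pool = new RouteWorkerSketch(Arrays.asList("w1", "w2", "w3"));
        System.out.println(pool.addTopic("cluster2", "dest")); // prints [w1, w2, w3]
        try {
            pool.addTopic("cluster1", "dest"); // new route, but no idle workers remain
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints No available worker!
        }
    }
}
```

Under this model, the first route absorbs all 3 workers, so the second route cannot start until more workers join the deployment, which matches the advice to add workers and retry.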

soraka-hu commented 4 years ago

Thank you again for your reply. OK, I understand.

soraka-hu commented 3 years ago

Hi,

I have already put uReplicator federation mode online, but there is an ERROR in the log.