uber / uReplicator

Improvement of Apache Kafka Mirrormaker
Apache License 2.0

uber replicator configuration for the test environment. #223

Open Gk8143197049 opened 5 years ago

Gk8143197049 commented 5 years ago

Can I have example configurations so that we can test the lag? I am trying to evaluate uReplicator and have cloned the master branch. Where can I configure the controller starter config, and how do I configure the source cluster and target cluster for all components? Once I understand this, I can also write documentation for uReplicator.

Please let me know.

xhl1988 commented 5 years ago

By "it does not give any value", do you mean the node doesn't exist?

From the log, I didn't see any commit error.

Can you add some logs and redeploy and collect the logs?

  1. After this line, add:

     info("commitOffsetToZooKeeper: %s with %d".format(topicPartition, offset))
  2. After this line, add:

     info("updatePersistentPath: %s with %s".format(topicDirs.consumerOffsetDir + "/" + topicPartition.partition, offset.toString))
  3. Change this line to:

     info("Sending message %s, with value size %d".format(data.message(), data.message().size))
  4. After this line, add:

     info("Got callback")
Gk8143197049 commented 5 years ago

I have attached the logs. I don't see any errors in them, and I did not see any data pushed to the target cluster.

workerlog3202018.zip

xhl1988 commented 5 years ago

Hmm, it seems you didn't commit any message to ZooKeeper. Can you enable the debug log and see if you can find this log line?

    debug("validBytes=%d, sizeInBytes=%d".format(messages.validBytes, messages.sizeInBytes))

Gk8143197049 commented 5 years ago

Where can I find the debug( xxxxx) call? I was using config/log4j.properties and have uncommented the following properties:

    log4j.logger.kafka.producer.async.DefaultEventHandler=DEBUG, kafkaAppender
    log4j.logger.kafka.client.ClientUtils=DEBUG, kafkaAppender
    log4j.logger.kafka.perf=DEBUG, kafkaAppender
    log4j.logger.kafka.perf.ProducerPerformance$ProducerThread=DEBUG, kafkaAppender
    log4j.logger.org.I0Itec.zkclient.ZkClient=DEBUG
    log4j.logger.kafka=INFO, kafkaAppender

worker logs workerlogs3252019.txt

xhl1988 commented 5 years ago

Can you just enable debug for everything? Also, can you remove the GC log from the application log?

Gk8143197049 commented 5 years ago

Can I remove the logs using any parameters in log4j.properties? If yes, can you let me know the parameters, or do I have to clean it manually? I will run the controller and worker scripts tomorrow.

I really appreciate your help !! Thanks

xhl1988 commented 5 years ago

You can just set the rootLogger to debug and remove the package-specific loggers.
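For example, the change in the log4j properties file could look roughly like this (a sketch only; kafkaAppender is the appender name already used in the config quoted earlier in this thread):

    # Turn everything up to DEBUG via the root logger...
    log4j.rootLogger=DEBUG, kafkaAppender

    # ...and delete the package-specific logger lines, e.g.:
    # log4j.logger.kafka=INFO, kafkaAppender
    # log4j.logger.kafka.client.ClientUtils=DEBUG, kafkaAppender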

Gk8143197049 commented 5 years ago

The rootLogger is set to debug. Do I need to change anything in tools-log4j.properties or test-log4j.properties? Please specify the package-specific logger details. Are these the details below in log4j.properties:

    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
    log4j.appender.kafkaAppender=org.apache.log4j.DailyRollingFileAppender
    log4j.appender.kafkaAppender.DatePattern='.'yyyy-MM-dd-HH
    log4j.appender.kafkaAppender.File=${kafka.logs.dir}/server.log
    log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
    log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
    log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender
    log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd-HH
    log4j.appender.stateChangeAppender.File=${kafka.logs.dir}/state-change.log
    log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout
    log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
    log4j.appender.requestAppender=org.apache.log4j.DailyRollingFileAppender
    log4j.appender.requestAppender.DatePattern='.'yyyy-MM-dd-HH
    log4j.appender.requestAppender.File=${kafka.logs.dir}/kafka-request.log
    log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout
    log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
    log4j.appender.cleanerAppender=org.apache.log4j.DailyRollingFileAppender
    log4j.appender.cleanerAppender.DatePattern='.'yyyy-MM-dd-HH
    log4j.appender.cleanerAppender.File=${kafka.logs.dir}/log-cleaner.log
    log4j.appender.cleanerAppender.layout=org.apache.log4j.PatternLayout
    log4j.appender.cleanerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
    log4j.appender.controllerAppender=org.apache.log4j.DailyRollingFileAppender
    log4j.appender.controllerAppender.DatePattern='.'yyyy-MM-dd-HH
    log4j.appender.controllerAppender.File=${kafka.logs.dir}/controller.log
    log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout
    log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
    log4j.logger.kafka.producer.async.DefaultEventHandler=DEBUG, kafkaAppender
    log4j.logger.kafka.client.ClientUtils=DEBUG, kafkaAppender
    log4j.logger.kafka.perf=DEBUG, kafkaAppender
    log4j.logger.kafka.perf.ProducerPerformance$ProducerThread=DEBUG, kafkaAppender
    log4j.logger.org.I0Itec.zkclient.ZkClient=DEBUG
    log4j.logger.kafka=DEBUG, kafkaAppender
    log4j.logger.kafka.network.RequestChannel$=DEBUG requestAppender
    log4j.additivity.kafka.network.RequestChannel$=false
    log4j.logger.kafka.network.Processor=DEBUG, requestAppender
    log4j.logger.kafka.server.KafkaApis=DEBUG, requestAppender
    log4j.additivity.kafka.server.KafkaApis=false
    log4j.logger.kafka.request.logger=DEBUG, requestAppender
    log4j.additivity.kafka.request.logger=false
    log4j.logger.kafka.controller=DEBUG, controllerAppender
    log4j.additivity.kafka.controller=false
    log4j.logger.kafka.request.logger=DEBUG, requestAppender
    log4j.additivity.kafka.log.LogCleaner=false
    log4j.logger.state.change.logger=DEBUG, stateChangeAppender
    log4j.additivity.state.change.logger=false

xhl1988 commented 5 years ago

Are you sure your worker is using this properties file? You can specify the file by -Dlog4j.configuration=file:<your file>.

Gk8143197049 commented 5 years ago

I will make the changes accordingly. I was able to run the worker. Can I attach the configuration to the controller script or the worker script?

Below are the logs: workerlogs3262019.txt

Gk8143197049 commented 5 years ago

I have made the changes to the script accordingly. Do I need to add -Dlog4j.configuration=file: to the worker script or the controller script?

xhl1988 commented 5 years ago

> I was able to run the worker

Do you mean the replication is working now?

Yes, you can attach the configuration.

Gk8143197049 commented 5 years ago

No, I do not see any data in the target cluster. Can you check the above log? Also, where should I add the -Dlog4j.configuration: to the controller script or the worker script?

xhl1988 commented 5 years ago

You can refer to the example here. Also remove -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution.

Gk8143197049 commented 5 years ago

Do we need the javaagent? It points to /bin/libs/jmxtrans-agent-1.2.4.jar=config/jmxtrans.xml, but my config files do not contain any jmxtrans.xml. We are currently using the JVM (I can point to the Java virtual machine).

xhl1988 commented 5 years ago

Just use this:

    java -Dlog4j.configuration=file:config/tools-log4j.properties -XX:MaxGCPauseMillis=100 -XX:InitiatingHeapOccupancyPercent=45 -verbose:gc -Xmx5g -Xms5g -XX:+UseG1GC -server -cp uReplicator-Worker/target/uReplicator-Worker-1.0.0-SNAPSHOT-jar-with-dependencies.jar kafka.mirrormaker.MirrorMakerWorker --consumer.config config/consumer.properties --producer.config config/producer.properties --helix.config config/helix.properties

Gk8143197049 commented 5 years ago

Got you!! I just cloned the master repo and I see the files there. Can I use the older controller script like:

    ./uReplicator-Distribution/target/uReplicator-Distribution-pkg/bin/start-controller.sh -port 9000 -helixClusterName uReplicatorDev -zookeeper xxxxx:2181 -enableAutoWhitelist false -srcKafkaZkPath xxxxxxxxxxx:2181 -destKafkaZkPath xxxxx:2181 -graphiteHost xxxxxxxxxxxxxxxxxx -graphitePort 4756 -metricsPrefix metric_prefix &

Or is there another command for the controller script?

Do let me know. I will run it today.

Thanks

xhl1988 commented 5 years ago

You can use the old controller script; the controller part is fine. We just need the debug log from the worker.

Please change config/tools-log4j.properties to emit debug logs on your side.

Gk8143197049 commented 5 years ago

I was trying to start the worker script, and it gives the following errors on config/jmxtrans.xml. What should I give for the name prefix? Let me know what parameters to pass for jmxtrans.xml. Below is the error output:

    2019-03-28 08:10:34.191 INFO [main] org.jmxtrans.agent.JmxTransAgent - Starting 'JMX metrics exporter agent: 1.2.4' with configuration 'config/jmxtrans.xml'...
    2019-03-28 08:10:34.203 INFO [main] org.jmxtrans.agent.JmxTransAgent - PropertiesLoader: Empty Properties Loader
    2019-03-28 08:10:34.472 INFO [main] org.jmxtrans.agent.ExpressionLanguageEngineImpl - Unsupported expression ''
    2019-03-28 08:10:34.472 INFO [main] org.jmxtrans.agent.ExpressionLanguageEngineImpl - Unsupported expression ''
    2019-03-28 08:10:34.475 INFO [main] org.jmxtrans.agent.GraphitePlainTextTcpOutputWriter - GraphitePlainTextTcpOutputWriter is configured with HostAndPort{host='xxxxx', port=4756}, metricPathPrefix=_unsupported_expression_YOUR_PREFIX_unsupportedexpression, socketConnectTimeoutInMillis=500
    2019-03-28 08:10:34.486 INFO [main] org.jmxtrans.agent.JmxTransAgent - JmxTransAgent started with configuration 'config/jmxtrans.xml'
    Error: Could not find or load main class kafka.mirrormaker.MirrorMakerWorker

Gk8143197049 commented 5 years ago

Now it shows a different error:

    2019-03-28 09:19:27.725 INFO [main] org.jmxtrans.agent.JmxTransAgent - Starting 'JMX metrics exporter agent: 1.2.4' with configuration 'config/jmxtrans.xml'...
    2019-03-28 09:19:27.734 INFO [main] org.jmxtrans.agent.JmxTransAgent - PropertiesLoader: Empty Properties Loader
    2019-03-28 09:19:27.894 INFO [main] org.jmxtrans.agent.GraphitePlainTextTcpOutputWriter - GraphitePlainTextTcpOutputWriter is configured with HostAndPort{host='xxxxxxx', port=4756}, metricPathPrefix=metric_prefix, socketConnectTimeoutInMillis=500
    2019-03-28 09:19:27.906 INFO [main] org.jmxtrans.agent.JmxTransAgent - JmxTransAgent started with configuration 'config/jmxtrans.xml'
    Error: Could not find or load main class kafka.mirrormaker.MirrorMakerWorker

xhl1988 commented 5 years ago

What exact command did you use?

Gk8143197049 commented 5 years ago

    java -Dlog4j.configuration=file:config/tools-log4j.properties -XX:MaxGCPauseMillis=100 -XX:InitiatingHeapOccupancyPercent=45 -verbose:gc -Xmx1g -Xms1g -XX:+UseG1GC -server -javaagent:./bin/libs/jmxtrans-agent-1.2.4.jar=config/jmxtrans.xml -cp uReplicator-Worker/target/uReplicator-Worker-1.0.0-SNAPSHOT-jar-with-dependencies.jar kafka.mirrormaker.MirrorMakerWorker --consumer.config config/consumer.properties --producer.config config/producer.properties --helix.config config/helix.properties --dstzk.config config/dstzk.properties --topic.mappings config/topicmapping.properties

xhl1988 commented 5 years ago

Can you remove -javaagent:./bin/libs/jmxtrans-agent-1.2.4.jar=config/jmxtrans.xml?

Gk8143197049 commented 5 years ago

I ran the command and it gives me an error (do I need to keep -server in the script?):

    java -Dlog4j.configuration=file:config/tools-log4j.properties -XX:MaxGCPauseMillis=100 -XX:InitiatingHeapOccupancyPercent=45 -verbose:gc -Xmx1g -Xms1g -XX:+UseG1GC -server -cp uReplicator-Worker/target/uReplicator-Worker-1.0.0-SNAPSHOT-jar-with-dependencies.jar kafka.mirrormaker.MirrorMakerWorker --consumer.config config/consumer.properties --producer.config config/producer.properties --helix.config config/helix.properties --dstzk.config config/dstzk.properties --topic.mappings config/topicmapping.properties

    Error: Could not find or load main class kafka.mirrormaker.MirrorMakerWorker

xhl1988 commented 5 years ago

After you build the package, can you change uReplicator-Distribution/target/uReplicator-Distribution-pkg/bin/start-worker.sh line 120 to:

    exec "$JAVACMD" $JAVA_OPTS -Dapp_name=uReplicator-Worker -cp uReplicator-Distribution/target/uReplicator-Distribution-0.1-SNAPSHOT-jar-with-dependencies.jar -XX:+UseG1GC -XX:+DisableExplicitGC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m -XX:InitiatingHeapOccupancyPercent=85 -XX:+UnlockExperimentalVMOptions -XX:G1MixedGCLiveThresholdPercent=85 -XX:G1HeapWastePercent=5 \

Then start the worker with:

    ./uReplicator-Distribution/target/uReplicator-Distribution-pkg/bin/start-worker.sh --consumer.config ./config/consumer.properties --producer.config ./config/producer.properties --helix.config ./config/helix.properties

Gk8143197049 commented 5 years ago

I ran tcpdump at the target cluster (for ports 2181 and 9092) and I do not see any data transferred to the target cluster. I am surprised it is not populating any data. I see data moving into the source topic but not into the aws cluster. Please check the logs below:

workerlog3282019.txt

xhl1988 commented 5 years ago

> I see data moving into the source topic but not into aws cluster

What does that mean?

Gk8143197049 commented 5 years ago

I see the data coming into the source topic but not into the aws destination topic.

xhl1988 commented 5 years ago

Sorry, can you change that line to:

    exec "$JAVACMD" $JAVA_OPTS -Dlog4j.configuration=file:config/tools-log4j.properties -Dapp_name=uReplicator-Worker -cp uReplicator-Distribution/target/uReplicator-Distribution-0.1-SNAPSHOT-jar-with-dependencies.jar -XX:+UseG1GC -XX:+DisableExplicitGC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m -XX:InitiatingHeapOccupancyPercent=85 -XX:+UnlockExperimentalVMOptions -XX:G1MixedGCLiveThresholdPercent=85 -XX:G1HeapWastePercent=5 \

and change the rootLogger to debug in config/tools-log4j.properties?

Gk8143197049 commented 5 years ago

Yes, I have changed the logging as per the changes you described. I still do not see any data in the target cluster; the logs are below.

The zip file is large. Can I drop it somewhere else?

xhl1988 commented 5 years ago

Can you upload to your google drive and share with me?

Gk8143197049 commented 5 years ago

I just uploaded the file to Google Drive and shared it with your email. Please let me know if you do not have access to it.

xhl1988 commented 5 years ago

Your producer bootstrap servers contain:

    bootstrap.servers = [kafka01.xxxxxxxxxx.atx.xxxxxxxxxx.com:9092, kafka01.xxxxxxxxxx.aws.xxxxxxxxxx.com:9092]

However, uReplicator also consumes from kafka01.xxxxxxxxxx.atx.xxxxxxxxxx.com,9092.

Are the atx ones the same broker?

Gk8143197049 commented 5 years ago

Yes, I have a broker installed on the on-prem alx cluster (with 9092 as the broker port, with producer and consumer), and I want to send data from its topic to the aws cluster (a separate instance with its own broker, producer, and consumers).

Do I need to change the producer and consumer configurations to send data from the source cluster (alx) to the target cluster (aws)?

xhl1988 commented 5 years ago

So you put your source cluster broker in the producer config?

Gk8143197049 commented 5 years ago

I have put both the source cluster and the destination cluster in the producer config. In the consumer config, we only have the source cluster details.

Gk8143197049 commented 5 years ago

When I say the source cluster/destination cluster details, I am referring to the broker info.

xhl1988 commented 5 years ago

I believe you have known where the issue is :)

Gk8143197049 commented 5 years ago

No. I am confused about where to add the source broker info and the destination broker info (producer.properties or consumer.properties). This was my first question when I started the POC: https://github.com/uber/uReplicator/issues/223#issuecomment-472239337

Can you let me know? I will make the changes today and see if any errors come up. Thanks.

Gk8143197049 commented 5 years ago

I set both the consumer properties and the producer properties to the alx broker details. I am still not seeing anything move from the source cluster (alx) to the destination broker (aws).

Thanks

Gk8143197049 commented 5 years ago

Any update??

xhl1988 commented 5 years ago

You should only put the cluster you want to produce to in the producer config.

Gk8143197049 commented 5 years ago

So the producer config will have only the destination cluster details. What will the consumer config contain? Please guide me on this so I can start documenting.

Thanks

xhl1988 commented 5 years ago

The consumer config should only contain the source cluster info.
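Putting the thread's conclusion together, the split looks roughly like this (a sketch only: the hostnames are placeholders, and the exact property keys are assumptions based on standard Kafka configs, not quoted from the uReplicator examples):

    # config/consumer.properties -- ONLY the source (on-prem) cluster
    zookeeper.connect=source-zk.example.com:2181
    group.id=ureplicator-test

    # config/producer.properties -- ONLY the destination (aws) cluster
    bootstrap.servers=kafka01.destination.aws.example.com:9092

Listing both clusters in the producer's bootstrap.servers, as above in this thread, makes the producer treat them as one cluster, which is why no data arrived at the destination.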

xhl1988 commented 5 years ago

Does it work now?