linkedin / dynamometer

A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
BSD 2-Clause "Simplified" License

Some questions about the result of WorkLoad #95

Closed Drsuperman closed 5 years ago

Drsuperman commented 5 years ago

```
[mr@redhat143 dynamometer-fat-0.1.5]$ ./bin/start-workload.sh \
    -Dauditreplay.input-path=hdfs:///dyno/audit_input_logs/ \
    -Dauditreplay.output-path=hdfs:///dyno/audit_output_logs/ \
    -Dauditreplay.log-start-time.ms=1554247070151 \
    -Dauditreplay.num-threads=1 \
    -nn_uri hdfs://redhat142:9000/ \
    -start_time_offset 1m \
    -mapper_class_name AuditReplayMapper
2019-04-03 08:08:56,771 INFO com.linkedin.dynamometer.workloadgenerator.WorkloadDriver: The workload will start at 1554250196743 ms (2019/04/03 08:09:56 CST)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/ZDH/parcels/lib/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/mr/dynamometer-fat-0.1.5/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/mr/dynamometer-fat-0.1.5/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/mr/dynamometer-fat-0.1.5/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-04-03 08:09:04,118 INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl: Timeline service address: http://redhat143:8188/ws/v1/timeline/
2019-04-03 08:09:06,712 INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat: Total input paths to process : 1
2019-04-03 08:09:07,156 INFO org.apache.hadoop.mapreduce.JobSubmitter: number of splits:1
2019-04-03 08:09:07,591 INFO org.apache.hadoop.mapreduce.JobSubmitter: Submitting tokens for job: job_1554243591539_0010
2019-04-03 08:09:07,799 INFO org.apache.hadoop.conf.Configuration: found resource resource-types.xml at file:/etc/zdh/yarn/conf.zdh.yarn/resource-types.xml
2019-04-03 08:09:08,703 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1554243591539_0010
2019-04-03 08:09:08,819 INFO org.apache.hadoop.mapreduce.Job: The url to track the job: http://redhat142:8088/proxy/application_1554243591539_0010/
2019-04-03 08:09:08,820 INFO org.apache.hadoop.mapreduce.Job: Running job: job_1554243591539_0010
2019-04-03 08:09:28,561 INFO org.apache.hadoop.mapreduce.Job: Job job_1554243591539_0010 running in uber mode : false
2019-04-03 08:09:28,564 INFO org.apache.hadoop.mapreduce.Job: map 0% reduce 0%
```
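As an aside on the timing: the start time the driver logs (1554250196743 ms) is a Unix epoch timestamp in milliseconds, reached by adding the `-start_time_offset 1m` to the submission time (08:08:56 + 1 minute = 08:09:56 CST). A quick sketch of the conversion, using the value from the log above (CST here is UTC+8, so 08:09:56 CST is 00:09:56 UTC):

```python
from datetime import datetime, timezone

# Start time printed by WorkloadDriver, in epoch milliseconds.
start_ms = 1554250196743

# Convert to a UTC datetime; the log line renders the same instant in CST.
start_utc = datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc)
print(start_utc.isoformat())
```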

```
[mr@redhat143 ~]$ hdfs dfs -ls /dyno/audit_output_logs
Found 2 items
-rw-r-----   3 mr users     0 2019-04-03 16:49 /dyno/audit_output_logs/_SUCCESS
-rw-r-----   3 mr users   295 2019-04-03 16:49 /dyno/audit_output_logs/part-r-00000
[mr@redhat143 ~]$ hdfs dfs -cat /dyno/audit_output_logs/part-r-00000
mr,READ,OPEN,-1,-3812586358584700876
mr,WRITE,CREATE,-1,-3089982714344429856
mr,WRITE,DELETE,67108863,357792779
mr,WRITE,MKDIRS,8796093022207,943151732492469
mr,WRITE,RENAME,70368744177663,521322769738249
mr,WRITE,SETPERMISSION,-1,-6855717651319934733
mr,WRITE,SETREPLICATION,16777215,161654944
```

I have some questions:

  1. Is part-r-00000 the result of WorkLoad?
  2. What does the result mean? And why are there negative numbers?

I'm looking forward to your reply. Thanks!

csgregorian commented 5 years ago

Hey @Drsuperman! Yep, `part-r-00000` is the result of the benchmark: each output line is in the format `user,type,operation,numops,cumulativelatency`.
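For anyone post-processing that file, a minimal parsing sketch based on the format described above (the latency unit is assumed to be milliseconds here, and the per-op average is only meaningful once the counts are correct):

```python
# Parse audit-replay result lines of the form:
#   user,type,operation,numops,cumulativelatency
# numops is the number of operations of that kind that were replayed;
# cumulativelatency is their total latency (assumed milliseconds).

def parse_line(line):
    user, rw_type, op, numops, cumulative = line.strip().split(",")
    return user, rw_type, op, int(numops), int(cumulative)

def avg_latency(numops, cumulative):
    # Guard against non-positive counts (buggy releases emitted bogus values).
    return cumulative / numops if numops > 0 else None

line = "mr,WRITE,DELETE,67108863,357792779"
user, rw_type, op, numops, cumulative = parse_line(line)
print(op, numops, avg_latency(numops, cumulative))
```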

However, I messed up the tracking of `numops` and `cumulativelatency` in #92, so the numbers are wrong (there shouldn't be any negative numbers). I've got a fix up at #96, along with documentation of the output format; once that's released, the output should be accurate :)

Drsuperman commented 5 years ago

Got it! Thanks for your reply and your work :) @csgregorian