linkedin / dynamometer

A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
BSD 2-Clause "Simplified" License

Modify AuditReplay workflow to output count and latency of operations #92

Closed csgregorian closed 5 years ago

csgregorian commented 5 years ago

Created a composite output type, CountTimeWritable, that tracks the count of operations in addition to their cumulative time. It also adds another level of granularity by reporting the operation name rather than only the operation type. This gives more insight into per-operation behaviour and makes it possible to calculate the average latency per operation instead of only the total.
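For readers unfamiliar with the idea, a composite count-and-time value could look roughly like the sketch below. This is not the PR's actual CountTimeWritable: the class name `CountTimeSketch`, its field names, and its methods are hypothetical, and it only mirrors the `write`/`readFields` shape of Hadoop's Writable interface using plain `java.io` types so that it compiles without Hadoop on the classpath.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical stand-in for a count + cumulative-time value.
// Assumption: both values fit in a long; the time unit is not stated in the PR.
class CountTimeSketch {
    private long count;
    private long time;

    CountTimeSketch() {
        // No-arg constructor so an instance can be populated via readFields.
    }

    CountTimeSketch(long count, long time) {
        this.count = count;
        this.time = time;
    }

    // Merge another partial result into this one (what a reducer would do).
    void add(CountTimeSketch other) {
        this.count += other.count;
        this.time += other.time;
    }

    // Average latency per operation; 0.0 when nothing was counted.
    double averageLatency() {
        return count == 0 ? 0.0 : (double) time / count;
    }

    // Same method shapes as Hadoop's Writable, without the dependency.
    void write(DataOutput out) throws IOException {
        out.writeLong(count);
        out.writeLong(time);
    }

    void readFields(DataInput in) throws IOException {
        count = in.readLong();
        time = in.readLong();
    }

    long getCount() {
        return count;
    }

    long getTime() {
        return time;
    }
}
```

Keeping the count next to the cumulative time is what allows the average to be derived after aggregation; a sum of times alone cannot be turned back into a per-operation figure.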

EDIT: Also changes the separator to a comma (,) for better tabular representation.

The output from a workflow changes as follows:

Before

hdfs,WRITE  590

After

hdfs,WRITE,CREATE,1,565
hdfs,WRITE,MKDIRS,7,14
hdfs,WRITE,RENAME,1,11
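
As an illustration of how the new output could be consumed, the sketch below parses lines in the "After" format and derives the average latency per operation. The class and method names are hypothetical, and the column order (user, type, operation, count, cumulative time) is an assumption inferred from the sample above and from the name CountTimeWritable, not something this PR states explicitly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Dynamometer.
class AuditReplayOutputSketch {

    // Assumed columns: user,type,operation,count,cumulativeTime
    static Map<String, Double> averageLatencies(String[] lines) {
        Map<String, Double> averages = new LinkedHashMap<>();
        for (String line : lines) {
            String[] fields = line.trim().split(",");
            if (fields.length != 5) {
                continue; // skip lines that do not match the assumed layout
            }
            long count = Long.parseLong(fields[3]);
            long time = Long.parseLong(fields[4]);
            // Key on user,type,operation so different users do not collide.
            String key = fields[0] + "," + fields[1] + "," + fields[2];
            averages.put(key, count == 0 ? 0.0 : (double) time / count);
        }
        return averages;
    }
}
```

With the three sample lines above, MKDIRS works out to 14 / 7 = 2.0 per operation, while CREATE and RENAME each have a single call, so their averages equal their totals.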

csgregorian commented 5 years ago

Fixed up, edited PR to match.

csgregorian commented 5 years ago

Fixed up 👍