pravega / pravega-benchmark

Performance benchmark tool for Pravega
Apache License 2.0

Issue 87: Add Throughput Recorder #88

Closed: maddisondavid closed 4 years ago

maddisondavid commented 4 years ago

Change log description: Adds an extra performance recorder to record throughput while the test is running.

Purpose of the change: Fixes #87

What the code does: Adds a throughput writer to the performance recorder that records the number of bytes every second. This is crucial for performing Historical Reader performance tests. At the end of the test, the throughput percentiles are reported in a human-readable format.

Also adds two new options for specifying a CSV file that records the throughput for each reporting interval the test is running (the interval itself is set with -reportingIntervalMillis):

 -readthroughputcsv <arg>            CSV file to record read throughput
 -writethroughputcsv <arg>           CSV file to record write throughput
 -reportingIntervalMillis <arg>      period (in milliseconds) in which
                                     performance will be reported

Recording Throughput: The CSVThroughput recorder records the throughput (in MiB/s) using the current TimeWindow. If no reportingIntervalMillis is specified, the default value of 5 seconds is used.
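
The per-window calculation described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the class ThroughputWindow and its methods are hypothetical names, and it assumes throughput is simply the bytes accumulated in a window divided by the window's elapsed time.

```java
// Hypothetical sketch of a per-window throughput accumulator.
// Not the PR's TimeWindow class; names and structure are assumptions.
class ThroughputWindow {
    private static final double BYTES_PER_MIB = 1024.0 * 1024.0;

    private final long startMillis;
    private long events;
    private long bytes;

    ThroughputWindow(long startMillis) {
        this.startMillis = startMillis;
    }

    // Count one event of the given size in this window.
    void record(int eventBytes) {
        events++;
        bytes += eventBytes;
    }

    // Events per second over [startMillis, endMillis]; 0 for an empty interval.
    double eventsPerSecond(long endMillis) {
        long elapsed = endMillis - startMillis;
        return elapsed <= 0 ? 0.0 : events * 1000.0 / elapsed;
    }

    // MiB per second over [startMillis, endMillis]; 0 for an empty interval.
    double mibPerSecond(long endMillis) {
        long elapsed = endMillis - startMillis;
        return elapsed <= 0 ? 0.0 : (bytes / BYTES_PER_MIB) * 1000.0 / elapsed;
    }

    public static void main(String[] args) {
        ThroughputWindow window = new ThroughputWindow(0L);
        for (int i = 0; i < 1000; i++) {
            window.record(1024); // 1000 events of 1 KiB each
        }
        System.out.printf("%.2f events/s, %.4f MiB/s%n",
                window.eventsPerSecond(1000L), window.mibPerSecond(1000L));
    }
}
```

Dividing by the measured elapsed time rather than the nominal interval keeps the rate accurate when a window runs slightly long.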

How to verify: If the -readthroughputcsv or -writethroughputcsv option is specified, the events and bytes for each reporting period are written to a CSV file (one row per second in the example below, because -reportingIntervalMillis is set to 1000):

> pravega-benchmark -controller tcp://localhost:9090 -consumers 1 -scope test -stream test1 -segments 5 -time 10 -readthroughputcsv throughput.csv -reportingIntervalMillis 1000
...
...
> cat throughput.csv
Start,End,Events,Reading Events Throughput,Bytes,Reading MiB Throughput
1578929507066,1578929508069,159641,159163.51,63856400,60.72
1578929508069,1578929509070,159106,158947.05,63642400,60.63
1578929509070,1578929510071,159947,159787.21,63978800,60.95
1578929510071,1578929511072,254130,253876.12,101652000,96.85
1578929511072,1578929512073,185718,185532.47,74287200,70.78
1578929512073,1578929513074,239951,239711.29,95980400,91.44
1578929513074,1578929514075,267084,266817.18,106833600,101.78
1578929514075,1578929515076,263968,263704.3,105587200,100.6
1578929515076,1578929516083,246124,244413.11,98449600,93.24
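
As a cross-check of the columns, the first data row above can be recomputed by hand. The sketch below is illustrative only: the class CsvRowCheck is a hypothetical name, and it assumes that Start and End are epoch milliseconds and that both throughput columns are the window totals divided by the elapsed time.

```java
// Hypothetical sketch: recompute the throughput columns of one CSV row.
// Assumes Start/End are epoch milliseconds and 1 MiB = 1024 * 1024 bytes.
class CsvRowCheck {
    // Events per second for one row.
    static double eventsPerSecond(String row) {
        String[] f = row.split(",");
        long elapsedMillis = Long.parseLong(f[1]) - Long.parseLong(f[0]);
        return Long.parseLong(f[2]) * 1000.0 / elapsedMillis;
    }

    // MiB per second for one row.
    static double mibPerSecond(String row) {
        String[] f = row.split(",");
        long elapsedMillis = Long.parseLong(f[1]) - Long.parseLong(f[0]);
        return Long.parseLong(f[4]) / (1024.0 * 1024.0) * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        // First data row from the throughput.csv output shown above.
        String row = "1578929507066,1578929508069,159641,159163.51,63856400,60.72";
        System.out.printf("events/s=%.2f MiB/s=%.2f%n",
                eventsPerSecond(row), mibPerSecond(row));
    }
}
```

The row spans 1003 ms rather than exactly 1000 ms, which is why the reported events-per-second figure is slightly lower than the raw event count.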

Signed-off-by: David Maddison david.maddison@dell.com

RaulGracia commented 4 years ago

@maddisondavid I have cloned your branch and tried this command:

bin\pravega-benchmark -controller tcp://localhost:9090 -consumers 1 -producers 1 -events 100 -size 10 -stream test1 -time 60 -readthroughputcsv read.csv -writethroughputcsv write.csv

After running this command, I see the read.csv file but not write.csv. Is this because we can only create a CSV for either writes or reads, or has this PR introduced a problem with the -writethroughputcsv option?

The same happens if I additionally use -readcsv and -writecsv. I only see the "read" latency file being created, and none of the other expected files (read and write throughput, write latency):

bin\pravega-benchmark -controller tcp://localhost:9090 -consumers 1 -producers 1 -events 100 -size 10 -stream test1 -time 60 -writethroughputcsv write.csv -readcsv read -writecsv write

Can't we have more than one CSV reporter at a time?

maddisondavid commented 4 years ago

I wasn't aware of any changes to the existing readcsv and writecsv options introduced as part of this issue; however, I will check the code again and validate. As you point out, it should be possible to use both at the same time.

maddisondavid commented 4 years ago

This seems to have been introduced in commit 169fa2bc68d62ccd7484ee053e0a4e864c3a59d3 back in April 2019: https://github.com/pravega/pravega-benchmark/blob/master/src/main/java/io/pravega/perf/PravegaPerfTest.java#L301-L307

                writeAndRead = consumerCount > 0;

                if (writeAndRead) {
                    produceStats = null;
                } else {
                    produceStats = new PerfStats("Writing", reportingInterval, messageSize, writeFile, writeThroughputFile);
                }

This means that produceStats will only be created IF we have producers and NO consumers. It seems to be a deliberate change because the same check is repeated when the PerfStats are started; however, I don't know why it was changed this way. https://github.com/pravega/pravega-benchmark/blob/master/src/main/java/io/pravega/perf/PravegaPerfTest.java#L339-L346

        public void start(long startTime) throws IOException {
            if (produceStats != null && !writeAndRead) {
                produceStats.start(startTime);
            }
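
To make the effect of that condition concrete, here is a small standalone sketch. The class StatsSelection and the helper createsProduceStats are hypothetical and not part of PravegaPerfTest. It assumes, as described above, that write stats are only created when there are producers and no consumers.

```java
// Hypothetical sketch of the selection logic described above.
// Not code from PravegaPerfTest; the helper name is an assumption.
class StatsSelection {
    // True when write stats would be created for this producer/consumer mix.
    static boolean createsProduceStats(int producerCount, int consumerCount) {
        boolean writeAndRead = consumerCount > 0; // same expression as the quoted code
        return producerCount > 0 && !writeAndRead;
    }

    public static void main(String[] args) {
        int[][] cases = {{1, 0}, {1, 1}, {0, 1}};
        for (int[] c : cases) {
            System.out.printf("producers=%d consumers=%d -> write stats created: %b%n",
                    c[0], c[1], createsProduceStats(c[0], c[1]));
        }
    }
}
```

With -producers 1 -consumers 1, as in the commands reported earlier in this thread, the helper returns false, which is consistent with write.csv not being created.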

maddisondavid commented 4 years ago

@RaulGracia I've fixed the conflicts. The change to only write the CSV files for either reads or writes was made by a previous commit (not this branch). If we want to revert it, then I think that's best handled by another PR.

In terms of the constants, if you want me to change them I will, but I don't think it would add anything (they're only used as labels).