opensearch-project / opensearch-benchmark

OpenSearch Benchmark - a community driven, open source project to run performance tests for OpenSearch
https://opensearch.org/docs/latest/benchmark/
Apache License 2.0

add relative standard deviation to aggregated test execution metrics #681

Closed OVI3D0 closed 1 month ago

OVI3D0 commented 1 month ago

Description

Adds the relative standard deviation (RSD) to aggregated test result metrics, so users can see the spread of their OSB test results across the aggregated executions. RSD is currently calculated only for the mean.

Example:

   {
    "task": "wait-until-merges-finish",
    "operation": "wait-until-merges-finish",
    "throughput": {
     "min": 74.41136845553248,
     "mean": 74.41136845553248,
     "median": 74.41136845553248,
     "max": 74.41136845553248,
     "unit": "ops/s",
     "mean_rsd": 5.7260883182218825
    },
    "latency": {
     "100_0": 12.996128527447581,
     "mean": 12.996128527447581,
     "unit": "ms",
     "mean_rsd": 5.875897232448762
    },
    "service_time": {
     "100_0": 12.996128527447581,
     "mean": 12.996128527447581,
     "unit": "ms",
     "mean_rsd": 5.875897232448762
    },
    "client_processing_time": {
     "100_0": 0.6900834268890321,
     "mean": 0.6900834268890321,
     "unit": "ms",
     "mean_rsd": 1.172341871616217
    },
    "processing_time": {
     "100_0": 13.791589008178562,
     "mean": 13.791589008178562,
     "unit": "ms",
     "mean_rsd": 5.595395107897675
    },
    "error_rate": 0.0,
    "error_rate_rsd": 0,
    "duration": 3.3562859753146768,
    "duration_rsd": 1.6848600926145232
   },
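For readers unfamiliar with the metric, the `mean_rsd` fields above can be read as the standard deviation of the per-execution means, expressed as a percentage of their overall mean. The following is a minimal sketch of that calculation, not the code from this PR: the function name `relative_std_dev` and the sample values are hypothetical, and it assumes the sample standard deviation (the description does not say whether the sample or population form is used).

```python
import statistics


def relative_std_dev(values):
    """Return the relative standard deviation of values, in percent.

    Hypothetical helper for illustration; assumes the sample standard
    deviation, which may differ from what the PR actually uses.
    """
    if len(values) < 2:
        return 0.0  # a single execution has no spread
    mean = statistics.mean(values)
    if mean == 0:
        return 0.0  # avoid dividing by zero, e.g. an error rate of 0.0 in every run
    return statistics.stdev(values) / abs(mean) * 100


# Made-up mean latencies (ms) from three executions of the same task.
latency_means = [12.0, 13.0, 14.0]
print(f"mean_rsd: {relative_std_dev(latency_means):.2f}%")
```

A low value means the executions agree closely; a high value suggests the aggregated mean should be treated with caution.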

Issues Resolved

#662

Testing

make test


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

OVI3D0 commented 1 month ago

You may want to change the 100_0 to max in the results summary, either in this or a future check-in.

Should this be done for regular test executions too? They also use 100_0. Example:

    "latency": {
     "50_0": 206.59688568115234,
     "100_0": 235.18186950683594,
     "mean": 203.5783863067627,
     "unit": "ms"
    },
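As a rough illustration of the rename being discussed, the sketch below relabels the `100_0` key as `max` in a single metric dictionary. It assumes `100_0` denotes the 100th percentile (that is, the maximum observed value); the helper name `with_max_label` is hypothetical and is not part of OSB.

```python
def with_max_label(metric):
    """Return a copy of a metric dict with the "100_0" key renamed to "max".

    Hypothetical helper; assumes "100_0" is the 100th percentile.
    """
    renamed = {}
    for key, value in metric.items():
        # Keep every other key (percentiles, mean, unit) unchanged and in order.
        renamed["max" if key == "100_0" else key] = value
    return renamed


# Values copied from the latency example above.
latency = {
    "50_0": 206.59688568115234,
    "100_0": 235.18186950683594,
    "mean": 203.5783863067627,
    "unit": "ms",
}
print(with_max_label(latency))
```

Applying the rename only when results are displayed, rather than when they are stored, would leave existing result files and any tooling that reads `100_0` unaffected.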