Open IanHoang opened 4 months ago
Hey @IanHoang I can take this issue
Some folks used the scripts I linked and asked that their features eventually be incorporated into OSB. At the time, the compare subcommand seemed like the most appropriate place for these features, because we were using those scripts to aggregate results across several test records and then compare them with other aggregated results. On second thought, doing this might couple “aggregating” to “comparing”. We might be better off creating a new subcommand called aggregate. This would keep the “aggregate” and “compare” abilities separate and make the tool more flexible.
For example, if users are ultimately comparing OS 2.3 with OS 2.4, they might be running several rounds of tests for each version. After running all the rounds of tests, they might do the following:
# Aggregate all rounds of tests related to OS 2.3
opensearch-benchmark aggregate --ids=id1,id2,id3 --output-name=os23 # Outputs to a new id called aggregate-os23
# Aggregate all rounds of tests related to OS 2.4
opensearch-benchmark aggregate --ids=id4,id5,id6 --output-name=os24 # Outputs to a new id called aggregate-os24
# Compare the aggregated results
opensearch-benchmark compare --baseline=aggregate-os23 --contender=aggregate-os24
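To make the aggregation step concrete, here is a minimal sketch of what an aggregate step might compute: the mean of each metric across several rounds. It is only an illustration. The function name `aggregate_runs`, the flat metric-name-to-value dictionaries, and the sample numbers are all hypothetical and do not reflect OSB's actual result format or implementation.

```python
from statistics import mean


def aggregate_runs(runs):
    """Average each metric across several test runs.

    `runs` is a list of dicts mapping a metric name to a numeric value
    (a hypothetical, simplified shape, not OSB's real result format).
    Only metrics present in every run are aggregated.
    """
    if not runs:
        raise ValueError("at least one run is required")
    # Keep only the metric names shared by all runs.
    shared = set(runs[0])
    for run in runs[1:]:
        shared &= set(run)
    return {name: mean(run[name] for run in runs) for name in sorted(shared)}


# Made-up results for three rounds of tests against the same version.
rounds = [
    {"throughput": 100.0, "latency_p50": 10.0},
    {"throughput": 110.0, "latency_p50": 12.0},
    {"throughput": 120.0, "latency_p50": 14.0},
]
aggregated = aggregate_runs(rounds)
print(aggregated)
```

A plain mean is just one choice. A real aggregate command might also want to report the spread across rounds or weight rounds by iteration count.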
An aggregate command is a good idea. There are several scenarios where something like this would be useful, such as carrying out runs using the same release over several days, or when combining results from a set of loadgen hosts, similar to DWG.
It will be helpful to gather ideas to incorporate into this command as folks comment on this issue.
Is your feature request related to a problem? Please describe.
At the moment, the compare subcommand compares two different test executions and displays the differences to the user. We can extend this feature to support aggregating the results across a series of tests and even converting them into a CSV format.
Describe the solution you'd like
We could utilize scripts similar to what's seen in these two scripts
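As a rough illustration of the CSV conversion mentioned above, the sketch below writes a baseline-versus-contender comparison as CSV text using Python's standard `csv` module. The function name `comparison_to_csv`, the column names, and the sample values are assumptions made for this example. They are not taken from the linked scripts or from OSB itself.

```python
import csv
import io


def comparison_to_csv(baseline, contender):
    """Render two metric dicts as CSV text with a per-metric difference.

    Both arguments map a metric name to a numeric value (a hypothetical,
    simplified shape). Only metrics present in both are written.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["metric", "baseline", "contender", "diff"])
    for name in sorted(set(baseline) & set(contender)):
        diff = contender[name] - baseline[name]
        writer.writerow([name, baseline[name], contender[name], diff])
    return buffer.getvalue()


# Made-up aggregated results for two versions.
csv_text = comparison_to_csv(
    {"throughput": 110.0, "latency_p50": 12.0},
    {"throughput": 121.0, "latency_p50": 11.0},
)
print(csv_text)
```

Returning a string keeps the sketch easy to test. Writing to a file would only require passing an open file handle to `csv.writer` instead of the `StringIO` buffer.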