This is related to the scale testing RFC. For more details, see the RFC here.
To see other experiments in this analysis, see the META issue.
In this experiment we want to address the following questions:

- Do search clients in OSB properly simulate actual clients in a client-server model?
- For situations where workers have more than one search client, does OSB still properly simulate clients in a client-server model?
During a test, the Worker Coordinator Actor provisions and coordinates a number of Worker Actors that are responsible for driving requests to the SUT. These Worker Actors are allocated a number of clients to perform steps (also known as tasks or operations in a workload). It’s worth mentioning that the number of Worker Actors is determined by the number of CPU cores or vCPUs that the host running OSB has.
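The worker/client allocation described above can be sketched as follows. This is a simplified model for intuition, not OSB's actual implementation; the function name and round-robin strategy are illustrative assumptions:

```python
import os

def allocate_clients(total_clients, num_workers=None):
    """Distribute client IDs across worker actors round-robin.

    Simplified model: OSB ties the number of Worker Actors to the
    host's CPU count, then spreads the configured clients across them.
    """
    if num_workers is None:
        # OSB derives the number of Worker Actors from the host's vCPU count.
        num_workers = os.cpu_count() or 1
    workers = [[] for _ in range(num_workers)]
    for client_id in range(total_clients):
        workers[client_id % num_workers].append(client_id)
    return workers

# e.g. 8 search clients on a 2-vCPU host -> 2 workers with 4 clients each
print([len(w) for w in allocate_clients(8, num_workers=2)])  # [4, 4]
```

This is the same arithmetic used later when reasoning about Table 3: with 2 vCPUs, `search_clients = 8` yields two workers with 4 clients each.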
The two tables listed below (Table 1: Autoscaling Group with OpenSearch Benchmark, where each EC2 instance runs a single client, and Table 2: Load Generation Host with OpenSearch Benchmark) describe two series of experiments to determine whether a single load generation host can simulate the same performance as a set of instances that each act as a single independent client.
To reduce discrepancies, we ensure that the experiments in Table 2: Load Generation Host with OpenSearch Benchmark have no more than one client assigned per worker actor. This is reflected in the number of clients always being less than or equal to the number of vCPUs. It matches how each instance in the ASG in Table 1: Autoscaling Group with OpenSearch Benchmark will only ever use one vCPU for its single client (even though each instance has 2 vCPUs).
Table 1: Determine Performance of an Autoscaling Group of N instances of OpenSearch Benchmark where search_clients = 1

Autoscaling Group with OpenSearch Benchmark:

| Round | Clients | Instance Type | Instance Count | vCPUs (total) | Memory (GB, total) |
|-------|---------|---------------|----------------|---------------|--------------------|
| 1     | 8       | c5.large      | 8              | 16            | 32                 |
| 2     | 16      | c5.large      | 16             | 32            | 64                 |
| 3     | 32      | c5.large      | 32             | 64            | 128                |
In the table above, the gradual increase in instance count of the same instance type corresponds to a gradual increase in the total number of search clients: each instance runs OSB with a single search client. When all the instances have finished running OSB, we can use a script to aggregate the service-time results across all instances in the ASG.
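The aggregation script might look like the following sketch. It assumes each ASG instance exports its per-request service times as JSON under a shared prefix; the file layout and the `service_time_ms` field are hypothetical, not OSB's native output format:

```python
import glob
import json
import statistics

def aggregate_service_times(results_glob="results/instance-*.json"):
    """Pool per-request service times from every instance's result file
    and compute summary statistics across the whole ASG."""
    samples = []
    for path in glob.glob(results_glob):
        with open(path) as f:
            samples.extend(json.load(f)["service_time_ms"])
    if not samples:
        raise ValueError("no service-time samples found")
    samples.sort()
    # Simple index-based percentiles; adequate for large sample counts.
    return {
        "count": len(samples),
        "mean": statistics.mean(samples),
        "p50": statistics.median(samples),
        "p90": samples[int(0.90 * (len(samples) - 1))],
        "p99": samples[int(0.99 * (len(samples) - 1))],
    }
```

Pooling the raw samples before computing percentiles (rather than averaging each instance's percentiles) keeps the aggregate statistics faithful to the combined request population.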
Table 2: Determine Performance of a Single Load Generation Host with OpenSearch Benchmark where search_clients = N

LG Host with OpenSearch Benchmark:

| Round | Simulated Clients (search_clients: N) | Instance Type | Instance Count | vCPUs | Memory (GB) |
|-------|---------------------------------------|---------------|----------------|-------|-------------|
| 1     | 8                                     | c5.2xlarge    | 1              | 8     | 16          |
| 2     | 16                                    | c5.4xlarge    | 1              | 16    | 32          |
| 3     | 32                                    | c5.9xlarge    | 1              | 36    | 72          |
In the table above, there is only a single load generation host; the instance size increases each round so that there is at least one vCPU available per simulated client.
After running the experiments from Tables 1 and 2, we should compare their results.
Table 3: Load Generation Host with OpenSearch Benchmark where search_clients = N & More Clients Per Worker

LG Host with OpenSearch Benchmark:

| Round | Simulated Clients (search_clients: N) | Instance Type | Instance Count | vCPUs | Memory (GB) | Clients Per Worker Actor |
|-------|---------------------------------------|---------------|----------------|-------|-------------|--------------------------|
| 1     | 8                                     | c5.large      | 1              | 2     | 16          | 4                        |
| 2     | 16                                    | c5.large      | 1              | 2     | 16          | 8                        |
| 3     | 32                                    | c5.large      | 1              | 2     | 16          | 16                       |
Knowing that worker actors can be allocated more than one client, we should also rerun the load generation host with OpenSearch Benchmark in a configuration where multiple clients are allocated to each worker actor, as seen in Table 3: Load Generation Host with OpenSearch Benchmark and More Clients Per Worker. This will confirm whether adding more clients to a worker (running on a smaller instance type with fewer CPU cores) can simulate the same performance as assigning one client per worker. In round 1 in the table above, we should expect to see two workers (since there are two vCPUs) with 4 clients each; in round 2, two workers with 8 clients each; and in round 3, two workers with 16 clients each. We can compare these results with those from Table 2 (where we tested the same client counts but kept one client per worker). If we see no degradation here, scaling investigation 2 should further stress the load generation host and help us determine the maximum number of clients allowed per worker.
Term Query

A term query is considered a fast query in the Big5 workload and can be used for our experiment.
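As a sketch, a Big5-style term query could be issued from Python as shown below. The field, value, endpoint, and index name are placeholders, not the exact operation body from the workload definition:

```python
import json
import urllib.request

# Placeholder term query in the style of the Big5 workload's "term" operation;
# the real field and value come from the workload definition.
query = {"query": {"term": {"process.name": {"value": "kernel"}}}}

def run_term_query(endpoint="http://localhost:9200", index="big5"):
    """Send the term query and return OpenSearch's server-side timing and hit count."""
    req = urllib.request.Request(
        f"{endpoint}/{index}/_search",
        data=json.dumps(query).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    # "took" is the server-side execution time in milliseconds.
    return body["took"], body["hits"]["total"]["value"]
```

An exact-match term query like this avoids scoring-heavy work, which is why it serves as a fast, stable baseline operation for the experiment.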
Metrics to Analyze
With each round of tests, we’ll be comparing the metrics — such as query throughput and service time — seen in both clients from the ASG and the load generation host. We’ll also be monitoring the resource utilization in the ASGs, load generation host, and the system-under-test. If the system-under-test shows signs of resource bottlenecks, we will scale it out and rerun the numbers to ensure that the test results are not skewed.
Why are we not using latency?
OSB’s definition of latency differs slightly from the colloquial one. In OSB, when a user specifies a target throughput with the target-throughput parameter, latency is the service time plus the time the request spends waiting in the queue. When target-throughput is not set, service time and latency are equivalent. This parameter exists for users who want to achieve a specific throughput, for example to simulate the throughput seen in their production clusters. For these experiments we will not set target-throughput, and the clients (in both the ASG and the load generation host) will send queries as fast as possible; we will therefore focus primarily on service time, which in this setup is equivalent to latency. For more information, see this article from OSB’s documentation.
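The distinction can be made concrete with a toy calculation (the timestamps are invented for illustration):

```python
# With a target-throughput schedule, a request may be "due" before the
# client is free to send it; OSB's latency includes that queueing delay.
scheduled_at = 10.00   # when the schedule says the request should start (s)
sent_at = 10.25        # when the client actually sent it (s)
completed_at = 10.40   # when the response arrived (s)

service_time = completed_at - sent_at       # 0.15 s: request/response only
latency = completed_at - scheduled_at       # 0.40 s: includes 0.25 s of queueing

# Without target-throughput there is no schedule to fall behind, so
# scheduled_at == sent_at and latency == service_time.
```

Since our clients send requests as fast as possible with no schedule, the queueing term is zero and service time is the metric to compare.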