locustio / locust

Write scalable load tests in plain Python 🚗💨

Abnormal Response Time Percentiles Reported After a Load Increase #2665

Closed: dilinade closed this issue 2 months ago

dilinade commented 6 months ago

Description

Issue Description: I've configured a custom load shape in Locust with a specific pattern. Within the first 30 seconds, load is generated by 72 users; it then spikes to approximately 136 users, nearly doubling the load, before returning to 72 users. After the spike, although the number of users returns to the initial level of 72, the response time percentiles are significantly lower than they were during the initial normal-load phase. This is unexpected: I would anticipate the response times to be similar to those in the initial phase of the normal load.

Steps to Reproduce:

1. Set up a custom load shape in Locust with the following characteristics:
   - Ramps up to 72 users within the first 30 seconds.
   - Spikes to approximately 136 users, doubling the load.
   - Returns to a stable load of 72 users after the spike.
2. Observe the response time percentiles during and after the spike.

Expected Behavior: Response time percentiles should remain consistent with the initial normal-load phase throughout the test. I can confirm that the effect does not originate from the web server application itself.

Actual Behavior: Response time percentiles are significantly lower after the spike in user count, despite the number of users returning to normal levels.

Additional Information:

Using constant_throughput(2) per user. CURRENT_RESPONSE_TIME_PERCENTILE_WINDOW = 2
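
For context, the percentile window mentioned above is a module-level setting in locust.stats; a minimal sketch of how it might be applied at the top of the locustfile, assuming it is set before the test starts:

import locust.stats

# Number of seconds of samples used for the "current" response time
# percentiles that Locust reports (the default is 10)
locust.stats.CURRENT_RESPONSE_TIME_PERCENTILE_WINDOW = 2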

[Screenshot: response time percentiles over the course of the test]

Command line

locust -f mylocustfile.py

Locustfile contents

from locust import FastHttpUser, LoadTestShape, TaskSet, constant_throughput, task

class UserTasks(TaskSet):
    @task
    def get_root(self):
        self.client.get("/")

class WebsiteUser(FastHttpUser):
    wait_time = constant_throughput(2)
    tasks = [UserTasks]

class StagesShape(LoadTestShape):
    stages = [
        # "duration" is the cumulative run time (in seconds) at which each stage ends
        {"duration": 30, "users": 72, "spawn_rate": 30},
        {"duration": 60, "users": 136, "spawn_rate": 30},
        {"duration": 90, "users": 72, "spawn_rate": 30},
    ]

    def tick(self):
        run_time = self.get_run_time()

        for stage in self.stages:
            if run_time < stage["duration"]:
                tick_data = (stage["users"], stage["spawn_rate"])
                return tick_data

        return None

Python version

3.8.10

Locust version

2.24.1

Operating system

Ubuntu

cyberw commented 6 months ago

Oh. That looks very strange.

dilinade commented 6 months ago

Thank you for the response.

There is a bit of a difference when CURRENT_RESPONSE_TIME_PERCENTILE_WINDOW is set to the default value of 10; however, response latencies still remain lower than the earlier levels.

cyberw commented 6 months ago

Ok, two things:

cyberw commented 4 months ago

@andrewbaldwin44 Did you get around to looking at the issue with the average response times? I completely forgot about this ticket...

andrewbaldwin44 commented 4 months ago

Hey, sorry, I forgot about this too. The average shown is for the whole test; we use total_avg_response_time, which is equivalent to the avg_response_time in the last row of the stats (i.e. the aggregated row). It was added in #2509. If we don't want to use the aggregate, then what should we use?

cyberw commented 4 months ago

If there's no "current_avg_response_time" available, then we may need to add one. It's been a while since I looked at the stats handling...
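
For reference, a minimal sketch of what a sliding-window average could look like if computed outside the built-in stats, using the request event hook (WINDOW_SECONDS, _recent, _record and current_avg_response_time are hypothetical names for illustration, not part of Locust's API):

import time
from collections import deque

from locust import events

WINDOW_SECONDS = 10  # hypothetical sliding-window length, in seconds

_recent = deque()  # (timestamp, response_time_ms) pairs

@events.request.add_listener
def _record(request_type, name, response_time, response_length, **kwargs):
    # Collect every response time and drop samples older than the window
    if response_time is None:
        return
    now = time.time()
    _recent.append((now, response_time))
    while _recent and _recent[0][0] < now - WINDOW_SECONDS:
        _recent.popleft()

def current_avg_response_time():
    # Average response time (ms) over the last WINDOW_SECONDS, or None if no samples
    if not _recent:
        return None
    return sum(rt for _, rt in _recent) / len(_recent)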

github-actions[bot] commented 2 months ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions[bot] commented 2 months ago

This issue was closed because it has been stalled for 10 days with no activity. This does not necessarily mean that the issue is bad, but it most likely means that nobody is willing to take the time to fix it. If you have found Locust useful, then consider contributing a fix yourself!