nioncode opened this issue 4 months ago (status: Open)
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 10 days.
Still relevant to me.
Hi @nioncode! Sorry for not replying, I must have missed it.
This could be caused by any number of issues outside of Locust's control (max connection count on the server or worker side, a load balancer throttling new connections, etc.).
If you can reproduce this in a way that rules out server-side issues and that I can run myself (like a local nginx instance or something), I'd be happy to take a look.
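For reference, a minimal reproduction sketch along those lines (assuming a stock nginx listening on localhost:8080, started separately, e.g. via Docker; host, port and user counts are placeholders, not the setup from this report):

```python
# Hedged repro sketch: a trivial locustfile against a local web server
# (e.g. a default nginx on localhost:8080, started separately).
from locust import HttpUser, task, constant

class LocalUser(HttpUser):
    host = "http://localhost:8080"
    wait_time = constant(1)  # roughly one request per user per second

    @task
    def get_root(self):
        self.client.get("/")
```

Running it twice, e.g. locust -f repro.py --headless -u 300 -r 10 --run-time 60s --csv repro_r10 and again with -r 50 --csv repro_r50, would show whether the spawn-rate effect also appears outside the original GCP setup.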
I probably can't set up such an easy-to-reproduce setup, since I run this distributed across multiple workers plus target VMs in Google Cloud Platform (without a load balancer; the VMs are accessed directly over their public IPs).
The number of connections etc. should be the same for the same number of users (right?), since each user uses its own connection.
> I probably can't set up such an easy-to-reproduce setup, since I run this distributed across multiple workers plus target VMs in Google Cloud Platform (without a load balancer; the VMs are accessed directly over their public IPs).
Without a solid way to reproduce this I can't investigate (most of the time these issues are caused by things outside of Locust's control, often by an actual performance issue in the system you are testing :)
> The number of connections etc. should be the same for the same number of users (right?), since each user uses its own connection.
Yes.
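For context, a minimal sketch (not the locustfile from this report) of what "each user uses its own connection" means in practice: every HttpUser instance gets its own self.client with its own connection pool, so the TCP/TLS handshake is paid once per user. Warming the connection in on_start is one way to keep that ramp-up cost out of the steady-state percentiles.

```python
# Hedged sketch: each HttpUser instance owns its own self.client (an HTTP
# session with its own connection pool), so connection setup happens once
# per user and the connection is then reused for that user's requests.
from locust import HttpUser, task, constant

class WarmedUser(HttpUser):
    wait_time = constant(1)

    def on_start(self):
        # The first request pays the TCP/TLS handshake; giving it a separate
        # name keeps that one-time cost out of the steady-state statistics.
        self.client.get("/", name="warmup (connection setup)")

    @task
    def get_root(self):
        self.client.get("/")
```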
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 10 days.
Prerequisites
Description
I'm running Locust in distributed mode on 4 VMs with 4 workers each, i.e. 16 workers, and store the stats history via the --csv option to generate our own graphs from it afterwards. While playing around with Locust I noticed that it seems to make a difference how fast users are spawned, i.e. when spawning users faster, the response times are higher (even for the same number of requests/s). E.g. consider these two scenarios:

* -r 10 results in a 99% response time of 22 ms for 300 users and ~220 requests/s.
* -r 50 results in a 99% response time of 58 ms for 450 users and ~220 requests/s (for 300 users there are only ~103 requests/s, and still a 99% response time of 60 ms).

If you check the attached CSV files, you can easily see that the response times are far worse for the -r 50 test run across almost every dimension, even for very low user counts / requests/s. When using -r 100 or even higher, this problem gets even worse. Am I doing something wrong, or is this expected in some way?

test_stats_history_r10.csv
test_stats_history_r50.csv
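As an aside on processing the attached files, here is a rough sketch of how the two stats-history CSVs could be compared with pandas (the column names "Name", "User Count" and "99%" are assumed from a typical Locust *_stats_history.csv header and may need adjusting):

```python
# Hedged sketch: plot the 99th-percentile response time against user count
# for the two attached stats-history files. Column names are assumed from a
# typical Locust --csv *_stats_history.csv output and may differ.
import pandas as pd
import matplotlib.pyplot as plt

for path, label in [("test_stats_history_r10.csv", "-r 10"),
                    ("test_stats_history_r50.csv", "-r 50")]:
    df = pd.read_csv(path)
    agg = df[df["Name"] == "Aggregated"]  # keep only the aggregated rows
    plt.plot(agg["User Count"], agg["99%"], label=label)

plt.xlabel("User count")
plt.ylabel("99th percentile response time (ms)")
plt.legend()
plt.show()
```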
Maybe related question: should there be any difference between the following two scenarios:
I've seen much worse results with the first option than with the second (though the first one mimics our real-world use case better). How can I find out what the problem is here? It seems to be either Locust or the network rather than our server (since both result in 5k requests/s). I have a feeling that network connections are routinely dropped and re-established, but I have no idea whether that is correct.
Command line
locust -f test.py --headless --master --expect-workers 16 -u 5000 -r 10 --run-time 60s --csv test -H https://my-host
Locustfile contents
Python version
3.12.3
Locust version
2.29.0
Operating system
Ubuntu 24.04