locustio / locust

Write scalable load tests in plain Python 🚗💨
https://locust.cloud
MIT License

Locust slows down over time #876

Closed alexandrul closed 5 years ago

alexandrul commented 6 years ago

Description of issue

I had doubts about the accuracy of the results from an internal application test: the tested application reported consistent response times, but Locust reported an ever-increasing response time. To investigate, I started running tests against plain Nginx endpoints.

Initially, the Nginx results were consistent over time, as expected. However, after moving the endpoints to HTTPS, I encountered the same behaviour as in my application tests (please check the attached screenshots).

Expected behavior

The reported response time and RPS should be consistent over time.

Actual behavior

For long-running tests (more than 12 hours), the reported response time increases and the RPS decreases accordingly.

Environment settings (for bug reports)

Steps to reproduce (for bug reports)

  1. Start a Locust test against Nginx using HTTPS
  2. Let it run for more than 12 hours

alexandrul commented 6 years ago

[Screenshots: locust_nginx_https_01_day1, locust_nginx_https_02_day1, locust_nginx_https_03_day2]

alexandrul commented 6 years ago

The certificate chain is generated internally, and for the tests I'm using the REQUESTS_CA_BUNDLE env var to specify the bundle file that includes our CA.
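
As a rough illustration of the setup described here, the following sketch checks which CA bundle `requests` would pick up from the environment. The helper name `resolve_ca_bundle` is hypothetical and not part of `requests` or Locust; it only mirrors the lookup order `requests` uses when `Session.trust_env` is left at its default of `True`.

```python
import os


def resolve_ca_bundle(environ=os.environ):
    """Return the CA bundle path taken from the environment, or None.

    Hypothetical helper: requests consults REQUESTS_CA_BUNDLE first and
    CURL_CA_BUNDLE second when Session.trust_env is True (the default).
    """
    path = environ.get("REQUESTS_CA_BUNDLE") or environ.get("CURL_CA_BUNDLE")
    if path and not os.path.exists(path):
        # Fail early instead of getting an opaque TLS error mid-test.
        raise FileNotFoundError(f"CA bundle not found: {path}")
    return path or None
```

Calling `resolve_ca_bundle()` before starting a long run is one way to confirm that the load generator process actually sees the intended bundle file.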

Nginx conf file: nginx_conf.txt

In the /api/report/types/ folder there are multiple files, some of them around 190k in order to get a response time of more than one second.

alexandrul commented 6 years ago

And the Locust file: reference_py.txt

cgoldberg commented 6 years ago

Are you 100% sure it's not your application that slows down? To rule that out, can you run a simpler test, like retrieving static content from Nginx instead?

Also, are you monitoring resources on the load generator? Do CPU, memory, disk, network, etc. look healthy?

alexandrul commented 6 years ago

@cgoldberg As I said, the submitted charts are for Nginx only, using 128 users. The target server is capable of handling more users, but I have chosen a lower RPS to start with (the initial issue was encountered during soak testing).

Moreover, the Nginx + HTTP test is fine; the issue occurs only with Nginx + HTTPS.

Both servers are doing fine, with just a few MB/s of network traffic due to the content download. The link between the servers is gigabit, so this should not be an issue.

cgoldberg commented 6 years ago

I just realized you are running on Windows...

In the Locust docs we note:

"Running Locust on Windows should work fine for developing and testing your load testing scripts. However, when running large scale tests, it’s recommended that you do that on Linux machines, since gevent’s performance under Windows is poor."

alexandrul commented 6 years ago

Please remember that the HTTP test is fine; the issue appears only after switching to HTTPS. HTTPS is handled by requests, while gevent should do exactly the same things in both cases.

cgoldberg commented 6 years ago

Sorry, I can't be of more help.

alexandrul commented 6 years ago

@cgoldberg Your help is highly appreciated; most of the issues were also discussed on the Slack channel. I'm doing my best to describe my issue in as much detail as I can. I don't expect anyone to start investigating it but, with a bit of luck, someone more experienced than me might also be affected by this issue and may provide some hints that would allow me to pinpoint the real cause.

alexandrul commented 6 years ago

An interesting behavior, always observed after ending a long-running test affected by this issue:

sonny-zhang commented 6 years ago

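@alexandrul In my own use, I also feel that the statistics may not be very accurate. You can access the server, look at the application's log information, and compute the statistics yourself over a given time period. I think the data obtained from the server logs is more accurate. I come from China. Sorry, my English is poor; I'm afraid I can't express myself clearly in English.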

alexandrul commented 6 years ago

@1fengchen1 thank you for the hint. While I can always check my application's logs and get the stats from that side, I would still need a specific RPS to be generated by Locust.

alexandrul commented 6 years ago

Is there a reasonably easy way to run just the http client part? (same test, single user, without any gevent/greenlet activity)
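
One possible way to approach this, sketched under assumptions: a plain sequential loop that times each call and averages the timings in windows, so that any drift over a long run becomes visible. The names `time_requests`, `summarize`, and `run_against` are hypothetical, and the loop is not how Locust measures response times; it only exercises `requests` with a single user and no gevent involved.

```python
import statistics
import time


def time_requests(fetch, url, count):
    """Call fetch(url) sequentially `count` times; return seconds per call."""
    timings = []
    for _ in range(count):
        start = time.perf_counter()
        fetch(url)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings, window):
    """Average timings in consecutive windows to make drift visible."""
    return [
        statistics.mean(timings[i:i + window])
        for i in range(0, len(timings), window)
    ]


def run_against(url, count=1000, window=100):
    """Time `count` GET requests with a single requests.Session (no gevent)."""
    import requests  # third-party; imported lazily so the helpers stay stdlib-only

    with requests.Session() as session:  # picks up REQUESTS_CA_BUNDLE by default
        def fetch(u):
            # The body is downloaded in full because stream=False is the default.
            session.get(u, timeout=30).raise_for_status()

        timings = time_requests(fetch, url, count)
    return summarize(timings, window)
```

Calling `run_against` with the same HTTPS URL used in the Locust file, and a `count` large enough to cover several hours, would show whether the per-window average grows even without Locust in the picture.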

cyberw commented 5 years ago

Maybe you could try the rps limiting TaskSet in https://github.com/SvenskaSpel/locust-plugins with a high number of locusts, to see if you get a different behaviour.
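
To illustrate the general idea behind rate limiting (this is not the locust-plugins API, just a minimal sketch with hypothetical names), a fixed-schedule loop starts each iteration at a planned time and counts the iterations that began late. A growing late count would mean the requested rate can no longer be sustained.

```python
import time


def run_paced(task, rate_per_second, iterations,
              clock=time.monotonic, sleep=time.sleep):
    """Run task() on a fixed schedule; return how many iterations started late."""
    interval = 1.0 / rate_per_second
    start = clock()
    late = 0
    for i in range(iterations):
        # Planned start time of this iteration, independent of earlier delays.
        target = start + i * interval
        delay = target - clock()
        if delay > 0:
            sleep(delay)
        elif i > 0:
            # The previous task overran its slot, so the rate is not being met.
            late += 1
        task()
    return late
```

The `clock` and `sleep` parameters exist only so the loop can be exercised with a fake clock; with the defaults it paces against real time.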

Closing this due to inactivity, but feel free to keep talking :)