tag1consulting / goose

Load testing framework, inspired by Locust
https://tag1.com/goose
Apache License 2.0

[Feature request] The ability to specify request per second #87

Open JordiPolo opened 4 years ago

JordiPolo commented 4 years ago

It would be useful to be able to specify load in other ways, not just as a number of users. At least in the environment I am in, we always look at response times with respect to the req/s coming into the system: something like "the 99th percentile of my response time is no larger than X ms given I receive xx req/s or less."

Tweaking the users parameter can approximate the desired req/s, but that is manual tuning and it is not portable between machines. It would be fantastic if I could pass a parameter to Goose and have it try to stay more or less around the number given.

I am not sure how to deal with big numbers: if a user asks for 100,000 req/s, does Goose start gazillions of threads trying to accommodate that? So maybe this feature is limited to lowish numbers, 50 req/s or so, at least in a first iteration.

JordiPolo commented 4 years ago

As an anecdote about the non-portability of thread counts: I ran a Goose session from my laptop against a container in the cloud and it was fine. Then I ran it with the same parameters from another container in the same network, and the system under test died. I believe the more powerful container and faster network generated much more load than the system under test could handle.

jeremyandrews commented 4 years ago

Interesting idea.

It would require that statistics be enabled, as only then do individual GooseUser threads push the necessary information to the parent thread. The parent could therefore measure req/s and start additional GooseUser threads if the number is too low, and pause GooseUser threads if the number is too high.

However, it gets quite complicated when you consider load tests with multiple GooseTaskSets. Goose starts each GooseTaskSet in the order they are defined (this was required to ensure consistent load between multiple runs with the same startup options) -- auto-generating load could lead to some GooseTaskSets not running at all. But this is also true when running manually, so in itself it isn't a regression. I've been intending to add a warning log message when a load test doesn't invoke all task sets, or starts a number of task sets that doesn't match the configured weights.

When throttling the load test (i.e., if req/s is too high) I imagine it would be done in the reverse order that clients were started -- i.e., first throttle the highest thread number, measure for a while, then throttle the next highest thread number, and so on.

Also, different task sets could impact req/s more profoundly. For example, a task set that just loads static assets would generate drastically more req/s than a task set loading dynamically generated pages. Auto-enabling/disabling a task like this could cause huge fluctuations.

Beyond that, wait_time could make req/s bursty, for example if you set a very high wait_time(60, 65).

I imagine there'd be a few options required: 1) the desired requests/sec, 2) the frequency with which we tune toward this value, and 3) the % of accuracy required -- this would allow you to tune things to work around the discrepancies discussed above.
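To make that concrete, here is a minimal sketch of such a tuning loop; the names (target_rps, tune_interval, accuracy_pct) and the logic are illustrative only, not Goose code:

    // Hypothetical sketch of the tuning loop described above; not part of Goose.
    // Every `tune_interval` the parent would compare the measured requests/sec
    // against `target_rps` and start or pause users to stay within `accuracy_pct`.
    use std::thread;
    use std::time::Duration;

    struct Tuner {
        target_rps: f64,         // 1) the desired requests/sec
        tune_interval: Duration, // 2) how often we measure and adjust
        accuracy_pct: f64,       // 3) the required accuracy, e.g. 5.0 for +/- 5%
        active_users: usize,
    }

    impl Tuner {
        fn adjust(&mut self, measured_rps: f64) {
            let tolerance = self.target_rps * self.accuracy_pct / 100.0;
            if measured_rps < self.target_rps - tolerance {
                // Too slow: start another simulated user (here just counted).
                self.active_users += 1;
            } else if measured_rps > self.target_rps + tolerance && self.active_users > 0 {
                // Too fast: pause the most recently started user first.
                self.active_users -= 1;
            }
            // Within the accuracy band: leave the user count alone.
        }
    }

    fn main() {
        let mut tuner = Tuner {
            target_rps: 50.0,
            tune_interval: Duration::from_millis(10), // shortened for the demo
            accuracy_pct: 5.0,
            active_users: 1,
        };
        // A real load test would read the measured rate from the aggregated
        // statistics; here we feed in a made-up series of measurements.
        for measured in [20.0, 35.0, 48.0, 55.0] {
            tuner.adjust(measured);
            println!("measured {measured} req/s -> {} users", tuner.active_users);
            thread::sleep(tuner.tune_interval);
        }
    }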

jeremyandrews commented 4 years ago

(BTW: I'm going to be offline for a little more than a week, so I won't be able to look into implementing this for a while.)

jeremyandrews commented 4 years ago

As a first step toward this goal, we'll add a rate limiter, see #90 and #91.

LionsAd commented 4 years ago

You can basically implement a simple token bucket filter yourself in your task function:

You need a thread for it (maybe we need to add a way to schedule the main Goose task that spawns the Tokio threads as well?), plus a writer thread that keeps refilling the bucket.

That should automatically rate limit.
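To make the idea concrete, here is a minimal standalone sketch of such a token bucket (illustrative only, not the snippet from the original comment): a writer thread drips tokens into a bounded channel at the desired rate, and each task takes a token before issuing its request.

    use std::sync::mpsc;
    use std::thread;
    use std::time::Duration;

    fn main() {
        // Illustrative token bucket, not Goose code: the bounded channel is the
        // bucket, and its capacity is the maximum burst size.
        let requests_per_second: usize = 5;
        let (tx, rx) = mpsc::sync_channel::<()>(requests_per_second);

        // Writer thread: refills the bucket at a steady rate. send() blocks
        // while the bucket is full, so tokens never accumulate past capacity.
        thread::spawn(move || loop {
            if tx.send(()).is_err() {
                break; // receiver dropped: the load test is over
            }
            thread::sleep(Duration::from_millis(1000 / requests_per_second as u64));
        });

        // Task side: take a token before each request. recv() blocks until a
        // token is available, which rate limits the requests.
        for i in 0..10 {
            rx.recv().expect("writer thread exited");
            println!("request {i} sent");
            // ... the actual goose request would go here ...
        }
    }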

JordiPolo commented 4 years ago

@jeremyandrews those new issues sound like a good next step to me

jeremyandrews commented 4 years ago

@JordiPolo #97 has landed, I'd welcome any feedback as to whether it proves helpful in solving your immediate problem. You'll need to compile the latest code from GitHub, as there's a bit more work before we're able to release 0.9.0 with this change. (Specifically, task function signatures will change to return a Result -- hopefully this will come together quickly.)

Thanks for your reviews and feedback in that PR!

JordiPolo commented 4 years ago

Thanks so much. I've just updated to master and will test it today. The only noticeable change is that goose_send returns a Result, which sounds right to me, but I think eventually task! and register_task would also need to use Result; otherwise, as in the example, I need to ignore the error, which sounds weird in Rust, I guess. Alternatively, if you do not think it provides value to users of these functions, you could ignore the error inside those functions.

JordiPolo commented 4 years ago

In general it works well, thanks so much! A few comments:

Running the same test for 4 minutes gets me an average of 18. So yeah, I think it is the time spent shutting down, during which fewer requests are made, that lowers the average.

jeremyandrews commented 4 years ago
  • I was just doing goose_send().unwrap() because I thought that possible error would not be raised, but it does happen in normal operation; at least it did every time I tried, which is surprising.

This is expected. In order to use Tokio we went with a leaky bucket queue implementation. To shut that down, we simply close the throttle channel when the load test finishes: this results in errors (which can safely be ignored, but unwrap won’t work). This means errors ALWAYS happen if you’re using the throttle.

The next step is to allow the ? operator to unwrap the response in tasks, but that PR is not yet written, so for now look at the pattern in the drupal_loadtest.rs example to see how to handle this properly.
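For reference, a plain-Rust illustration (not the Goose API) of why unwrap() panics once the throttle channel is closed at shutdown, and of the early-return pattern that handles it:

    use std::sync::mpsc;

    // The task takes a token from the throttle channel before each request.
    // Once the sender is dropped at shutdown, recv() returns Err, so calling
    // unwrap() would panic; instead we treat the error as "the test is over".
    fn task(throttle: &mpsc::Receiver<()>) {
        if throttle.recv().is_err() {
            return; // channel closed: safe to ignore and bail out
        }
        println!("issuing a request");
    }

    fn main() {
        let (tx, rx) = mpsc::channel::<()>();
        tx.send(()).unwrap();
        task(&rx); // gets a token and "issues" a request
        drop(tx);  // simulate shutdown: the throttle channel is closed
        task(&rx); // returns early instead of panicking
    }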

  • With low req/s and lots of tasks you get mostly zeros in the table. Also, I asked for 5 req/s and it prints as 4, but I suspect I am actually getting 4.98 or so and it is just truncated when printed.

Yes, the statistics need improvements to better handle rounding and truncation. You are correct as to what is happening. It’s less problematic with larger values. (#101)

  • Asking for a max of 20 req/s, the tables printed while the process is running were giving me 19 req/s (as per the above comment), but the final table with the results was giving me 17 req/s. I was running the test for only 1 minute; maybe the time it takes to wait for all threads to finish is enough to make a dent in the total req/s. There were only 1300 total requests. So there is probably not much that can be done, except maybe somehow expediting the cleanup phase.

A longer test may help, as it averages out.

More users may help (but not necessarily), as when using too few users you may not be able to generate the desired load.

Resetting statistics after all users are started may help (a runtime option), as during startup you’re not achieving maximum load.

Currently the delay is naive: it throttles for the same amount of time after each request; ideally it should subtract the time taken to process a request from the throttle delay to avoid a slow drift, as sketched below. (If you enable the statistics logs you can see this in the timestamps.) (#102)

And finally: the throttle only imposes a maximum number of requests; the above are just some of the ways the actual requests can end up below that maximum. But less than or equal to the maximum/throttle is our goal, so this suggests it's working as intended.
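As a rough sketch of the drift fix mentioned above (illustrative, not Goose code), the per-request delay would subtract the time the request itself took:

    use std::thread;
    use std::time::{Duration, Instant};

    // Sleep only for whatever is left of the per-request interval after the
    // request has been processed, so the effective rate does not drift below
    // the target. A naive throttle would always sleep the full interval.
    fn throttled_delay(interval: Duration, started: Instant) {
        if let Some(remaining) = interval.checked_sub(started.elapsed()) {
            thread::sleep(remaining);
        }
    }

    fn main() {
        let interval = Duration::from_millis(200); // 5 req/s target
        for i in 0..3 {
            let started = Instant::now();
            thread::sleep(Duration::from_millis(50)); // stand-in for the request
            throttled_delay(interval, started);
            println!("request {i} finished after {:?}", started.elapsed());
        }
    }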

jeremyandrews commented 4 years ago

The only noticeable change is that goose_send returns a Result, which sounds right to me, but I think eventually task! and register_task would also need to use Result; otherwise, as in the example, I need to ignore the error, which sounds weird in Rust, I guess.

Correct. Making task functions return a Result is the last big thing to complete before we can tag a 0.9.0 release. The interim solution is what you'll find if you review examples/drupal_loadtest.rs. (See how we return early; the goal is instead to just use ?, which will simplify that logic.)
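Roughly, the difference looks like this (the names are illustrative, not the final Goose signatures): once task functions return a Result, the early-return boilerplate collapses into a single ?.

    // Illustrative only; the actual Goose task signatures may differ.
    #[derive(Debug)]
    struct RequestError;

    // Stand-in for a throttled request helper that can fail at shutdown.
    fn send_request() -> Result<(), RequestError> {
        Ok(())
    }

    // Interim pattern: check the Result and return early on error.
    fn task_early_return() {
        if send_request().is_err() {
            return;
        }
        // ... work with the response ...
    }

    // Planned pattern: the task itself returns a Result, so ? propagates errors.
    fn task_with_question_mark() -> Result<(), RequestError> {
        send_request()?;
        // ... work with the response ...
        Ok(())
    }

    fn main() {
        task_early_return();
        let _ = task_with_question_mark();
    }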

JordiPolo commented 4 years ago

Thanks for the explanations. My only concern is that the plan then seems to be that task functions will always return Err when using throttling, while they will not when throttling is not used.

Do not get me wrong, this does solve my use case and I do not mind ignoring the error, but there is nothing I should do with this error, so maybe it is best if I do not know about it.

jeremyandrews commented 4 years ago

Yes, understood. Consider it a work in progress. It's going to become easier to simply ignore this error (and future ones) soon, and more error cases will be added.

JordiPolo commented 4 years ago

If there is any testing I can do, please let me know.