tsenart / vegeta

HTTP load testing tool and library. It's over 9000!
http://godoc.org/github.com/tsenart/vegeta/lib
MIT License

Variable Rate Pacing (concrete proposal + initial implementation) #483

Open blakewatters opened 4 years ago

blakewatters commented 4 years ago

Proposal

This issue proposes a new Pacer implementation that varies the load according to a sequence of target rate + duration pairs. To deliver such an experience from the Vegeta CLI, it also raises the question of how to expose Pacer configuration as command-line arguments.

From a product perspective, what we would like is to:

  1. Select a Pacer from the CLI
  2. Express Pacer configuration via a flag
  3. Support configuration of the Pacer via CSV and JSON files in order to support dynamic or recorded workloads

Just spitballing here to get the ball rolling, but here are some Vegeta incantations I can imagine:

# Configure the pace directly on the CLI
$ echo "GET https://github.com/tsenart/vegeta" | vegeta attack --pacer variable --pace "30s@50, 1m@200, 2m@1000, 30s@100, 2h@25"

# Configure the pace from a CSV file
$ cat pace.csv
10, 5s
20, 5s
30, 10s
$ echo "GET https://github.com/tsenart/vegeta" | vegeta attack --pacer variable --pace-file pace.csv

# Configure from a JSON file
$ cat pace.json
[
  { "rate": 10, "duration": "5s"},
  { "rate": 20, "duration": "5s"},
  { "rate": 30, "duration": "10s"}
]
$ echo "GET https://github.com/tsenart/vegeta" | vegeta attack --pacer variable --pace-file pace.json

There is a proof-of-concept implementation that uses Vegeta as a library at opsani/vegeta-varload, but this is literally the first Go code I have ever written, so please be gentle in your review. :-)

The implementation currently produces the following output:

$ go run vegeta-varload.go https://golang.org/
🚀  Start variable load test against https://golang.org/ with 6 load profiles for 44 total seconds
💥  Attacking at rate of 10 req/sec for 5s
Requests      [total, rate, throughput]         59, 9.95, 9.83
Duration      [total, attack, wait]             6s, 5.928s, 71.249ms
Latencies     [min, mean, 50, 90, 95, 99, max]  67.972ms, 209.784ms, 233.545ms, 241.111ms, 244.046ms, 245.575ms, 245.59ms
Bytes In      [total, mean]                     653189, 11071.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:59
Error Set:
💥  Attacking at rate of 20 req/sec for 5s (6.031s elapsed)
Requests      [total, rate, throughput]         125, 25.43, 25.07
Duration      [total, attack, wait]             4.986s, 4.916s, 69.432ms
Latencies     [min, mean, 50, 90, 95, 99, max]  67.166ms, 99.857ms, 78.29ms, 150.344ms, 153.171ms, 156.045ms, 162.767ms
Bytes In      [total, mean]                     1383875, 11071.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:125
Error Set:
💥  Attacking at rate of 30 req/sec for 10s (11.051s elapsed)
Requests      [total, rate, throughput]         325, 32.72, 32.46
Duration      [total, attack, wait]             10.013s, 9.933s, 80.493ms
Latencies     [min, mean, 50, 90, 95, 99, max]  67.155ms, 198.684ms, 191.786ms, 330.651ms, 447.01ms, 797.876ms, 831.366ms
Bytes In      [total, mean]                     3598075, 11071.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:325
Error Set:
💥  Attacking at rate of 40 req/sec for 10s (21.087s elapsed)
Requests      [total, rate, throughput]         436, 44.44, 44.13
Duration      [total, attack, wait]             9.881s, 9.811s, 69.249ms
Latencies     [min, mean, 50, 90, 95, 99, max]  68.291ms, 118.652ms, 113.567ms, 197.586ms, 199.078ms, 204.052ms, 205.343ms
Bytes In      [total, mean]                     4826956, 11071.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:436
Error Set:
💥  Attacking at rate of 50 req/sec for 5s (31.001s elapsed)
Requests      [total, rate, throughput]         204, 41.20, 40.54
Duration      [total, attack, wait]             5.032s, 4.952s, 80.423ms
Latencies     [min, mean, 50, 90, 95, 99, max]  67.98ms, 86.07ms, 82.515ms, 109.16ms, 113.397ms, 121.839ms, 133.459ms
Bytes In      [total, mean]                     2258484, 11071.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:204
Error Set:
💥  Attacking at rate of 60 req/sec for 8s (36.057s elapsed)
Requests      [total, rate, throughput]         510, 63.46, 62.91
Duration      [total, attack, wait]             8.107s, 8.037s, 70.542ms
Latencies     [min, mean, 50, 90, 95, 99, max]  68.608ms, 123.846ms, 120.376ms, 183.906ms, 187.68ms, 192.614ms, 275.022ms
Bytes In      [total, mean]                     5646210, 11071.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:510
Error Set:
✨  Attack completed in 44.164s
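
To make the core idea concrete, here is a condensed, illustrative sketch of a step-based pacer written against the documented Pacer contract (Pace returns the wait until the next hit and whether to stop). This is not the PoC code, and the exact interface may differ between Vegeta versions (newer ones also expect a Rate method):

```go
package varload

import "time"

// Step pairs a target rate (hits per second) with how long to sustain it.
type Step struct {
	Rate     float64
	Duration time.Duration
}

// StepPacer drives an attack through a sequence of constant-rate steps
// and stops once every step has elapsed.
type StepPacer struct {
	Steps []Step
}

// Rate returns the instantaneous hit rate at the given elapsed time.
func (p StepPacer) Rate(elapsed time.Duration) float64 {
	var end time.Duration
	for _, s := range p.Steps {
		end += s.Duration
		if elapsed < end {
			return s.Rate
		}
	}
	return 0
}

// expectedHits integrates the step rates from 0 to elapsed.
func (p StepPacer) expectedHits(elapsed time.Duration) float64 {
	var total float64
	var start time.Duration
	for _, s := range p.Steps {
		if elapsed <= start {
			break
		}
		d := s.Duration
		if elapsed < start+d {
			d = elapsed - start
		}
		total += s.Rate * d.Seconds()
		start += s.Duration
	}
	return total
}

// Pace reports how long to wait before the next hit and whether to stop.
func (p StepPacer) Pace(elapsed time.Duration, hits uint64) (time.Duration, bool) {
	var total time.Duration
	for _, s := range p.Steps {
		total += s.Duration
	}
	if elapsed >= total {
		return 0, true // all steps exhausted: stop the attack
	}
	if float64(hits) < p.expectedHits(elapsed) {
		return 0, false // behind schedule: hit immediately
	}
	rate := p.Rate(elapsed)
	if rate <= 0 {
		return 10 * time.Millisecond, false // idle step: poll until the next one
	}
	return time.Duration(float64(time.Second) / rate), false
}
```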

Background

The driving use case is testing autoscaling workloads, where step-function variation of the load is of particular interest. For example, within a given attack we may wish to oscillate the load at fixed intervals between thresholds such as 1000 rpm for 10 minutes, 2000 rpm for 5 minutes, 3000 rpm for 15 minutes, 200 rpm for 2 hours, and 50 rpm for 30 minutes, in order to describe an autoscaling workload and observe resource utilization and responsiveness across these intervals.

Workarounds

We have looked at the load ramping script, and scripting the Vegeta CLI is by no means off the table, but the recent inclusion of the Pacer interface seems to indicate that there may be appetite for direct support within Vegeta.

blakewatters commented 4 years ago

On a related API note, I had to resort to globals in order to track state. It would be nice if Pace were called on a pointer receiver instead of a value receiver so I could keep the state in the struct. Maybe there is a better path, but I felt this pain, so I'm throwing it out there.

I suspect that most folks who reach for Vegeta as a library will have squirrely stateful requirements and want to hang state on the pacer, tag requests/responses with context, handle multiple attacks/metrics, etc., so small API affordances can really help.

tsenart commented 4 years ago

I like your proposed CLI interface, but I'm not sure the text report makes total sense for a variable rate of attack, so I'm hesitant to pursue this.

blakewatters commented 4 years ago

In playing with this a bit more, I see a few interesting CLI interfaces. My original proposal specified the exact steps but was crucially lacking a slope to define how the rate evolves across the steps. There is another interesting mode where you define only the array of rates, a slope, and the overall duration, and we just fit a curve to them.
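
As one illustration of the slope idea (purely a sketch, not an existing Vegeta type), a linear ramp can be expressed as a pacer whose instantaneous rate grows with elapsed time and whose expected hit count is the integral of that rate:

```go
package varload

import "time"

// RampPacer increases the hit rate linearly from StartRate by Slope
// hits/sec per second, stopping once Total has elapsed.
type RampPacer struct {
	StartRate float64       // hits per second at t = 0
	Slope     float64       // change in hits per second, per second
	Total     time.Duration // overall attack duration
}

// Rate is the instantaneous hit rate at the given elapsed time.
func (p RampPacer) Rate(elapsed time.Duration) float64 {
	return p.StartRate + p.Slope*elapsed.Seconds()
}

// Pace reports how long to wait before the next hit and whether to stop.
func (p RampPacer) Pace(elapsed time.Duration, hits uint64) (time.Duration, bool) {
	if elapsed >= p.Total {
		return 0, true
	}
	t := elapsed.Seconds()
	// Expected hits so far = integral of the rate from 0 to t.
	expected := p.StartRate*t + 0.5*p.Slope*t*t
	if float64(hits) < expected {
		return 0, false // behind schedule: hit immediately
	}
	rate := p.Rate(elapsed)
	if rate <= 0 {
		return 10 * time.Millisecond, false // not yet ramped up: poll
	}
	return time.Duration(float64(time.Second) / rate), false
}
```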

The text report was just an interface for testing out the pacer; I don't care about it.

Even if you just want to close this issue out, I do think that passing Pace by reference is a cheap change that will enable interesting implementations.

tsenart commented 4 years ago

> Even if you just want to close this issue out, I do think that passing Pace by reference is a cheap change that will enable interesting implementations.

Can you explain what you mean? Attack takes a Pacer interface, so you can already pass in a pointer to a type that implements it.
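
For example, something along these lines keeps state on the pacer itself without globals, since Attack accepts any value (including a pointer) that satisfies the Pacer interface. This is a sketch only; the import path and the exact Pacer method set may vary between versions:

```go
package main

import (
	"fmt"
	"time"

	vegeta "github.com/tsenart/vegeta/lib"
)

// statefulPacer keeps mutable state across Pace calls. Because its
// methods use a pointer receiver, a *statefulPacer satisfies the
// Pacer interface and can mutate its own fields.
type statefulPacer struct {
	calls uint64 // per-attack state, updated on every Pace call
}

func (p *statefulPacer) Pace(elapsed time.Duration, hits uint64) (time.Duration, bool) {
	p.calls++ // no globals needed
	if elapsed >= 10*time.Second {
		return 0, true
	}
	return 100 * time.Millisecond, false // roughly 10 hits/sec
}

func (p *statefulPacer) Rate(elapsed time.Duration) float64 { return 10 }

func main() {
	targeter := vegeta.NewStaticTargeter(vegeta.Target{
		Method: "GET",
		URL:    "https://golang.org/",
	})
	attacker := vegeta.NewAttacker()
	pacer := &statefulPacer{} // pointer, so Pace can update state

	var metrics vegeta.Metrics
	for res := range attacker.Attack(targeter, pacer, 10*time.Second, "stateful") {
		metrics.Add(res)
	}
	metrics.Close()
	fmt.Printf("success ratio: %.2f%%, Pace calls: %d\n", metrics.Success*100, pacer.calls)
}
```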

nfsp3k commented 4 years ago

I love this proposal. Other load generators such as JMeter and Locust already support rate pacing; however, none of them guarantees the rate I want, and the actual rate is affected by the performance of the system under test. I hope to see this proposal merged into the main branch.