tsenart / vegeta

HTTP load testing tool and library. It's over 9000!
http://godoc.org/github.com/tsenart/vegeta/lib
MIT License

How to understand the `Latencies` in the report? #278

Closed. cj1128 closed this issue 5 years ago.

cj1128 commented 6 years ago
  1. What version of the project are you using?

    go get from github repo, commit hash: f30b7da.

  2. What operating system and processor architecture are you using?

    macOS 10.13.3.

  3. What did you do?

    Set up a simple server and used vegeta to load test it:

    package main
    
    import "net/http"
    
    func main() {
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            w.Write([]byte("hello world"))
        })
        http.ListenAndServe(":8080", nil)
    }

    test command:

    echo "GET http://localhost:8080" | vegeta attack -rate RATE -duration=5s | vegeta report
  4. What did you expect to see?

    The higher the rate, the higher the latencies.

  5. What did you see instead?

    when the rate is 100:

    Latencies     [mean, 50, 95, 99, max]  1.6292ms, 1.588029ms, 2.887475ms, 3.105071ms, 3.886444ms

    when the rate is 1000:

    Latencies     [mean, 50, 95, 99, max]  431.97µs, 418.454µs, 586.022µs, 774.319µs, 4.518077ms

I think latency means the time interval between a request being sent and its response being received, so when the request rate is higher, the server has more work to do and the latencies should be higher. But the results show the opposite.

So, how should I understand these latencies?

tsenart commented 6 years ago

I think latency means the time interval between a request being sent and its response being received, so when the request rate is higher, the server has more work to do and the latencies should be higher. But the results show the opposite.

Oh, the wonderful world of micro-performance analysis :-) I can reproduce your results, and I'm led to believe there's more to this than meets the eye. Without any proof yet, I have the feeling we're observing some sort of caching phenomenon that influences these results.

To be continued... I don't have a good answer to this yet.

tsenart commented 6 years ago

This happens on Linux too. Raw results files saved and uploaded here for future debugging. results.tar.gz

$ echo "GET http://localhost:8080" | vegeta attack -rate=100 -duration=5s | tee 100.bin | vegeta report
Requests      [total, rate]            500, 100.20
Duration      [total, attack, wait]    4.99074455s, 4.989999756s, 744.794µs
Latencies     [mean, 50, 95, 99, max]  591.146µs, 586.039µs, 738.259µs, 960.467µs, 3.644614ms
Bytes In      [total, mean]            5500, 11.00
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:500  
Error Set:

$ echo "GET http://localhost:8080" | vegeta attack -rate=1000 -duration=5s | tee 1000.bin | vegeta report
Requests      [total, rate]            5000, 1000.20
Duration      [total, attack, wait]    4.999451257s, 4.998999759s, 451.498µs
Latencies     [mean, 50, 95, 99, max]  422.181µs, 419.69µs, 490.308µs, 543.746µs, 5.437834ms
Bytes In      [total, mean]            55000, 11.00
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:5000  
Error Set:

$ echo "GET http://localhost:8080" | vegeta attack -rate=5000 -duration=5s | tee 5000.bin | vegeta report
Requests      [total, rate]            25000, 5000.20
Duration      [total, attack, wait]    5.000013276s, 4.999799679s, 213.597µs
Latencies     [mean, 50, 95, 99, max]  217.194µs, 207.483µs, 236.087µs, 544.931µs, 6.657334ms
Bytes In      [total, mean]            275000, 11.00
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:25000  
Error Set:

$ echo "GET http://localhost:8080" | vegeta attack -rate=10000 -duration=5s | tee 10000.bin | vegeta report
Requests      [total, rate]            50000, 10000.20
Duration      [total, attack, wait]    5.000069802s, 4.999899817s, 169.985µs
Latencies     [mean, 50, 95, 99, max]  187.518µs, 168.72µs, 223.909µs, 788.912µs, 6.745903ms
Bytes In      [total, mean]            550000, 11.00
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:50000  
Error Set:

$ uname -a
Linux 4.16.10-1-ARCH #1 SMP PREEMPT Mon May 21 07:57:23 UTC 2018 x86_64 GNU/Linux

$ vegeta -version
Version: 7.0.3
Commit: f5e8a53b900b1509b8719aa94c30d4cf087a77e9
Runtime: go1.10.2 linux/amd64
Date: 2018-05-23T15:09:17Z

tsenart commented 6 years ago

cc @omidaladini

tsenart commented 6 years ago

I've modified the plot reporter to overlay multiple attacks' results with a relative time offset. The visualisation matches the text report. Download the archive for interactive analysis in the browser. plot.tar.gz

[image: vegeta-plot]

seebs commented 6 years ago

So, CPU frequency scaling. Hypothesis: if the CPU is "bored" (in a lower-power state, running at a lower clock speed), it won't necessarily ramp up to a higher speed immediately upon getting work. If it's comfortably handling all incoming requests and still ending up with idle time, it may well stay in a lower-power state. It would be nice to graph this against CPU frequency values, but my guess is that the CPU spins up once it gets close to saturation, which means it processes individual requests faster when it's busy.

dgryski commented 6 years ago

So disabling CPU throttling and rerunning the test should give different answers.
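One way to try this on Linux, assuming the kernel exposes the cpufreq sysfs interface (paths and governor names vary by driver; requires root):

```shell
# Pin every core's frequency governor to "performance" so the clock
# does not ramp down between requests.
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
  echo performance | sudo tee "$g"
done

# Watch the actual clock (in kHz) while the attack runs:
grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq
```

If latencies at low rates stop being worse than at high rates after pinning the governor, that would point at frequency scaling rather than caching.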

tsenart commented 5 years ago

@fate-lovely: The latest experiments in https://gist.github.com/tsenart/e90f11a47b84d8073b5254176f23e7f1 indicate that this is related to dynamic CPU clock frequency scaling, CPU caching effects, or a combination of both.

Load testing your server locally will expose these effects whenever your handlers do very little work per request.

cj1128 commented 5 years ago

@tsenart This makes sense, thank you!

seebs commented 5 years ago

FWIW, I did some more testing with a variant: one program with long sleeps, and a second, unrelated program running a tight CPU-bound loop. The first program's reported latencies dropped sharply when the second program started, suggesting CPU frequency scaling rather than caching.