dom96 / httpbeast

A highly performant, multi-threaded HTTP 1.1 server written in Nim.
MIT License

Manually form the response string #63

Closed: ire4ever1190 closed this issue 2 years ago

ire4ever1190 commented 2 years ago

I used hottie to profile the example from the readme (compiled with nim c --cc:clang --debugger:native -d:release --passL:"-no-pie" beast) and got this output:

Running objdump...
Starting 1 threads
 samples           time   percent what
    2470     1978.944ms   88.435% 
      26       20.831ms    0.931% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/sysstr.nim:154
      17       13.620ms    0.609% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/gc_common.nim:312
      16       12.819ms    0.573% /home/me/.choosenim/toolchains/nim-1.6.0/lib/pure/strutils.nim:2673
      16       12.819ms    0.573% /home/me/.nimble/pkgs/httpbeast-0.3.0/httpbeast/parser.nim:119
      14       11.217ms    0.501% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/sysstr.nim:172
      14       11.217ms    0.501% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/comparisons.nim:179
      12        9.614ms    0.430% /home/me/.nimble/pkgs/httpbeast-0.3.0/httpbeast.nim:235
      11        8.813ms    0.394% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/gc.nim:293
      11        8.813ms    0.394% /home/me/.choosenim/toolchains/nim-1.6.0/lib/pure/strutils.nim:2726
       8        6.410ms    0.286% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/sysstr.nim:173
       8        6.410ms    0.286% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/gc_common.nim:320
       7        5.608ms    0.251% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/gc.nim:759
       6        4.807ms    0.215% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/gc.nim:307
       6        4.807ms    0.215% /home/me/.nimble/pkgs/httpbeast-0.3.0/httpbeast/parser.nim:120
       5        4.006ms    0.179% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/sysstr.nim:206
       5        4.006ms    0.179% /home/me/.nimble/pkgs/httpbeast-0.3.0/httpbeast.nim:384
       4        3.205ms    0.143% /home/me/.choosenim/toolchains/nim-1.6.0/lib/pure/strutils.nim:2674
       4        3.205ms    0.143% /home/me/.nimble/pkgs/httpbeast-0.3.0/httpbeast/parser.nim:8
       4        3.205ms    0.143% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/alloc.nim:779
       3        2.404ms    0.107% /home/me/.nimble/pkgs/httpbeast-0.3.0/httpbeast.nim:234
       3        2.404ms    0.107% /home/me/.choosenim/toolchains/nim-1.6.0/lib/pure/strutils.nim:2672
       3        2.404ms    0.107% /home/me/.choosenim/toolchains/nim-1.6.0/lib/pure/ioselects/ioselectors_epoll.nim:501
       3        2.404ms    0.107% /home/me/.choosenim/toolchains/nim-1.6.0/lib/pure/ioselects/ioselectors_epoll.nim:161
       3        2.404ms    0.107% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/alloc.nim:747
       3        2.404ms    0.107% /tmp/httpbeast.nim:279
       2        1.602ms    0.072% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/gc.nim:297
       2        1.602ms    0.072% /home/me/.choosenim/toolchains/nim-1.6.0/lib/system/sysstr.nim:227
       2        1.602ms    0.072% /home/me/.choosenim/toolchains/nim-1.6.0/lib/pure/ioselects/ioselectors_epoll.nim:488
       2        1.602ms    0.072% /home/me/.choosenim/toolchains/nim-1.6.0/lib/pure/strutils.nim:2727
Samples per second: 1248.1 totalTime: 2.238ms

What stood out were the strutils lines, which I believe come from the part of httpbeast that builds the response string with the % operator. I used this benchmark to see whether building the response manually makes a difference:

import benchy
import strutils

const serverInfo = "HttpBeast"
let 
    code = 200
    body = "hello world"
    serverDate = "02-20-2020"
    otherHeaders = ""

const reps = 100_000

timeIt "% formatting":
    for i in 0..reps:
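        # Parenthesize the format string: `%` binds tighter than `&`, so without
        # the parentheses the first "$#" would never be substituted.
        let text = ("HTTP/1.1 $#\c\L" & "Content-Length: $#\c\LServer: $#\c\LDate: $#$#\c\L\c\L$#") % [$code, $body.len, serverInfo, serverDate, otherHeaders, body]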
        keep text

timeIt "concating":
    for i in 0..reps:
        var text = ""
        text &= "HTTP/1.1 "
        text &= $code
        text &= "\c\LContent-Length: "
        text &= $body.len
        text &= "\c\LServer: " & serverInfo
        text &= "\c\LDate: "
        text &= serverDate
        text &= otherHeaders
        text &= "\c\L\c\L"
        text &= body
        keep text

We get:

name ............................... min time      avg time    std dv   runs
% formatting ...................... 62.264 ms     62.737 ms    ±0.330    x80
concating ......................... 23.907 ms     24.480 ms    ±0.059   x205
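
To put those two rows in proportion, the ratio of the average times can be worked out directly. This is only a small sketch using the numbers from the table above; the constant names are made up for illustration.

```nim
# Average times copied from the benchy table above, in milliseconds.
const
  avgPercentMs = 62.737  # "% formatting" row
  avgConcatMs = 24.480   # "concating" row

# Both loops run the same number of iterations, so the ratio of the
# averages is also the ratio of the cost per constructed response.
let speedup = avgPercentMs / avgConcatMs
let timeSaved = 1.0 - avgConcatMs / avgPercentMs

echo "speedup: ", speedup        # roughly 2.56x
echo "time saved: ", timeSaved   # roughly 0.61, i.e. about 61%
```

This only measures string construction in isolation, so the effect on a full request/response cycle should be expected to be much smaller.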

So quite an improvement. Now to benchmark actual HTTP performance:

compile command: nim c -d:release beast.nim
nim file: just the file from the readme
wrk command: wrk --threads=1 http://127.0.0.1:8080 -d 10

(The run shown is the one closest to the average of 4 runs.)

Before

Running 10s test @ http://127.0.0.1:8080
  1 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    69.30us   30.43us   1.14ms   92.87%
    Req/Sec   136.87k    18.53k  150.67k    94.00%
  1359811 requests in 10.00s, 137.46MB read
Requests/sec: 135971.47
Transfer/sec:     13.75MB

After

Running 10s test @ http://127.0.0.1:8080
  1 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    58.56us   27.79us   2.24ms   95.24%
    Req/Sec   149.61k    21.83k  157.42k    92.00%
  1488251 requests in 10.00s, 150.45MB read
Requests/sec: 148819.33
Transfer/sec:     15.04MB

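Seems like a nice performance boost: about 9% more requests per second (135971 to 148819) and lower average latency (69.30us to 58.56us).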
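
Since the change replaces one way of building the response with another, it is worth checking that both produce the same bytes. Below is a minimal sketch of such a check, assuming the same header layout as in the benchmark above. The proc names viaFormat and viaConcat are hypothetical, and the newStringOfCap pre-sizing is an extra idea that was not measured in this issue.

```nim
import std/strutils

const serverInfo = "HttpBeast"

proc viaFormat(code: int, body, date, otherHeaders: string): string =
  # `%` binds tighter than `&`, so the two literals are parenthesized
  # to form a single format string before substitution.
  result = ("HTTP/1.1 $#\c\L" &
            "Content-Length: $#\c\LServer: $#\c\LDate: $#$#\c\L\c\L$#") %
    [$code, $body.len, serverInfo, date, otherHeaders, body]

proc viaConcat(code: int, body, date, otherHeaders: string): string =
  # Assumption: reserving capacity up front may save a few reallocations.
  # The 64 is a rough guess for the fixed header text, not a tuned value.
  result = newStringOfCap(64 + date.len + otherHeaders.len + body.len)
  result &= "HTTP/1.1 "
  result &= $code
  result &= "\c\LContent-Length: "
  result &= $body.len
  result &= "\c\LServer: " & serverInfo
  result &= "\c\LDate: "
  result &= date
  result &= otherHeaders
  result &= "\c\L\c\L"
  result &= body

when isMainModule:
  # Same inputs as the benchmark above.
  let formatted = viaFormat(200, "hello world", "02-20-2020", "")
  let concatenated = viaConcat(200, "hello world", "02-20-2020", "")
  doAssert formatted == concatenated
  echo "responses identical: ", formatted == concatenated
```

The comparison is only meaningful with the parenthesized format string; without the parentheses the `%` version would leave the first "$#" in the status line untouched.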

dom96 commented 2 years ago

Very epic. Benchmarked on my machine: it improves QPS from 870k to 1.2 million!