add cowboy and elli

ghost commented 6 years ago

cowboy 2.0-pre seems to be performing worse than both plug and phoenix (they use cowboy 1.1) on my laptop, something's not right. Maybe it's a bit too early to use it.

ghost commented 6 years ago

@OvermindDL1 sorry to bother you, but can you maybe look into it? I'm not particularly familiar with cowboy. I'm using the same ranch options that plug is using, max_connections: 16_384, acceptors: 100.

ghost commented 6 years ago

So I've tried using cowboy 1.1 instead of 2.0-pre and testing with wrk yields better results, similar to those from testing plug. Can't make benchmark.cr work though.

OvermindDL1 commented 6 years ago

Hmm, I can't even get your cowboy test to run.

Testing it manually, and getting permission denied?

You are not setting the ./elixir/cowboy/bin/server_elixir_cowboy and ./elixir/elli/bin/server_elixir_elli as executable, how'd you pull that off? ^.^;

Fixed that, running it now, and the cowboy test is left running, need to close it out properly as it is currently not closing. Elli is doing the same thing, it is not being closed out either.

Results:

Language (Runtime)        Framework (Middleware)          Max [sec]       Min [sec]       Ave [sec]
------------------------- ------------------------- --------------- --------------- ---------------
elixir                    plug                             6.787050        6.494761        6.615891
elixir                    phoenix                          7.771644        7.502440        7.663781
elixir                    cowboy                          10.240735        8.533095        9.205473
elixir                    elli                             6.273602        5.790193        5.976398

Definitely need to close out both elli and cowboy properly, they are left running.

And I see the PR just updated, repulling. ^.^ Cowboy now hits:

elixir                    cowboy                           7.493882        5.927397        6.792573

Still not as good as Plug yet, which is odd since Plug is built on top of Cowboy, hmm.

I'd say post this to the ElixirForums to have others dissect it as well, my time is short at the moment sadly... :-(

And yep, wrk is a much better tester (siege maybe even more so?) than this crystal-based client, the crystal one is not saturating nearly as much as a good tester could. ^.^;

ghost commented 6 years ago

> You are not setting the ./elixir/cowboy/bin/server_elixir_cowboy and ./elixir/elli/bin/server_elixir_elli as executable, how'd you pull that off? ^.^;

oh, don't know, sorry

> the cowboy test is left running, need to close it out properly as it is currently not closing. Elli is doing the same thing, it is not being closed out either.

i didn't put them into a supervision tree, will fix that now, sorry again

> And I see the PR just updated, repulling. ^.^ Cowboy now hits:

that's cowboy 1.1

OvermindDL1 commented 6 years ago

> oh, don't know, sorry

Heh, just set the attribute on the file and git will grab it fine. :-) I'm curious how it was running for you though, it was not executable.
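For reference, git stores the executable bit as part of a file's mode, so setting it locally and committing is enough for everyone who pulls. A minimal sketch, assuming it is run from the repository root; `mark_executable` is a hypothetical helper name and not something that exists in the repository:

```shell
#!/bin/sh
# Sketch: make launcher scripts executable so the benchmark runner can exec them.
# mark_executable is a hypothetical helper, not part of the repository.
mark_executable() {
    for f in "$@"; do
        chmod +x "$f" || return 1
    done
}

# Intended usage from the repository root (paths as quoted in this thread):
#   mark_executable elixir/cowboy/bin/server_elixir_cowboy elixir/elli/bin/server_elixir_elli
#   git add elixir/cowboy/bin/server_elixir_cowboy elixir/elli/bin/server_elixir_elli
# For a file that is already tracked, the mode can also be set in the index directly:
#   git update-index --chmod=+x elixir/cowboy/bin/server_elixir_cowboy
```

After committing, `git ls-files -s <path>` should list the file with mode 100755 rather than 100644.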

> i didn't put them into a supervision tree, will fix that now, sorry again

Ah cool cool.

> that's cowboy 1.1

Yep, I'm curious why cowboy 1.1 is faster than 2.0, 2.0 should have a lot of enhancements, hence if it is still slow when you finish adding it in (I see you adding both cowboy1 and cowboy2) then I'd post it on their issue tracker and the elixir forums both.

ghost commented 6 years ago

> i didn't put them into a supervision tree, will fix that now

that didn't help

> hence if it is still slow when you finish adding it in (I see you adding both cowboy1 and cowboy2) then I'd post it on their issue tracker and the elixir forums both

will look into what can be wrong with it sometime later, and probably post it on the forum first. Don't want to bother essen with synthetic benchmarks (he seems to hate them).

also, with wrk elli handles about twice (~40k req/s) as many requests as cowboy2 (~20k req/s) on my laptop, but in this benchmark it is hardly different from plug (which handles ~27k req/s)

OvermindDL1 commented 6 years ago

> also, with wrk elli handles about twice (~40k) as many requests as cowboy2 (~20k) on my laptop, but in this benchmark it is hardly different from plug (which handles ~27k requests)

Yeah that is because the crystal client is capping out, it really needs to be ripped out, none of the 'fast' clients are accurate on the chart because of it.

ghost commented 6 years ago

A quick update, or rather not an update, since I haven't changed anything yet: I've just tested elli, cowboy1, plug, phoenix, and cowboy2 with wrk, and the results are:

# elli
> wrk -t30 -c40 -d60s http://localhost:3000
Running 1m test @ http://localhost:3000
  30 threads and 40 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   812.32us    1.19ms 119.88ms   96.50%
    Req/Sec     1.39k   268.29     2.41k    64.87%
  2483427 requests in 1.00m, 146.84MB read
Requests/sec:  41370.48
Transfer/sec:      2.45MB

# cowboy1
> wrk -t30 -c40 -d60s http://localhost:3000
Running 1m test @ http://localhost:3000
  30 threads and 40 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.71ms   11.18ms 202.02ms   97.33%
    Req/Sec     0.98k   235.23     1.84k    77.06%
  1756141 requests in 1.00m, 152.72MB read
Requests/sec:  29232.42
Transfer/sec:      2.54MB

# plug
> wrk -t30 -c40 -d60s http://localhost:3000
Running 1m test @ http://localhost:3000
  30 threads and 40 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.86ms    4.90ms 126.98ms   95.62%
    Req/Sec     0.88k   220.91     1.85k    71.25%
  1583883 requests in 1.00m, 216.00MB read
Requests/sec:  26362.86
Transfer/sec:      3.60MB

# phoenix
> wrk -t30 -c40 -d60s http://localhost:3000
Running 1m test @ http://localhost:3000
  30 threads and 40 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.04ms    5.27ms 134.31ms   95.27%
    Req/Sec   842.51    224.66     1.93k    70.31%
  1510341 requests in 1.00m, 205.97MB read
Requests/sec:  25136.89
Transfer/sec:      3.43MB

# cowboy2
> wrk -t30 -c40 -d60s http://localhost:3000
Running 1m test @ http://localhost:3000
  30 threads and 40 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.78ms    4.17ms 125.24ms   97.38%
    Req/Sec   796.19    139.68     1.45k    74.55%
  1427850 requests in 1.00m, 124.17MB read
Requests/sec:  23762.01
Transfer/sec:      2.07MB

The transfer/sec figures for elli and cowboy 1.1 differ because elli doesn't send the Server: Cowboy and date headers. And no, adding date headers does not affect its performance.
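The header difference can be sanity-checked from the totals wrk prints, by dividing the data read by the number of requests. A rough sketch, assuming wrk's "MB" means 1024 * 1024 bytes (that unit is an assumption here); `bytes_per_response` is a hypothetical helper name:

```shell
#!/bin/sh
# bytes_per_response MB REQUESTS: average bytes read per response,
# assuming wrk's "MB" is 1024 * 1024 bytes (an assumption about wrk's units).
bytes_per_response() {
    awk -v mb="$1" -v n="$2" 'BEGIN { printf "%.0f\n", mb * 1048576 / n }'
}

bytes_per_response 146.84 2483427   # elli totals from the run above
bytes_per_response 152.72 1756141   # cowboy1 totals from the run above
```

Under that assumption the averages come out at roughly 62 bytes per response for elli and 91 for cowboy1, which is the order of magnitude a couple of extra response headers would account for.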

tbrand commented 6 years ago

Thanks for the big commits! I'll take a look at them, give me a little time!:pray:

ghost commented 6 years ago

increasing max_keepalive seems to yield somewhat lower latencies for cowboy1, not so much for cowboy2

# cowboy1
> wrk -t30 -c40 -d60s http://localhost:3000
Running 1m test @ http://localhost:3000
  30 threads and 40 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.45ms    3.65ms 128.25ms   96.48%
    Req/Sec     1.01k   218.63     2.09k    72.75%
  1802528 requests in 1.00m, 156.43MB read
Requests/sec:  30003.77
Transfer/sec:      2.60MB

# cowboy2
> wrk -t30 -c40 -d60s http://localhost:3000
Running 1m test @ http://localhost:3000
  30 threads and 40 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.52ms    3.74ms 133.02ms   98.20%
    Req/Sec   832.90    132.89     1.52k    73.87%
  1493006 requests in 1.00m, 129.57MB read
Requests/sec:  24848.78
Transfer/sec:      2.16MB
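
To compare runs without eyeballing the output, the Requests/sec line can be pulled out of each wrk report and expressed as a ratio. A small sketch; `wrk_rps` and `relative_pct` are hypothetical helper names, and only the `Requests/sec:` label from the output above is relied on:

```shell
#!/bin/sh
# wrk_rps: read wrk output on stdin and print the value of the "Requests/sec:" line.
wrk_rps() {
    awk '$1 == "Requests/sec:" { print $2 }'
}

# relative_pct A B: A as a percentage of B, to one decimal place.
relative_pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / b }'
}

# Intended usage:
#   wrk -t30 -c40 -d60s http://localhost:3000 | wrk_rps
# cowboy2 relative to cowboy1, using the two totals reported above:
relative_pct 24848.78 30003.77
```

With the numbers above, cowboy2 lands at about 82.8% of cowboy1's throughput after the max_keepalive change.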

ghost commented 6 years ago

@tbrand sure, thank you

@OvermindDL1 so everything seems quite alright with wrk, cowboy 1.1 is faster than plug, but not by much, which is expected, since in this benchmark plug just adds a few additional headers and calls :cowboy_req.reply. As for cowboy2, I think I will brace myself and write to essen about it later today. ~~One thing though, he will probably ask to translate the code into erlang, could you help with that?~~

ghost commented 6 years ago

So it seems that cowboy2 is a bit too early to test, I think I will remove it then.

OvermindDL1 commented 6 years ago

Very cool followup! So basically cowboy2 will be a tiny bit slower than cowboy1 because it adds a unified interface for http1.1 and http2, but in doing so it gains a lot of ease of use, plus http2. :-)

And of course cowboy2 is not tuned yet, still in dev. ^.^

I wonder if it might be interesting to keep it in with a dependency of using the latest master as it develops so we can see changes, hmm...

ghost commented 6 years ago

@OvermindDL1 don't want to bother you, but could you maybe re-run these benchmarks on your 6-core machine (I think i've seen you mention it in some other thread) sometime later? I'd like to see how elli and cowboy2 stack up.

OvermindDL1 commented 6 years ago

> @OvermindDL1 don't want to bother you, but could you maybe re-run these benchmarks on your 6-core machine (I think i've seen you mention it in some other thread) sometime later? I'd like to see how elli and cowboy2 stack up.

Already am actually. Have you fixed the sessions not getting killed properly? I'll find out in a few minutes when compiling and testing finishes regardless. :-)

ghost commented 6 years ago

> Have you fixed the sessions not getting killed properly?

I'm not sure, I mean, I've put the web servers into supervision trees, but killing them with ^C^C while they are running in foreground still doesn't always work for me, so I kill the beam

for i in `ps -ef | grep erl | awk '{print $2}'`; do echo $i; kill -9 $i; done

OvermindDL1 commented 6 years ago

> I'm not sure, I mean, I've put the web servers into supervision trees, but killing them with ^C^C in foreground still doesn't always work for me, so I kill the beam

Yeah, some are not dying. They should not be in the foreground but should probably be daemonized via the start and stop commands on the release, that 'should' work (but will not work if it is foregrounded).

Also, kill -9'ing the process will not always work as it spawns other parts of itself as it forks, plus if daemonizing it may double-fork (I'm unsure there) so that would not work either to kill it all (same thing some of the other servers in this test-suite do).
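
The start/stop approach could look roughly like the sketch below: start the release as a daemon, run the load test, and always call stop afterwards, instead of killing PIDs. `run_with_release` is a hypothetical helper name; only the `start` and `stop` subcommands of the release script mentioned in this thread are assumed:

```shell
#!/bin/sh
# run_with_release REL CMD...: start the release as a daemon, run CMD,
# then always ask the release to stop. Returns CMD's exit status.
run_with_release() {
    rel=$1
    shift
    "$rel" start || return 1
    # A real harness would also wait here until the server accepts connections.
    "$@"
    status=$?
    "$rel" stop
    return "$status"
}

# Intended usage (release path as used for plug in the benchmarker):
#   run_with_release elixir/plug/_build/prod/rel/my_plug/bin/my_plug \
#       wrk -t30 -c40 -d60s http://localhost:3000
```

Because stop goes through the release script rather than a signal, it does not depend on which PID the VM ended up with after daemonizing.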

Running them individually and killing beam between each attempt I get these (in order of fastest at top to slowest at bottom via Ave running with all 6-cores enabled):

Language (Runtime)        Framework (Middleware)          Max [sec]       Min [sec]       Ave [sec]
------------------------- ------------------------- --------------- --------------- ---------------
rust                      iron                             4.975547        4.899702        4.928051
elixir                    cowboy2                          5.589613        5.321457        5.439415
rust                      nickel                           6.169157        5.888196        6.021756
elixir                    cowboy1                          6.165670        5.772522        6.063751
elixir                    elli                             6.460629        5.832196        6.130952
elixir                    plug                             6.950836        6.531153        6.669494
rust                      rocket                           7.546116        6.836869        7.021902
elixir                    phoenix                          7.971764        7.573512        7.838155
crystal                   router_cr                        8.588450        8.352508        8.489685
crystal                   kemal                           10.994124       10.183791       10.656268
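
For anyone re-ordering their own runs the same way, the rows can be sorted on the Ave column with a plain numeric sort. A minimal sketch, assuming the input contains only data rows (no header or separator lines) in the five-column layout above; `sort_by_ave` is a hypothetical helper name:

```shell
#!/bin/sh
# sort_by_ave: read result rows on stdin (language, framework, max, min, ave)
# and print them ordered by the fifth column, fastest average first.
sort_by_ave() {
    LC_ALL=C sort -k5,5n
}

# Intended usage:
#   sort_by_ave < rows.txt
```

`LC_ALL=C` keeps the decimal point interpretation independent of the local locale.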

cowboy1, cowboy2, and elli are all left running after their tests.

Cowboy2 is actually faster than Rust's Nickel now!

(Sorry for the delay, got a bit busy for a bit.)

ghost commented 6 years ago

> Yeah, some are not dying. They should not be in the foreground but should probably be daemonized via the start and stop commands on the release, that 'should' work (but will not work if it is foregrounded).

Oh, I've just copied plug and phoenix's makefiles/bin and they used foreground, will try to change that.

So elli is slow now. Makes me sad.

OvermindDL1 commented 6 years ago

> Oh, I've just copied plug and phoenix's makefiles/bin and they used foreground, will try to change that.

Huh, weird, I'd not think those would work then when stopped, but eh...

OvermindDL1 commented 6 years ago

Ah I see why it is not working @idi-ot ! You did not update the ./benchmarker/src/benchmarker.cr file to 'stop' your new ones, so they are not getting stopped.

OvermindDL1 commented 6 years ago

In:

  def kill
    @process.not_nil!.kill

    # Since ruby's frameworks are running on puma, we have to kill the independent process
    if @target.lang == "ruby"
      kill_proc("puma")
    elsif @target.lang == "node"
      kill_proc("node")
    elsif @target.name == "plug"
      path = File.expand_path("../../../elixir/plug/_build/prod/rel/my_plug/bin/my_plug", __FILE__)
      Process.run("bash #{path} stop", shell: true)
    elsif @target.name == "phoenix"
      path = File.expand_path("../../../elixir/phoenix/_build/prod/rel/my_phoenix/bin/my_phoenix", __FILE__)
      Process.run("bash #{path} stop", shell: true)
    elsif @target.name == "akkahttp"
      kill_proc("akkahttp")
    elsif @target.name == "aspnetcore"
      kill_proc("dotnet")
    end
  end

You need to add entries for cowboy1/cowboy2/elli.

In reality a test for lang == "elixir" should probably be done and then just call the path out based on the @target.name, but that can be later. ^.^
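
The idea of deriving the path from the target name, sketched here in shell rather than in the benchmarker's Crystal, and assuming every Elixir target follows the directory and release naming that plug and phoenix use above. `release_bin` and `stop_elixir_target` are hypothetical names:

```shell
#!/bin/sh
# release_bin NAME: path of the release control script for an Elixir target,
# assuming the layout elixir/NAME/_build/prod/rel/my_NAME/bin/my_NAME.
release_bin() {
    printf 'elixir/%s/_build/prod/rel/my_%s/bin/my_%s\n' "$1" "$1" "$1"
}

# stop_elixir_target NAME: ask the target's release to shut down,
# mirroring the "bash <path> stop" call in the kill method above.
stop_elixir_target() {
    bash "$(release_bin "$1")" stop
}

# Intended usage from the repository root:
#   stop_elixir_target plug
```

A target whose release name differs from its directory name would need an explicit mapping instead of this pattern.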

ghost commented 6 years ago

> You need to add entries for cowboy1/cowboy2/elli.

done

OvermindDL1 commented 6 years ago

That looks good. I can test it if you want?

ghost commented 6 years ago

I don't know, I think you've already run the tests? Or do you mean if killing the process works now? I don't care much about it, since I usually use start/stop and they work as expected.

OvermindDL1 commented 6 years ago

To test the stopping that is, and they all close properly now except cowboy2, running 'stop' on it never returns, that is weird...

Diagnosing...

It is not stopping because one of the supervisors is not stopping. That supervisor is in the application :my_cowboy.

ghost commented 6 years ago

The only supervisor there that might cause this is ranch, I think

defmodule MyCowboy.Application do
  @moduledoc false
  use Application

  def start(_type, _args) do
    nb_acceptors = 100
    trans_opts = [port: 3000]
    proto_opts = %{max_connections: 16_384,
                   max_keepalive: 5_000_000,
                   stream_handlers: [MyCowboy.StreamHandler]}

    children = [{{:ranch_listener_sup, MyRanch},
                 {:cowboy, :start_clear, [MyCowboy, nb_acceptors, trans_opts, proto_opts]},
                :permanent, :infinity, :supervisor, [:ranch_listener_sup]}]

    opts = [strategy: :one_for_one, name: __MODULE__.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

How did you find out in what application the supervisor is? observer?

OvermindDL1 commented 6 years ago

Yeah noticed, it is not responding to the stop request, probably one of the ranch or cowboy children, trying to find out...

OvermindDL1 commented 6 years ago

Yeah I'm unsure, maybe show this app to the Cowboy2 dev? Maybe it tries to keep itself running or so?

ghost commented 6 years ago

Don't know. Is the child spec correct, actually? I took it from https://github.com/VoiceLayer/plug_cowboy2/blob/master/lib/plug_cowboy2/adapter.ex#L153-L158

{{:ranch_listener_sup, MyRanch},
 {:cowboy, :start_clear, [MyCowboy, nb_acceptors, trans_opts, proto_opts]},
 :permanent, :infinity, :supervisor, [:ranch_listener_sup]}

Maybe it's possible to just run :cowboy.start_clear/4 in start/2 function here, and not put it under a supervisor?

ghost commented 6 years ago

Or maybe it should be like that https://github.com/potatosalad/plug/blob/cowboy2/lib/plug/adapters/cowboy.ex#L143-L150

OvermindDL1 commented 6 years ago

I've not used cowboy2 yet (still in dev after all) so I'm not entirely sure what it expects.

The second link you gave though is telling plug how to load it up rather than setting a dedicated handler, so I doubt that one. Maybe ask them? :-\

ghost commented 6 years ago

tried {:ok, _pid} = :cowboy.start_clear(:http, 100, trans_opts, proto_opts), doesn't seem to work either

tbrand commented 6 years ago

Will close this since it seems the PR will not be updated. If you want to add them, please re-post the PR after syncing with origin. Thanks

the-benchmarker / web-frameworks

add cowboy and elli #58