valum-framework / valum

Web micro-framework written in Vala
https://valum-framework.readthedocs.io/en/latest/
GNU Lesser General Public License v3.0

Is there any benchmark of valum? #194

Closed: choleraehyq closed this issue 7 years ago.

choleraehyq commented 7 years ago

I'm curious about the performance of valum.

arteymix commented 7 years ago

It's pretty fast, but it could be a lot faster.

It's more of a toy framework that focuses on features and correctness for now.

Of course, all the CPU-intensive work (e.g. video decoding with GStreamer) can be seamlessly integrated into a Web application in Vala. That's where it gets seriously interesting, especially if you need to work with existing C libraries.

There's an in-progress implementation of the TechEmpower Framework Benchmarks in the organization. I (or anyone else) could finish and submit it once 0.3.0 is out.

arteymix commented 7 years ago

Here are some samples! The following command was run:

wrk --latency --threads 12 --connections 128 http://localhost:<port>/

I have an Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz.
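For reference, a minimal hello-world with the Valum 0.3-style API looks roughly like this. It's a sketch based on the quickstart, not necessarily the exact application behind the numbers below; the route and response body are placeholders:

    using Valum;
    using VSGI;

    public int main (string[] args) {
        var app = new Router ();

        // a single route returning a short plain-text body, the kind of
        // trivial handler this sort of benchmark usually exercises
        app.get ("/", (req, res) => {
            return res.expand_utf8 ("Hello world!");
        });

        // serve over the libsoup-2.4 based HTTP server
        return Server.@new ("http", handler: app).run (args);
    }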

Valum

Running 10s test @ http://localhost:3003/
  12 threads and 128 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    31.33ms  139.19ms   1.71s    97.42%
    Req/Sec   669.07    383.08     3.16k    78.42%
  Latency Distribution
     50%   12.65ms
     75%   14.57ms
     90%   15.79ms
     99%  824.30ms
  74132 requests in 10.10s, 9.12MB read
  Socket errors: connect 0, read 0, write 0, timeout 22
Requests/sec:   7339.99
Transfer/sec:      0.90MB

Iron (rust)

Running 10s test @ http://localhost:3000/
  12 threads and 128 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.82ms    4.17ms  89.67ms   69.35%
    Req/Sec     1.67k   707.05     2.81k    67.25%
  Latency Distribution
     50%    4.77ms
     75%    5.99ms
     90%    8.87ms
     99%   21.40ms
  67005 requests in 10.10s, 7.28MB read
Requests/sec:   6635.12
Transfer/sec:    738.68KB

Node.js

Running 10s test @ http://localhost:8000/
  12 threads and 128 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     9.29ms    2.84ms  65.48ms   94.70%
    Req/Sec     1.09k   261.40     6.29k    96.84%
  Latency Distribution
     50%    8.58ms
     75%    9.93ms
     90%   11.28ms
     99%   15.28ms
  130927 requests in 10.10s, 19.48MB read
Requests/sec:  12963.08
Transfer/sec:      1.93MB

You can see that scalable polling makes a massive difference in the case of Node.js.

Iron has better latency because it dispatches requests on lightweight threads, but the GLib main loop provides better concurrency even with a single-threaded model.
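To make the single-threaded model concrete, here's a rough sketch in plain GIO of polling-based concurrency on one main loop. It's illustrative only, not Valum's actual internals, and the port and response are made up:

    // One GLib main loop polls every socket; handlers yield instead of blocking.
    async void handle_connection (SocketConnection connection) {
        try {
            var response = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK";
            // yields back to the main loop whenever the socket isn't writable
            yield connection.output_stream.write_async (response.data);
            yield connection.close_async ();
        } catch (Error err) {
            warning (err.message);
        }
    }

    public int main () {
        var service = new SocketService ();
        try {
            service.add_inet_port (3003, null);
        } catch (Error err) {
            error (err.message);
        }

        service.incoming.connect ((connection, source_object) => {
            // start the asynchronous handler and return right away; the loop
            // keeps accepting and polling other connections in the meantime
            handle_connection.begin (connection);
            return false;
        });

        service.start ();
        new MainLoop ().run ();
        return 0;
    }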

Moreover, it's not like libsoup-2.4 is heavily optimized. In the real world, you would probably use something like SCGI, which I'm thinking of rewriting on top of GThreadedSocketService.
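The GThreadedSocketService shape would look something like this; just the listener skeleton, with an arbitrary port and the SCGI parsing left out since that rewrite doesn't exist yet:

    // Each accepted connection is handed to a worker thread from a bounded pool.
    public int main () {
        var service = new ThreadedSocketService (64); // pool size chosen arbitrarily
        try {
            service.add_inet_port (4000, null);
        } catch (Error err) {
            error (err.message);
        }

        service.run.connect ((connection, source_object) => {
            // runs on a worker thread, so blocking reads and writes are fine;
            // this is roughly where the SCGI netstring headers would be parsed
            return false;
        });

        service.start ();
        new MainLoop ().run ();
        return 0;
    }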

Hope that answers your questions!

arteymix commented 7 years ago

Oh, by the way, you can also fork the application to distribute the load across multiple cores.
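The option goes straight on the command line when launching the application, something like this (the binary name is a placeholder; only --forks is taken from the run below):

    ./app --forks=4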

In this case we have --forks=4:

Running 10s test @ http://localhost:3003/
  12 threads and 128 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    22.84ms  120.39ms   1.68s    97.84%
    Req/Sec     1.11k   524.27     5.75k    80.12%
  Latency Distribution
     50%    8.08ms
     75%    8.81ms
     90%   10.16ms
     99%  656.09ms
  129145 requests in 10.10s, 15.89MB read
  Socket errors: connect 0, read 0, write 0, timeout 22
Requests/sec:  12787.12
Transfer/sec:      1.57MB 

I managed to get around 100k req/sec with 64 forks on the cluster at work ;)

choleraehyq commented 7 years ago

That helps a lot! Thanks.