squeaky-pl / japronto

Screaming-fast Python 3.5+ HTTP toolkit integrated with pipelining HTTP server based on uvloop and picohttpparser.
MIT License

Is the benchmark correct? #3

Closed. agalera closed this issue 7 years ago.

agalera commented 7 years ago

I have been running tests because the published benchmark numbers seem too high to me. My runs give very different results from those shown in the image:

CPU: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz

japronto

docker run williamyeh/wrk -c 200 -t 1 -d 4 http://172.17.0.2:8080/
Running 4s test @ http://172.17.0.2:8080/
  1 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     0.85ms  307.50us   3.17ms   70.66%
    Req/Sec   227.29k     3.84k  232.01k    75.00%
  904138 requests in 4.01s, 79.33MB read
Requests/sec: 225703.67
Transfer/sec:     19.80MB

meinheld

docker run williamyeh/wrk -c 200 -t 1 -d 4 http://172.17.0.2:8080/
Running 4s test @ http://172.17.0.2:8080/
  1 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.03ms    9.90ms  215.48ms   99.25%
    Req/Sec    46.38k   768.92    48.44k    75.00%
  184537 requests in 4.01s, 31.15MB read
Requests/sec: 46049.91
Transfer/sec:      7.77MB

squeaky-pl commented 7 years ago

Hi kianxineki, you are benchmarking the speed of a non-pipelined client (like a regular browser), which is different from what I benchmarked. For a pipelined benchmark you need to grab the Lua script from misc/pipeline.lua; you can then edit the script to change the number of pipelined requests. The benchmark from the article and README.md tests pipelined performance.
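Concretely, pipelining means the client writes several requests back-to-back on one connection before reading any responses. A minimal illustration in plain Python (a sketch only; it assumes a server is already listening on localhost:8080, e.g. the Japronto micro benchmark):

import socket

# Send DEPTH HTTP requests in a single write, then read the replies --
# this is what wrk does when driven by a pipelining Lua script.
DEPTH = 10  # number of pipelined requests; pipeline.lua makes this tunable
request = b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"

sock = socket.create_connection(("localhost", 8080))
sock.sendall(request * DEPTH)  # one write, DEPTH requests on the wire

# A real client would parse DEPTH separate responses out of this byte
# stream; counting status lines is enough for a demonstration (and one
# recv may return only part of the data).
data = sock.recv(65536)
print(data.count(b"HTTP/1.1"), "responses received")
sock.close()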

agalera commented 7 years ago

Still without request pipelining Japronto does 400,000 RPS on the same hardware.

The meinheld performance is very similar to the published graphs; I expected to get 400,000 RPS on my server without pipelining.

squeaky-pl commented 7 years ago

Maybe try without docker then, because it adds an extra layer of indirection between kernel and userspace. I didn't run the benchmarks inside docker. Also be sure to run them on Python 3.6, which gives some gains over 3.5.

agalera commented 7 years ago

With pipelining on:

japronto

docker run -v /tmp/wrk/scripts/pipeline.lua:/pipeline.lua williamyeh/wrk:4.0.2 -c 200 -t 1 -d 4 -s /pipeline.lua http://172.17.0.2:8080/
Running 4s test @ http://172.17.0.2:8080/
  1 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     0.95ms  423.28us   4.10ms   66.73%
    Req/Sec   499.02k     5.61k  508.05k    77.50%
  1986543 requests in 4.01s, 174.30MB read
Requests/sec: 494990.77
Transfer/sec:     43.43MB

meinheld

docker run -v /tmp/wrk/scripts/pipeline.lua:/pipeline.lua williamyeh/wrk:4.0.2 -c 200 -t 1 -d 4 -s /pipeline.lua http://172.17.0.2:8080/
Running 4s test @ http://172.17.0.2:8080/
  1 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     9.75ms   17.67ms  267.64ms   98.21%
    Req/Sec    58.58k     0.94k   60.54k    72.50%
  233140 requests in 4.00s, 39.35MB read
Requests/sec: 58218.57
Transfer/sec:      9.83MB

Tests were run on Python 3.6.

squeaky-pl commented 7 years ago

Are you actually running Japronto inside docker or only wrk? Where did you get the binary release of Japronto from? Did you compile it yourself?

agalera commented 7 years ago

Japronto runs inside docker with the python:3.6 image:

docker run -it python:3.6 bash
pip install japronto
git clone https://github.com/squeaky-pl/japronto.git
python3 japronto/benchmarks/japronto/micro.py
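For reference, the micro benchmark is essentially the smallest possible Japronto application, along the lines of the README example (a sketch; the exact contents of benchmarks/japronto/micro.py may differ):

from japronto import Application

# A constant-response handler: no routing logic, no body parsing,
# so the RPS numbers measure the server stack itself.
def hello(request):
    return request.Response(text='Hello world!')

app = Application()
app.router.add_route('/', hello)
app.run()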

squeaky-pl commented 7 years ago

Running benchmarks in docker is not a great idea.

You can follow this script for Ubuntu, https://github.com/squeaky-pl/japronto/blob/master/misc/bootstrap.sh, to set up the environment exactly the same way I did in the article. Also, I benchmark with 100 connections, 1 thread, for 2 seconds, 10 times in a row, and take the median as the final result. Running a benchmark only once is subject to noise. You should also look at the standard deviation of the results: if it's greater than 5% across 10 runs, something is wrong with the machine.
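That procedure boils down to something like the following helper (run_wrk, its arguments, and the output parsing are illustrative assumptions; the median/deviation logic is exactly what the paragraph above describes):

import statistics
import subprocess

def run_wrk():
    # Hypothetical wrapper: one 2-second, 100-connection, 1-thread run,
    # parsing the Requests/sec line out of wrk's output.
    out = subprocess.run(
        ["./wrk", "-c", "100", "-t", "1", "-d", "2",
         "-s", "scripts/pipeline.lua", "http://localhost:8080/"],
        capture_output=True, text=True).stdout
    for line in out.splitlines():
        if line.startswith("Requests/sec:"):
            return float(line.split()[1])
    raise RuntimeError("no Requests/sec line in wrk output")

results = [run_wrk() for _ in range(10)]    # 10 runs in a row
final = statistics.median(results)          # report the median
spread = statistics.stdev(results) / final  # relative standard deviation
if spread > 0.05:
    print("warning: >5% deviation across runs, the machine is noisy")
print(f"median: {final:.0f} req/s (deviation {spread:.1%})")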

agalera commented 7 years ago

The results are similar.

Without docker (Python 3.6):

meinheld

./wrk -c 200 -t 1 -d 4 -s scripts/pipeline.lua http://localhost:8080/
Running 4s test @ http://localhost:8080/
  1 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     8.35ms   19.08ms  258.92ms   98.07%
    Req/Sec    75.44k     1.42k   79.40k    70.00%
  300155 requests in 4.01s, 50.67MB read
Requests/sec: 74921.39
Transfer/sec:     12.65MB

japronto (pip)

./wrk -c 200 -t 1 -d 4 -s scripts/pipeline.lua http://localhost:8080/
Running 4s test @ http://localhost:8080/
  1 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     0.93ms  386.70us   3.13ms   62.77%
    Req/Sec   534.89k    72.99k  639.61k    65.00%
  2129517 requests in 4.00s, 186.84MB read
Requests/sec: 531982.26
Transfer/sec:     46.68MB

japronto compiled

./wrk -c 200 -t 1 -d 4 -s scripts/pipeline.lua http://localhost:8080/
Running 4s test @ http://localhost:8080/
  1 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   724.19us  344.36us   2.49ms   59.60%
    Req/Sec   625.37k    37.85k  646.24k    95.00%
  2488953 requests in 4.01s, 218.38MB read
Requests/sec: 621353.60
Transfer/sec:     54.52MB

japronto compiled -c 100 -t 1 -d 2

./wrk -c 100 -t 1 -d 2 -s scripts/pipeline.lua http://localhost:8080/
Running 2s test @ http://localhost:8080/
  1 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   372.56us  173.83us   1.54ms   57.70%
    Req/Sec   605.64k     6.96k  617.27k    80.00%
  1205796 requests in 2.00s, 105.79MB read
Requests/sec: 602821.14
Transfer/sec:     52.89MB

squeaky-pl commented 7 years ago

I don't know what's happening then. I can start a machine tomorrow that is exactly like the one I used for benchmarking and let you log in to play with it.

agalera commented 7 years ago

@squeaky-pl nice, I'm testing Japronto with -c 100 -t 1 (see my previous comment).

Japronto is really fast!

Currently I use bottle (https://github.com/bottlepy/bottle) as a micro-framework; it supports many servers, such as meinheld, bjoern, tornado... I'm thinking of trying to adapt it to take advantage of Japronto's potential. What do you think?

Thanks for the fast responses 👍

squeaky-pl commented 7 years ago

For bottle to work on Japronto I would need to write a WSGI adapter for it. I would expect it to be faster than Meinheld, but I don't know by how much. Japronto uses Request and Response objects written in C, and the router is also written in C. With WSGI it's no longer possible to use those parts written in C.
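The shape of such an adapter would be roughly this (a sketch only; the Japronto request attributes used here, method, path, query_string, headers, and body, and the Response keyword arguments are assumptions, not a tested API):

import io
import sys

def wsgi_adapter(wsgi_app):
    # Wrap a WSGI callable (e.g. a bottle app) as a Japronto handler.
    # Rebuilding environ on every request is exactly the per-call cost
    # that bypasses the C Request/Response fast path.
    def handler(request):
        environ = {
            'REQUEST_METHOD': request.method,
            'PATH_INFO': request.path,
            'QUERY_STRING': request.query_string or '',
            'SERVER_PROTOCOL': 'HTTP/1.1',
            'wsgi.version': (1, 0),
            'wsgi.url_scheme': 'http',
            'wsgi.input': io.BytesIO(request.body or b''),
            'wsgi.errors': sys.stderr,
        }
        for name, value in request.headers.items():
            environ['HTTP_' + name.upper().replace('-', '_')] = value

        captured = []
        def start_response(status, headers):
            captured[:] = [status, headers]

        body = b''.join(wsgi_app(environ, start_response))
        # Response kwargs (code, body, headers) are likewise assumed.
        return request.Response(code=int(captured[0].split()[0]),
                                body=body, headers=dict(captured[1]))
    return handler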

Edit

Ah, it looks like bottle.py can work with non-WSGI backends as well. You can try to adapt it; I would gladly help.

agalera commented 7 years ago

I understand that performance will decrease. Since Japronto already covers all the basic functionality, the only thing I see missing is the ability to wrap request handlers the way bottle does with its plugins, which makes it easy to implement cross-cutting functionality.
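Handler wrapping itself needs nothing from the framework: it is just decorators, the same mechanism bottle plugins use under the hood. A minimal sketch in plain Python (timing_plugin is a made-up example name):

import functools
import time

def timing_plugin(handler):
    # Runs code before and after the handler without modifying it,
    # the way a bottle plugin wraps a route callback.
    @functools.wraps(handler)
    def wrapper(request):
        start = time.perf_counter()
        response = handler(request)
        print(f"{handler.__name__}: {time.perf_counter() - start:.6f}s")
        return response
    return wrapper

@timing_plugin
def hello(request):
    return request.Response(text='Hello world!')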

squeaky-pl commented 7 years ago

Closing this in favor of https://github.com/squeaky-pl/japronto/issues/21 which has a nice graph for non-pipelined results.

codefather-labs commented 5 years ago

https://www.youtube.com/watch?v=0CPCMAcs3qo

I tried running some benchmarks:

5 seconds, 8 threads, 5000 connections: ~30-60k req/sec

1 second, 8 threads, 200 connections: 140-150k req/sec

=\