benoitc / gunicorn

gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
http://www.gunicorn.org

pypy3 #2538

Closed B1gG closed 3 years ago

B1gG commented 3 years ago

Hi, is there any specific documentation I can check to run gunicorn with pypy3? I am doing the following and the perf is terrible:

I am using the example from https://github.com/falconry/falcon#getting-started The requests/s don't go above 1k. When using CPython I get 3k+.
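(For reference, a dependency-free WSGI stand-in for the kind of hello-world app being benchmarked; this is a sketch, not the exact Falcon getting-started app from the issue.)

```python
# Minimal WSGI application, roughly equivalent to a framework hello-world.
# Illustrative stand-in; the original benchmark used Falcon's getting-started app.

def app(environ, start_response):
    """Return a fixed JSON body for every request."""
    body = b'{"message": "Hello world!"}'
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Any app this small spends essentially all of its time in server I/O, which is the point of the discussion below.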

tilgovi commented 3 years ago

PyPy is designed to speed up CPU-bound workloads and will likely not be faster than CPython in synthetic benchmarks of I/O-bound workloads like a webserver echo test.

You are not doing anything wrong, but this result is not unexpected. If you can produce a realistic usage scenario where PyPy is unreasonably slow, and you are willing to dive into debugging whether Gunicorn is doing something that interferes with PyPy's ability to optimize it, it would be worth raising here or with the PyPy project. Benchmarking a web framework hello world, however, is not likely to fit the PyPy use case.

vytas7 commented 3 years ago

@B1gG IIRC we have already discussed this on Gitter (if that was the case, posting just for reference).

The low throughput is most probably caused by Gunicorn running synchronously in the above configuration, and waiting on multithreaded I/O can be even worse on PyPy than on CPython.

PyPy does speed up Gunicorn a fair bit, but you need to use an asynchronous selector loop such as gevent.

Running a very minimal Falcon app as

gunicorn --workers 8 app:app

I'm getting roughly the same numbers as you:

Running 10s test @ http://localhost:8000
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.11ms    1.95ms  24.70ms   82.03%
    Req/Sec     1.30k     1.37k    5.15k    82.83%
  25793 requests in 10.05s, 3.96MB read
Requests/sec:   2566.70
Transfer/sec:    403.57KB

For comparison, running with gevent:

gunicorn --workers 8 --worker-class gevent --worker-connections 1024 app:app
Running 10s test @ http://localhost:8000
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.09ms   23.88ms 610.02ms   99.40%
    Req/Sec    22.47k    16.66k   46.28k    47.00%
  446887 requests in 10.00s, 70.75MB read
  Socket errors: connect 0, read 1, write 0, timeout 1
Requests/sec:  44667.56
Transfer/sec:      7.07MB

That is, AFAICT, better than you could achieve using Gunicorn with CPython.
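(Side note: the same gevent setup can also be expressed as a Gunicorn config file instead of CLI flags; a sketch, assuming a `gunicorn.conf.py` loaded via `gunicorn -c gunicorn.conf.py app:app`, with illustrative worker counts.)

```python
# gunicorn.conf.py -- config-file equivalent of the gevent command above
# (illustrative values; tune workers/connections for your machine)
workers = 8                  # same as --workers 8
worker_class = "gevent"      # same as --worker-class gevent
worker_connections = 1024    # same as --worker-connections 1024
bind = "127.0.0.1:8000"     # default Gunicorn bind address
```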

That said, @tilgovi's point about synthetic benchmarks still applies: they can look better on CPython if you choose an app server or worker class written in C, such as Meinheld, which could get above 100k req/s on my machine. It's always best to benchmark a real application.

B1gG commented 3 years ago

Thanks @vytas7, yes understood. I believe I posted this before our discussion on Gitter; it was around March :)