python-gino / gino

GINO Is Not ORM - a Python asyncio ORM on SQLAlchemy core.
https://python-gino.org/

Performance question #420

Closed vlad-khramov closed 5 years ago

vlad-khramov commented 5 years ago

Description

I've created a simple app with sanic/gino. Several controllers, each doing 1-2 queries to the db (select only).

I've run a performance test (the smallest droplet on Digital Ocean, DB_POOL_MAX_SIZE=70). Controllers with db queries hold only about 100 rps; without them, about 1000 rps.

I know that I can increase the max connections of pg. I can also disable DB_USE_CONNECTION_FOR_REQUEST or release the shared connection back to the pool earlier.

What else could be done to increase rps? Could the fact that I use only select queries be helpful?

fantix commented 5 years ago

Thanks for the question and I'll get back to this a bit later this weekend!


First of all, you'd better run the benchmark script from a machine other than the server - the script consumes CPU too. It is also a good idea to separate the database the same way. TL;DR: Python is just that "slow"; the performance you're seeing is probably fair.

My benchmark is as follows:

Server script for the test:

import sys
from sanic import Sanic
from sanic.response import json
from gino.ext.sanic import Gino
import asyncpg

DB_DSN = 'postgresql://localhost/gino'
POOL_SIZE = 10

app = Sanic(configure_logging=False)

if __name__ == '__main__':
    if len(sys.argv) > 1 and sys.argv[1] == 'asyncpg':
        print('using asyncpg')
        pool = None

        @app.listener('before_server_start')
        async def before_server_start(_, loop):
            global pool
            pool = await asyncpg.create_pool(DB_DSN, max_size=POOL_SIZE)

        @app.route('/')
        async def test(request):
            await pool.fetchval('SELECT now()')
            return json({'hello': 'world'})

    elif len(sys.argv) > 1 and sys.argv[1] == 'gino':
        print('using gino')
        app.config.DB_DSN = DB_DSN
        app.config.DB_POOL_MAX_SIZE = POOL_SIZE
        db = Gino(app)

        @app.route('/')
        async def test(request):
            await db.scalar('SELECT now()')
            return json({'hello': 'world'})

    else:
        print('no database query')

        @app.route('/')
        async def test(request):
            return json({'hello': 'world'})

    app.run(host='0.0.0.0', port=8000, workers=1)

Benchmarked with wrk -t 1 -c 32 http://localhost:8000:

  * Vanilla Sanic: 10k rps
  * Sanic with asyncpg: 3k rps
  * Sanic with GINO: 2k rps

Doubling the POOL_SIZE produces slightly better results, but nothing significant. Tweaking the wrk thread and connection counts makes no big difference either. Watching the metrics, the Sanic server sits at almost 100% CPU in all three cases, while the PostgreSQL processes remain quite idle. Given that, I believe the bottleneck in this case is CPU overhead.

I could try to find the most time-consuming operations in asyncpg and GINO and improve them, but I don't think there is a single hot spot that accounts for most of the cost. The CPU time per request is 0.1ms for vanilla Sanic, 0.33ms with asyncpg, and 0.5ms with GINO - that is 0.23ms of overhead for asyncpg and a further 0.17ms for GINO, roughly proportional to the amount of Python code executed in each layer. In this case, adding more CPUs would increase throughput while keeping the same total number of database connections.
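As a sanity check on those numbers, the per-request CPU cost can be derived directly from the single-worker wrk results above (one worker ≈ one saturated core, so cpu_ms ≈ 1000 / rps); a quick sketch:

```python
# Per-request CPU time implied by each single-worker benchmark result:
# at ~100% CPU, cpu_ms_per_request = 1000 / rps.
results_rps = {'vanilla sanic': 10_000, 'with asyncpg': 3_000, 'with gino': 2_000}
cpu_ms = {name: 1000 / rps for name, rps in results_rps.items()}

print({name: round(ms, 2) for name, ms in cpu_ms.items()})
# → {'vanilla sanic': 0.1, 'with asyncpg': 0.33, 'with gino': 0.5}

# The incremental overhead contributed by each layer:
print(round(cpu_ms['with asyncpg'] - cpu_ms['vanilla sanic'], 2))  # → 0.23
print(round(cpu_ms['with gino'] - cpu_ms['with asyncpg'], 2))      # → 0.17
```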

If you are running only select queries, it is also possible to increase throughput by:

  1. properly caching the results in memory (local, memcached or Redis), or
  2. adding read-only DB replicas, once the bottleneck is on the database server side.
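For option 1, a minimal in-process TTL cache is often enough for hot, read-only queries. A runnable sketch - the `cached` decorator and `get_user_count` names are made up for illustration, and the fake query stands in for a real `db.scalar(...)` call; memcached/Redis would be the shared equivalent across multiple workers:

```python
import asyncio
import time

_cache = {}  # (function name, args) -> (timestamp, value)

def cached(ttl_seconds=5.0):
    """Cache an async function's result in memory for ttl_seconds."""
    def decorator(fn):
        async def wrapper(*args):
            key = (fn.__name__, args)
            hit = _cache.get(key)
            now = time.monotonic()
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]            # fresh enough: skip the database
            value = await fn(*args)
            _cache[key] = (now, value)
            return value
        return wrapper
    return decorator

calls = 0

@cached(ttl_seconds=5.0)
async def get_user_count():
    # Stand-in for something like: await db.scalar('SELECT count(*) FROM users')
    global calls
    calls += 1
    return 123

async def main():
    first = await get_user_count()   # hits the "database"
    second = await get_user_count()  # served from the cache
    return first, second, calls

print(asyncio.run(main()))  # → (123, 123, 1)
```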
fantix commented 5 years ago

Re:

I know, that I can increase max connections of pg.

Yes, you can do that. But please be aware that, as you increase the connection pool size, you will soon hit the CPU bottleneck. Adding more DB connections beyond that balance point won't increase throughput any further - it will only increase the total processing time (latency) of each request.
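The latency effect follows from Little's law (latency = in-flight requests / throughput): once the server is CPU-bound, throughput is capped, so extra pool connections only let more requests queue up. A rough sketch using the ~2k rps GINO figure from the benchmark (the connection counts are illustrative):

```python
# Little's law: W = L / λ. With throughput λ capped by the CPU at ~2000 rps,
# every extra in-flight request (extra pool connection in use) only adds latency.
cpu_bound_rps = 2_000

for in_flight in (10, 32, 70):
    latency_ms = in_flight / cpu_bound_rps * 1000
    print(f'{in_flight} connections in use -> ~{latency_ms:.0f}ms per request')
```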

Also I can disable DB_USE_CONNECTION_FOR_REQUEST

Yeah, that'll save some time, but the gain is limited, I think. The lazy acquire won't block even when the DB pool is exhausted.

or release shared connection from pool earlier.

Yes, this allows other coroutines to use the released connection earlier. If the releasing coroutine still has a lot of I/O work to do after the query, it can certainly improve overall throughput.
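The scoping pattern looks like this - a runnable sketch with a stand-in pool class, so it works without a live PostgreSQL; with a real pool you'd use `pool.acquire()` (asyncpg) or GINO's equivalent against an actual database:

```python
import asyncio

class FakeConn:
    """Stand-in connection so the sketch runs without a real database."""
    def __init__(self, sem):
        self._sem = sem

    async def __aenter__(self):
        await self._sem.acquire()   # wait for a free slot in the pool
        return self

    async def __aexit__(self, *exc):
        self._sem.release()         # give the connection back immediately

    async def fetchval(self, query):
        await asyncio.sleep(0)      # pretend to round-trip to the database
        return 42

class FakePool:
    """Stand-in pool: a semaphore bounds concurrent connections."""
    def __init__(self, size):
        self._sem = asyncio.Semaphore(size)

    def acquire(self):
        return FakeConn(self._sem)

async def handler(pool):
    # Hold a connection only for the duration of the query itself...
    async with pool.acquire() as conn:
        value = await conn.fetchval('SELECT 42')
    # ...then do the remaining (CPU / serialization / other I/O) work
    # without keeping a pooled connection checked out.
    return {'value': value}

async def main():
    pool = FakePool(size=2)
    return await handler(pool)

print(asyncio.run(main()))  # → {'value': 42}
```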

vlad-khramov commented 5 years ago

Thanks a lot for your answer!

Yes, I ran the benchmark script (yandex.tank) from another server. And I also saw that the CPU consumption of the Python process was about 100% while postgres was about 1%.

By the way, on which server have you run your benchmark? Laptop/aws etc?

wwwjfy commented 5 years ago

This is a general performance question. It may or may not be related to Gino, Sanic and/or the database.

Besides Fantix's notes on Gino itself, I'd also take a step back to look at whether the optimization is necessary and how to get the best result with minimum effort. If I were in this situation, I'd do the following:

I've only listed a very small portion of the possibilities, and each bullet point can go a long way - it's too big a topic in itself. :)

From a technical perspective, I'd love to discuss how to improve Gino performance; from a business perspective, without more specs, focusing on Gino itself may not give us as much benefit as other optimisations.

fantix commented 5 years ago

on which server have you run your benchmark? Laptop/aws etc?

It was a dual-core 2.2GHz i7 on my MacBook Air (Early 2016).

wwwjfy commented 5 years ago

Closing due to inactivity