swoole / swoole-src

🚀 Coroutine-based concurrency library for PHP
https://www.swoole.com
Apache License 2.0

Ubiquity-Swoole performances with Coroutine Mysql on TechEmpower benchmarks #2855

Closed jcheron closed 4 years ago

jcheron commented 4 years ago

First of all, I would like to thank the Swoole team for their efforts, which have given PHP a real boost. 👍 In the TechEmpower benchmarks, the Ubiquity-Swoole pair is currently in the top 5 of the full-stack frameworks, using a full ORM with MySQL.

Despite these good results, I wonder about the effectiveness of my use of the Coroutine Mysql component in Ubiquity. A few weeks ago, Ubiquity-swoole was still using a PDO Connection for its tests.

Because the PDO connection is blocking, I switched to Swoole Coroutine Mysql, hoping to benefit from its asynchronous behavior. The result was convincing for the multiple queries test...

multiple queries test

PDO

image

Coroutine Mysql

image

...but quite disappointing for the other tests (Fortunes, single query and especially Data updates):

Data updates test

PDO

image

Coroutine Mysql

image

The results are much better on this test with PDO than with Swoole Coroutine Mysql, regardless of the number of queries, and I can't explain why.

Data table with Ubiquity-swoole PDO

image

Infos about Ubiquity-Swoole implementation

The implementation of the Data updates test is simple (and similar to multiple queries test):

The Ubiquity-swoole Pool is composed of an instance of Channel.

Each operation on the database requires retrieving the Mysql instance that corresponds to the active coroutine (see getInstance method).

The Ubiquity-Swoole server has a default configuration (see SwooleServer::setOptions)

I have already tried to modify the test implementation by executing the queries (read + update) in groups, each group in its own coroutine (with the go method: see update+go). The result is much better in terms of execution time at a fairly low concurrency level (10), but very bad in terms of throughput at a concurrency level of 512.

If you see any errors, have any ideas for improvement or just want to make some comments, feel free to do so! The objective is to improve performance, using the full potential of Swoole.

twose commented 4 years ago

Do you know about the runtime hooks? https://github.com/swoole/swoole-src#-amazing-runtime-hooks They can make PDO asynchronous directly.

image

If your concurrency is high, the number of connections will get out of control.

This shows how to control the number of connections: https://github.com/swoole/swoole-src#the-simplest-example-of-a-connection-pool

jcheron commented 4 years ago

Thanks @twose. Okay, for PDO, I'll try the runtime hook with go.

And for Swoole Mysql, it's actually smarter to create the 512 instances of Db at startup...

Did you see that the Swoole test implementation on TechEmpower makes the same mistake as me?

twose commented 4 years ago

The implementation of the Swoole test on TechEmpower is not the best way to do it.

jcheron commented 4 years ago

With the Pool initialized at instantiation, I am facing a problem that I didn't have before. Coroutine Mysql instances must be created inside a coroutine. And if I understood correctly, the (initialized) connection pool must be created in a main coroutine and passed to the child coroutines that use it.

\Swoole\Runtime::enableCoroutine();
go(function () {
    $pool = new Pool(512);
    go(function () use ($pool) {
        // do something with $pool
    });
});

In my case, I would like to create the pool when starting the Http server (on the start event), and use it on each request events.

$server=new Server();
$server->on('start',function(){
    // pool creation ?
});

$server->on('request',function(Request $request, Response $response){
    // pool usage !
});
$server->start();

But I do not see where to place the Pool instantiation to satisfy these 2 constraints.

jcheron commented 4 years ago

It's okay, no need to nest the coroutines if I use the Event::wait method on pool instantiation.

$server=new Server();
$server->on('start',function(){
    \Swoole\Runtime::enableCoroutine();
    go(function () {
       initPool(512);
    });
    \Swoole\Event::wait();
});

$server->on('request',function(Request $request, Response $response){
    // pool usage
});
$server->start();

[edit] I think I claimed victory a little too quickly. It works for a single request, but as soon as there are concurrent accesses, it crashes (the prepared statements return false). It's weird that the code worked with the previous Pool version. Can it be a problem for different processes to access the code of the same object? Can there be a conflict even on local variables in a method? [/edit]

twose commented 4 years ago

@jcheron

you should put initPool in workerStart callback

and don't use \Swoole\Event::wait() anymore; it is not coroutine style

jcheron commented 4 years ago

Thanks @twose

This means that I don't have one pool but as many pools as workers, with the pools stored, for example, in an array indexed by worker_id.

In that case, to take a connection from the pool, I need to know the active worker_id. Is there a global method to get the active worker_id in the request event?

twose commented 4 years ago

Every worker has its own client connection pool because the resources of different processes are isolated

But whatever the worker num is, your code is written in the same way
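To illustrate the point above, here is a minimal sketch of a per-worker pool, not the Ubiquity implementation. It assumes the default enable_coroutine setting (so the workerStart callback already runs inside a coroutine) and the Swoole 4.4-era Swoole\Coroutine\MySQL client discussed in this thread. The function name buildServer, the port, and the test query are hypothetical.

```php
<?php
declare(strict_types=1);

use Swoole\Coroutine\Channel;
use Swoole\Coroutine\MySQL;
use Swoole\Http\Request;
use Swoole\Http\Response;
use Swoole\Http\Server;

/** Build (but do not start) a server whose workers each own a pool. */
function buildServer(array $dbConfig, int $poolSize): Server
{
    $server = new Server('0.0.0.0', 8081);
    $pool = null; // every worker process ends up with its own copy

    $server->on('workerStart', function () use (&$pool, $dbConfig, $poolSize) {
        // Runs once in each worker, so no worker_id lookup is needed later.
        $pool = new Channel($poolSize);
        for ($i = 0; $i < $poolSize; $i++) {
            $db = new MySQL();
            if (!$db->connect($dbConfig)) {
                throw new RuntimeException('failed to connect: ' . $db->connect_error);
            }
            $pool->push($db);
        }
    });

    $server->on('request', function (Request $req, Response $res) use (&$pool) {
        $db = $pool->pop(); // yields this coroutine while the pool is empty
        try {
            $rows = $db->query('SELECT 1 AS ok');
        } finally {
            $pool->push($db); // always hand the connection back
        }
        $res->header('Content-Type', 'application/json');
        $res->end(json_encode($rows));
    });

    return $server;
}
```

Calling buildServer([...], 16)->start() would launch it. The request callback never mentions a worker ID: it only sees the $pool that its own process created.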

jcheron commented 4 years ago

Long live isolation! I was scared, but it works, thank you!

So for a planned concurrency level of 512 with x workers, I have to create pools of size 512/x, right?

That gives the following at server start:

$swooleServer->on('workerStart',function($srv) use($config){
    \Ubiquity\orm\DAO::initPooling($config,'swoole',\intdiv(512,$srv->setting['worker_num']));
});
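One detail of the intdiv split: when the worker count does not divide 512 evenly, the integer division drops the remainder. With 12 workers, for example, intdiv(512, 12) is 42, so the pools hold 504 connections in total. The sketch below spreads the remainder over the first workers. The function name poolSizeForWorker is hypothetical, and it assumes worker IDs run from 0 to worker_num - 1.

```php
<?php
declare(strict_types=1);

/**
 * Split a total connection budget across workers.
 * Worker IDs are assumed to run from 0 to $workerNum - 1.
 */
function poolSizeForWorker(int $total, int $workerNum, int $workerId): int
{
    $base = intdiv($total, $workerNum);
    $remainder = $total % $workerNum;
    // The first $remainder workers take one extra connection each.
    return $base + ($workerId < $remainder ? 1 : 0);
}

$sizes = [];
for ($id = 0; $id < 12; $id++) {
    $sizes[] = poolSizeForWorker(512, 12, $id);
}
echo implode(',', $sizes), PHP_EOL; // 43,43,43,43,43,43,43,43,42,42,42,42
echo array_sum($sizes), PHP_EOL;    // 512
```

Whether the few extra connections matter in practice depends on the benchmark. The plain intdiv version is simpler and may be good enough.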

The modified pool:

abstract class AbstractConnectionPool{

    protected $pool;

    abstract protected function createDbInstance();

    abstract protected function setDbParams(&$dbConfig,$offset=null);

    public function __construct(&$config, $offset=null,int $capacity=16){
        $this->pool = new Channel($capacity);
        $this->setDbParams($config,$offset);
        while($capacity>0){
            $db=$this->createDbInstance();
            if($db!==false){
                $this->pool->push($db);
                $capacity--;
            }else{
                throw new \RuntimeException('failed to connect to DB server.');
            }
        }
    }

    public function put($db){
        $this->pool->push($db);
    }
    public function get(){
        return $this->pool->pop();
    }
}
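With a get()/put() pool like this one, a connection that is taken but never returned shrinks the pool for good. A small helper can guarantee the return. This is a sketch, not part of Ubiquity: withConnection is a hypothetical name, and $pool is any object that exposes get() and put($db).

```php
<?php
declare(strict_types=1);

/**
 * Borrow a connection, run $work with it, and always give it back.
 * $pool is any object exposing get() and put($db).
 */
function withConnection(object $pool, callable $work)
{
    $db = $pool->get();
    try {
        return $work($db);
    } finally {
        // Runs on normal return and on exceptions alike.
        $pool->put($db);
    }
}
```

A call would look like withConnection($pool, function ($db) { /* run queries with $db */ });. A production pool would probably also discard connections that are known to be broken instead of putting them back.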

For the moment, whatever the test (query, Fortunes, updates...), I see no significant difference from the previous version (ab -c 512 -t 3 http::///...). But running on the TechEmpower servers can sometimes be surprising.

I'm going to switch to PDO (and the use of go).

jcheron commented 4 years ago

I think I'm doing something wrong: judging by the results, PDO + go doesn't seem to run asynchronously.

I start the database in workerStart:

$swooleServer->on('workerStart', function ($serv, $worker_id) use ($config) {
    \Ubiquity\orm\DAO::startDatabase($config, 'default');
});

Implementation of the Multiple queries test:

public function query($queries = 1)
{
    \Swoole\Runtime::enableCoroutine();
    $queries = \min(\max($queries, 1), 500);
    $worlds = new Channel($queries);
    for ($i = 0; $i < $queries; ++ $i) {
        go(function () use ($worlds) {
            $worlds->push((DAO::getById(World::class, \mt_rand(1, 10000), false))->_rest);
        });
    }
    $r = [];
    for ($i = 0; $i < $queries; $i ++) {
        $r[] = $worlds->pop();
    }
    echo \json_encode($r);
}

Results

12 workers on i7-8750H CPU @ 2.20GHz ab -c 512 -t 3

Multiple queries x20

PDO without Coroutine

PDO + go

Data updates x20

The results fluctuate between 200 and 1000, with no significant difference between async and non-async.

PDO without Coroutine

PDO + go

twose commented 4 years ago

Call \Swoole\Runtime::enableCoroutine() before you create the PDO instance.

Why not try a simple example from the README?

you should do it step by step

jcheron commented 4 years ago

Sorry @twose, I have already tried the simple examples from the README. But these are simple examples in a standalone script.

Would you show me, through an example, how to use a connection to a database to make asynchronous queries in the request event of an http server, knowing that the initialization of the connection must be global, and not created on each request?

[edit] I'll try to do it with a small example, without Ubiquity. [/edit]

jcheron commented 4 years ago

This is what I get for the simple case (without Framework):

use Swoole\Http\Request;
use Swoole\Http\Response;

$server = new swoole_http_server('0.0.0.0', 8081, SWOOLE_BASE);
$server->set([
    'worker_num' => swoole_cpu_num()
]);

$database = new Database();

$server->on('workerStart', function () use ($database) {
    $database->connect();
});

$server->on('request', function (Request $req, Response $res) use ($database) {
    \Swoole\Runtime::enableCoroutine();
    $queries =20;
    $channel = new \Swoole\Coroutine\Channel($queries);
    go(function () use ($channel, $queries, $database) {
        $db = $database->get();
        for ($i = 0; $i < $queries; $i ++) {
            $db->st = $db->st ?? $db->prepare('SELECT * FROM World WHERE id = ?');
            $db->st->execute([
                \mt_rand(1, 10000)
            ]);
            $channel->push($db->st->fetch(PDO::FETCH_ASSOC));
        }
    });
    $arr = [];
    for ($i = 0; $i < $queries; $i ++) {
        $arr[] = $channel->pop();
    }
    $res->end(\json_encode($arr));
});

class Database {

    private $db;

    public function connect(){
        $this->db = new \PDO('mysql:host=127.0.0.1;dbname=test', 'root', '', []);
    }

    public function get(){
        return $this->db;
    }
}

$server->start();

It's still not asynchronous.

ab results:

ab -c 512 -t 3

If I place the enableCoroutine in the onWorkerStart, I have a whole bunch of queries that return false. I also tried it with a pool:

$server->on('workerStart', function () use ($pool, $database) {
    \Swoole\Runtime::enableCoroutine();
    $pool->init(intdiv(512, swoole_cpu_num()));
});

Queries executed in groups of 10, in 2 go calls => Finished 15500 requests. Not very conclusive, even though it is a very simple case.

twose commented 4 years ago

https://github.com/swoole/swoole-src/releases/tag/v4.4.13RC2

Built-in Connection Pool has been released
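For readers landing here, the sketch below shows how the built-in pool might slot into the simple server from earlier in the thread. It assumes Swoole 4.4.13 or later, where Swoole\Database\PDOPool and Swoole\Database\PDOConfig are available, and it follows the advice above to enable the runtime hook before any PDO instance exists. The function name buildPdoServer, the port, and the credentials are placeholders.

```php
<?php
declare(strict_types=1);

use Swoole\Database\PDOConfig;
use Swoole\Database\PDOPool;
use Swoole\Http\Request;
use Swoole\Http\Response;
use Swoole\Http\Server;
use Swoole\Runtime;

/** Build (but do not start) a server that uses one PDOPool per worker. */
function buildPdoServer(int $poolSize): Server
{
    // Hook blocking I/O first, before any PDO object is created.
    Runtime::enableCoroutine();

    $server = new Server('0.0.0.0', 8081, SWOOLE_BASE);
    $server->set(['worker_num' => swoole_cpu_num()]);
    $pool = null; // one pool per worker process

    $server->on('workerStart', function () use (&$pool, $poolSize) {
        $config = (new PDOConfig())
            ->withHost('127.0.0.1')
            ->withPort(3306)
            ->withDbName('test')
            ->withCharset('utf8mb4')
            ->withUsername('root')
            ->withPassword('');
        $pool = new PDOPool($config, $poolSize);
    });

    $server->on('request', function (Request $req, Response $res) use (&$pool) {
        $pdo = $pool->get(); // each request coroutine borrows its own connection
        try {
            $st = $pdo->prepare('SELECT * FROM World WHERE id = ?');
            $st->execute([mt_rand(1, 10000)]);
            $row = $st->fetch(PDO::FETCH_ASSOC);
        } finally {
            $pool->put($pdo);
        }
        $res->end(json_encode($row));
    });

    return $server;
}
```

The difference from the earlier single-connection example is that concurrent request coroutines no longer share one PDO handle, which is a plausible cause of the queries that returned false.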