tarantool-php / client

PHP client for Tarantool.
MIT License
67 stars 22 forks source link

Tarantool-php client performances vs pdo/mysql #59

Closed jcheron closed 5 years ago

jcheron commented 5 years ago

Ubiquity-tarantool is almost operational, but there are certainly some adjustments to be made.

I performed some benchmarks using Techempower's multiple database queries test on my own servers, to appreciate the difference with pdo/mysql.

I performed benchmarks without Ubiquity, to appreciate the difference in reading between Tarantool and pdo/mysql.

The results are as follows:

Reading of 20 rows on a table of 10,000 rows (php is pdo/mysql):

image

The difference is even greater than with Ubiquity, which is partly explained by:

image

Tarantool

Server configuration

#!/usr/bin/env tarantool
-- Configure database
box.cfg {
   listen = 3302,
   background = true,
   log = '2.log',
   pid_file = '2.pid'
}
box.once("bootstrap", function()
   space = box.schema.space.create('world', {engine='vinyl'})
   space:format({
   {name = 'id', type = 'unsigned'},
   {name = 'randomNumber', type = 'string'}
   })
   box.schema.sequence.create('S',{min=1, start=1})
   space:create_index('primary', {
   type = 'tree',
   parts = {'id'},
   sequence= 'S'
   })
   box.schema.user.grant('guest', 'read,write,execute', 'space', 'world')
end)

Page source code:

\header('Content-type: application/json');
require_once 'vendor/autoload.php';

// Database connection
$client = \Tarantool\Client\Client::fromDsn ( 'tcp://127.0.0.1:3302' );

// Read number of queries to run from URL parameter
$query_count = 1;
if ($_GET['queries'] > 1) {
  $query_count = $_GET['queries'] > 500 ? 500 : $_GET['queries'];
}

// Create an array with the response string.
$arr = [];

// For each query, store the result set values in the response array
while (0 < $query_count--) {
  $query=$client->executeQuery('SELECT "id","randomNumber" FROM "world" WHERE "id" = ? limit 1', \mt_rand(1, 10000));
  // Store result in array.
  $arr[] = $query->getFirst();
}

// Use the PHP standard JSON encoder.
// http://www.php.net/manual/en/function.json-encode.php
echo \json_encode($arr);

pdo/mysql

Page source code:

\header('Content-type: application/json');

// Database connection
// http://www.php.net/manual/en/ref.pdo-mysql.php
$client = new \PDO('mysql:host=127.0.0.1;dbname=hello_world', 'userxxx', 'passwordxxx', [
    \PDO::ATTR_PERSISTENT => true,
    \PDO::ATTR_EMULATE_PREPARES => false
]);

// Read number of queries to run from URL parameter
$query_count = 1;
if ($_GET['queries'] > 1) {
  $query_count = $_GET['queries'] > 500 ? 500 : $_GET['queries'];
}

// Create an array with the response string.
$arr = [];

// Define query
$statement = $client->prepare('SELECT id,randomNumber FROM World WHERE id = ?');

// For each query, store the result set values in the response array
while (0 < $query_count--) {
  $statement->execute( [mt_rand(1, 10000)] );

  // Store result in array.
  $arr[] = $statement->fetch(PDO::FETCH_ASSOC);
}

// Use the PHP standard JSON encoder.
// http://www.php.net/manual/en/function.json-encode.php
echo \json_encode($arr);

Configuration for tests

rybakit commented 5 years ago

the absence of persistent connection, and prepared statements

Upcoming client release will have support for persistent connections, you can try it by installing the latest dev version: composer install tarantool/client:@dev

the Loading of 22 Tarantool-php/client files vs 1 for pdo

Doesn't matter if you use Opcache. Make sure you have it enabled and properly configured.

Apart from that I would recommend to use memtx instead of vynil (both engines are persistent, but memtx is faster).

Also note, that Tarantool's SQL implementation is still under active development, there are a lot of optimizations planned. In that regard I would suggest running your benchmark using the binary protocol and compare results with the SQL ones (in my tests SQL ~2 times slower than the binary equivalent).

Another option to inmprove the performance is to make calls asynchrouniusly with ext-async or ext-swoole (according to my benchmars, you can gain 3x boost in performance).

I will publish by benchmark results soon in this repo.

jcheron commented 5 years ago

Thank you for all these answers @rybakit

On this test and with my configuration:

But I can't use it with Ubiquity, since ORM is based on the relational model and SQL (the objective was to be able to switch from pdo/* to Tarantool in a transparent way, without the developer having to change his code).

Rq: I don't quite agree on the effects of the number of files included. OPCache reduces the effect of multiple inclusions, but does not completely annihilate it (I did a special php-22 test where I included with pdo the 21 additional files required by Tarantool). OPCache is well activated and configured on my test server.

Results

image

Ubiquity-tarantool results

So I did the benchmarks again in reading for Ubiquity, with memtx and persistent connection.

It's better:-) in particularly for throughput (+11%), more variable for execution time

image

I'll try with Swoole to see how it goes.

jcheron commented 5 years ago

Hi @rybakit Here are the results of the tests with Swoole: I always use the same tests (Techempower multiple writing and reading), however, the test server has changed, so do not compare these results with the previous ones, in absolute value.

All tests are performed with 20 consecutive queries.

In writing:

Tarantool was already very fast, it is even faster

image

In reading:

I made several tentatives with Tarantool, connection pool, small groups of asynchronous queries in a coroutine... These tests did not yield anything, I think that on this kind of tests (repeated single queries), you can't gain the 3X in performance..

So I came back to the simplest solution, Swoole + coroutine activation: which gives quite acceptable results with Tarantool, which is as fast as Swoole Coroutine Mysql. Don't forget that I only use SQL (and not the binary protocol).

image

Are the results of your benchmarks in this direction?

jcheron commented 5 years ago

I was a little pessimistic about the reading performance: With a connection pool, and execution of queries in 5 groups of 4, the results are much better. [edit] With a properly adjusted connection pool, I now have fully satisfactory results, and we have more than 3X on the execution time (2.36 -> 0.73 ms), you were right 👍 [/edit]

Reading bench (20 queries)

image

rybakit commented 5 years ago

@jcheron I have released a new version with minor optimizations and published benchmark results (up to 5x boost in those synthetic benchmarks).

Also, please note that unlike MySQL, Tarantool is single-threaded and to leverage more CPU cores it's recommended to run multiple Tarantool instances on the same server.

And the good news about the SQL performance is that the upcoming Tarantool version will have support for prepared statements.

jcheron commented 5 years ago

That's good news, thank you ! I'm going to look at this closely.