Performance degradation on production

4n70w4 commented 5 years ago

PHP 7.3 / Symfony 3.4

Hi! On my local dev server I get a 2x, 10x, even 20x speedup in request time at a concurrency of 1.

But on production, with 50 concurrent requests, FPM handles 180 requests per second and RR only 165 rps. If I add gc_collect_cycles() to each cycle, RR performance drops to 140 rps...

This is strange. Any ideas why it would be slower than FPM?

ab -n 10000 -k -c 50 ...

Enabling or disabling keep-alive, and connecting directly to the RoadRunner port versus going through nginx proxy_pass, does not affect performance.

config:

http:
  address: 0.0.0.0:8080
  workers:
    command: "php /var/www/app/current/worker.php"
  pool:
    numWorkers: 8
    maxJobs: 0

bin/rr serve -c roadrunner.prod.yaml

With numWorkers: 50 and maxJobs: 200 the performance is the same.

Processor: AMD Ryzen Pro 1700X (8 x 3.4 GHz), RAM: 32 GB DDR4

4n70w4 commented 5 years ago

This seems to be due to "slow" connections to the database. But why is the performance with RR lower, and not the same as that of FPM?

wolfy-j commented 5 years ago

Hi, gc_collect_cycles is a pretty expensive operation, FYI; you don’t need it in every script. The performance is most likely lower because your script has a lot of blocking code and the concurrency level is higher than the number of available workers. Check your CPU load. Since your bottleneck is connections to the DB and not CPU, try bumping the number of workers by 2-5x to avoid building up a request pipeline.
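
As a rough illustration of that suggestion (a sketch, not a tested recommendation), a larger pool might be configured as below. It assumes the RoadRunner 1.x layout, in which `pool` is nested under `workers`; the value 32 is only an example (the original 8 workers times 4, inside the 2-5x range):

```yaml
http:
  address: 0.0.0.0:8080
  workers:
    command: "php /var/www/app/current/worker.php"
    pool:
      # illustrative value: 8 workers x 4, within the suggested 2-5x range
      numWorkers: 32
      # 0 means a worker is never recycled after a fixed number of jobs
      maxJobs: 0
```

With blocking I/O each worker serves one request at a time, so any concurrency above numWorkers queues up in front of the pool; that is the pipeline the larger pool is meant to shorten.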

wolfy-j commented 5 years ago

I will correct the Wiki to accommodate your use case.

wolfy-j commented 5 years ago

It will also be solved automatically once #97 is implemented.

wolfy-j commented 5 years ago

Oh, I see you mention numWorkers: 50

What integration script do you use? Do you see high timings in the rr debug log, or is the server itself slow?

wolfy-j commented 5 years ago

Can you try running numWorkers: 50 without maxJobs? Also try another benchmark tool such as siege; it will give a better picture of the average/median request time.

wolfy-j commented 5 years ago

Closed due to no activity.

4n70w4 commented 5 years ago

@wolfy-j do you expect activity from me? I suspended the research because roadrunner is worse than fpm + nginx.

wolfy-j commented 5 years ago

I have requested clarification from you. The issue most likely is application specific. Unfortunately, I'm unable to debug it without a reliable way to reproduce it.

4n70w4 commented 5 years ago

@wolfy-j you're right. But I cannot show a commercial application having a problem. Making a special fake application to demonstrate a problem is too much work. The employer will not pay for it.

Perhaps it is a problem with some PHP module or open-source library. For now this is all the data I can provide.

yum install -y ImageMagick bc php php-fpm php-opcache php-apcu php-xml php-mbstring php-pdo php-pdo_mysql php-zip php-bcmath php-gd php-posix php-intl php-imap php-event php-pecl-redis4-4.1.1-1.el7.remi.7.2 php-gender php-xdiff php-imagick

and newrelic agent

    "require": {
        "ext-redis": "*",
        "ext-apcu": "*",
        "ext-imap": "*",
        "ext-imagick": "*",
        "campo/random-user-agent": "^1.2",
        "doctrine/doctrine-bundle": "^1.6",
        "doctrine/orm": "^2.5",
        "guzzlehttp/guzzle": "^6.3",
        "incenteev/composer-parameter-handler": "^2.0",
        "patrickschur/language-detection": "^3.2",
        "sensio/distribution-bundle": "^5.0.19",
        "swagger/server-bundle": "dev-master",
        "symfony/monolog-bundle": "^3.1.0",
        "symfony/polyfill-apcu": "^1.0",
        "jms/serializer": "^1",
        "symfony/swiftmailer-bundle": "^2.6.4",
        "symfony/symfony": "3.4.*",
        "twig/twig": "^1.0||^2.0",
        "hashids/hashids": "^3.0",
        "phpmailer/phpmailer": "^6.0",
        "indigophp/doctrine-annotation-autoload": "^0.1.0",
        "intervention/image": "^2.4",
        "namshi/cuzzle": "^2.0",
        "nikic/php-parser": "4.1.1",
        "roave/security-advisories": "dev-master",
        "4n70w4/fmwconcepts-imagemagicktools": "dev-master",
        "mikehaertl/php-shellcommand": "^1.4",
        "jms/cg": "dev-master",
        "jms/aop-bundle": "dev-master",
        "jms/di-extra-bundle": "^1.9"
    },

wolfy-j commented 5 years ago

Ok, thank you for the context. I'll check if there is anything that might be a red flag.

OO00O0O commented 5 years ago

Yeah I wonder about redis.

wolfy-j commented 5 years ago

Well, if it's slower, something is taking too long to be destructed/disconnected. Nginx ignores destruction time, RR does not. The question is why it's OK on the dev machine. Could it be Doctrine?

4n70w4 commented 5 years ago

The project https://github.com/mrsuh/php-load-test might help simplify comprehensive performance testing.

4n70w4 commented 5 years ago

Based on the results of those tests, if $kernel->reboot(null) is included in the worker, RoadRunner becomes slower than nginx + php-fpm.

But I could not run these tests on my servers.

You also do kernel reboot in the example: https://github.com/spiral/roadrunner/wiki/Symfony-Framework

wolfy-j commented 5 years ago

Reboot is required for Symfony as it does not fully support long-running mode (at least not for every application). But by our logic a reboot should not be slower than full initialization. It’s almost as if the reboot is too expensive. But why?

We run all our applications without reboot, but we had to create our own framework for that.

zmitic commented 5 years ago

@wolfy-j Actually, this is not correct anymore. Since last year, Symfony has supported long-running mode without memory leaks via the kernel.reset tag. As for 3rd-party bundles, it's hard to say.
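
For reference, a minimal sketch of how a service might opt in to that reset mechanism. The class name `App\Cache\RequestScopedCache` and its `reset()` method are hypothetical; the `kernel.reset` tag with a `method` attribute is, as far as I know, the convention from Symfony 3.4 onward:

```yaml
services:
    # hypothetical service that keeps per-request state in memory
    App\Cache\RequestScopedCache:
        tags:
            # "method" names the public method to call when services are reset
            - { name: kernel.reset, method: reset }
```

Services tagged this way get their state cleared so that it does not leak from one request to the next; bundles that do not tag their stateful services would still need a reboot or a manual reset.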

But for some time, and for still unknown reasons, I have been getting an insane amount of memory used on a blank project. When I played with swoole recently, memory leaks happened during conversion to a Symfony request: https://github.com/k911/swoole-bundle/issues/30

If you're interested, tomorrow I can start an RR test on a blank Symfony project and report the details. It is the same problem; memory went wild during conversion. The code I used for the RR test:

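// Assumed imports (not in the original snippet): the RoadRunner 1.x PHP client
// and symfony/psr-http-message-bridge; App\Kernel is the default skeleton kernel.
use App\Kernel;
use Spiral\Goridge\StreamRelay;
use Spiral\RoadRunner\PSR7Client;
use Spiral\RoadRunner\Worker;
use Symfony\Bridge\PsrHttpMessage\Factory\DiactorosFactory;
use Symfony\Bridge\PsrHttpMessage\Factory\HttpFoundationFactory;

// $env and $debug are defined earlier in the worker script (not shown).
$kernel = new Kernel($env, $debug);
$relay = new StreamRelay(STDIN, STDOUT);
$psr7 = new PSR7Client(new Worker($relay));
$httpFoundationFactory = new HttpFoundationFactory();
$diactorosFactory = new DiactorosFactory();
$kernel->boot(); // boot once, outside the request loop
while ($req = $psr7->acceptRequest()) {
    try {
        // PSR-7 request -> HttpFoundation request
        $request = $httpFoundationFactory->createRequest($req);
        $response = $kernel->handle($request);

        // HttpFoundation response -> PSR-7 response
        $psr7->respond($diactorosFactory->createResponse($response));
        $kernel->terminate($request, $response);
        // $kernel->reboot(null); // disabled in the original snippet
    } catch (\Throwable $e) {
        $psr7->getWorker()->error((string)$e);
    }
}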

wolfy-j commented 5 years ago

I’m very curious to see what you are going to find.

roadrunner-server / roadrunner

Performance degradation on production #132