Very low performance without worker mode in docker

What happened?

I did a few performance comparisons in requests per second with a basic Laravel skeleton. I tested Nginx + PHP-FPM, FrankenPHP in worker mode (Octane), Swoole (Octane), and FrankenPHP without worker mode.

FrankenPHP worker mode is 5,37 % faster than Nginx + PHP-FPM (15 425,75 vs 14 639,20 RPS), so very similar, while its performance for static files is 69,57 % lower.

FrankenPHP without worker mode is very slow, at just ~2 % of the RPS I get with Nginx + PHP-FPM (281,77 vs 14 639,20). Despite the low performance, it still uses many cores, so it is spending time doing something.

I would expect Laravel Octane with FrankenPHP to be closer to Swoole, and I would expect the non worker docker run to be much faster.

Additionally I tried embedding the Laravel app following the steps in the docs. Performance was slightly better but still terrible, and when the benchmark was done it crashed with a segfault.


| Stack | RPS | Percentage Difference |
| --- | --- | --- |
| Nginx + PHP-FPM | 14 639,20 | -39,93 % |
| FrankenPHP Worker | 15 425,75 | -36,71 % |
| Swoole | 24 372,82 | 0,00 % (Baseline) |
| FrankenPHP, non-worker | 281,77 | -98,84 % |

| Stack | Static RPS | Percentage Difference |
| --- | --- | --- |
| Nginx + PHP-FPM | 403 230 | 0,00 % (Baseline) |
| FrankenPHP Worker | 122 699 | -69,57 % |
| Swoole | 173 383 | -57,00 % |
| FrankenPHP, non-worker | 34 173 | -91,52 % |
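
As a quick sanity check, the percentage columns can be recomputed from the raw RPS figures. This is only a sketch of the arithmetic: the function and variable names are made up for illustration, and rounding may differ from the reported values by 0,01.

```python
# Relative difference of each stack against a baseline, using the RPS
# figures reported in this issue.
def pct_diff(value: float, baseline: float) -> float:
    """Percentage difference of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


php_rps = {
    "Nginx + PHP-FPM": 14639.20,
    "FrankenPHP Worker": 15425.75,
    "Swoole": 24372.82,
    "FrankenPHP, non-worker": 281.77,
}
static_rps = {
    "Nginx + PHP-FPM": 403230,
    "FrankenPHP Worker": 122699,
    "Swoole": 173383,
    "FrankenPHP, non-worker": 34173,
}

# PHP requests are compared against Swoole, static files against Nginx.
php_table = {k: pct_diff(v, php_rps["Swoole"]) for k, v in php_rps.items()}
static_table = {
    k: pct_diff(v, static_rps["Nginx + PHP-FPM"]) for k, v in static_rps.items()
}

for name in php_rps:
    print(f"{name:24} {php_table[name]:8.2f} %  {static_table[name]:8.2f} %")
```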


The non-worker mode test of FrankenPHP used docker with host networking to reduce overhead, using the same project as Nginx + PHP-FPM: docker run --net=host -v $PWD:/app dunglas/frankenphp

FrankenPHP does redirect to HTTPS, but the tests are done with Keep-Alive, so it should perform much better despite it being HTTPS.

Command for PHP benchmarking:
wrk -c 100 -t 1 -d 10s http://localhost/api/hi

Command for static file serving:
wrk -c 100 -t 8 -d 10s http://localhost/ok.txt

Nginx + PHP-FPM: (static 403 230 RPS)
Running 10s test @ http://localhost/api/hi
  1 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     6.77ms    1.04ms  21.07ms   73.84%
    Req/Sec    14.72k   103.88    14.99k    72.00%
  146637 requests in 10.02s, 35.94MB read
Requests/sec:  14639.20
Transfer/sec:      3.59MB

FrankenPHP Worker Mode: (static 122 699 RPS)
Running 10s test @ http://localhost/api/hi
  1 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     6.58ms    3.59ms  40.40ms   74.04%
    Req/Sec    15.50k   481.75    16.67k    77.00%
  154345 requests in 10.01s, 32.82MB read
Requests/sec:  15425.75
Transfer/sec:      3.28MB

Swoole: (static 173 383 RPS)
Running 10s test @ http://localhost/api/hi
  1 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.87ms    1.38ms  36.20ms   77.79%
    Req/Sec    24.51k   538.22    25.40k    88.00%
  244397 requests in 10.03s, 54.77MB read
Requests/sec:  24372.82
Transfer/sec:      5.46MB

FrankenPHP, non worker: (static 34 173 RPS)
Running 10s test @ https://localhost/api/hi
  1 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   347.06ms   32.71ms 398.46ms   94.90%
    Req/Sec   284.33     32.32   323.00     73.74%
  2823 requests in 10.02s, 702.99KB read
Requests/sec:    281.77
Transfer/sec:     70.17KB

System info

Ubuntu 24.04
Ryzen 9 3950X
Linux spark 6.8.0-31-generic #31-Ubuntu SMP PREEMPT_DYNAMIC Sat Apr 20 00:40:06 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Docker version 26.1.2, build 211e74b

How to reproduce FrankenPHP non-worker results

composer create-project laravel/laravel laravel-test
cd laravel-test/
php artisan install:api
echo -e "<?php\n\nuse Illuminate\Support\Facades\Route;\n\nRoute::get('/hi', function () {return 'Hi';});" > routes/api.php
php artisan optimize
docker run --net=host -v $PWD:/app dunglas/frankenphp

Build Type

Docker (Debian Bookworm)

Worker Mode


Operating System


CPU Architecture


dunglas commented 3 months ago

Your config doesn't seem to be optimal at all for benchmarking (Docker Volume, dev mode of Laravel, all features enabled while it's not the case for other tools).

That being said, we cannot review all benchmarks. We maintain one in this repository (https://github.com/dunglas/frankenphp/blob/main/testdata/load-test.js / https://github.com/dunglas/frankenphp/blob/main/testdata/benchmark.Caddyfile) and one for Symfony (https://github.com/dunglas/frankenphp-demo/tree/main/benchmark).

There is also the TechEmpower benchmark (https://github.com/TechEmpower/FrameworkBenchmarks/tree/d9bc7017125517e6ad95de2f5518e9e1361b8bb4/frameworks/PHP/laravel) that uses Laravel. Note that we didn't carefully review it and that it can probably be optimized (for instance, by disabling logs, which are known and expected to slow down benchmarks).

You may also be interested in these issues: https://github.com/dunglas/frankenphp/discussions/72

Feel free to reopen if you can provide a reproducer after having applied these optimizations.

withinboredom commented 3 months ago

Just to maybe point you in the right direction @alexskra, nginx spawns some 40 threads on a production machine just idling, while spawning much much more when under load. Further, most people set FPM to "ondemand" which will just keep spawning workers until whatever the max workers is.

I don't know your hardware, but maybe try setting the number of threads for FrankenPHP to an amount comparable to nginx/fpm. By default, FrankenPHP only spawns 2 × (number of cores) PHP workers. This is possibly much, much lower than the FPM/nginx configuration.

You probably need to tune GOMAXPROCS as well.
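
To get a feel for why the thread count matters, here is a rough capacity sketch. It assumes each request occupies exactly one PHP thread for its whole duration and that "number of cores" means logical CPUs (32 on a Ryzen 9 3950X). Both are simplifying assumptions, and the function names are hypothetical rather than anything FrankenPHP provides.

```python
def php_threads(logical_cpus: int, factor: int = 2) -> int:
    """Thread count under the '2 x cores' default mentioned in this thread."""
    return factor * logical_cpus


def max_rps(threads: int, busy_ms_per_request: float) -> float:
    """Upper bound on throughput when each request occupies one thread."""
    return threads * 1000.0 / busy_ms_per_request


def busy_ms_needed(threads: int, target_rps: float) -> float:
    """Longest time a request may hold a thread while still reaching target_rps."""
    return threads * 1000.0 / target_rps


# Assumption: the container sees 32 logical CPUs.
threads = php_threads(32)
print(threads, "threads")
print(f"{busy_ms_needed(threads, 14639.20):.2f} ms per request to match PHP-FPM")
print(f"{busy_ms_needed(threads, 281.77):.0f} ms per request at the measured non-worker rate")
```

Under these assumptions, a low thread count alone only caps throughput if each request also holds its thread for a long time, which is what a profile would have to confirm.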

95% of people won't need to tune this to handle this level of RPS, unless you are at Google level, in which case there are probably other concerns. The bottlenecks of most RPS-driven workloads are in the database, and not the webserver.

dunglas commented 3 months ago

@withinboredom it would be awesome if you could write a "performance tuning" doc at some point, you have accumulated an extraordinary amount of knowledge about this topic!

dunglas commented 3 months ago

Maybe we should try to improve the TechEmpower benchmark results too.

alexskra commented 3 months ago

@dunglas All tests are done with Laravel APP_ENV set to production, and APP_DEBUG set to false. I'm on Linux, so bind mounts should have no impact as far as I know. As for optimizations, all are out of the box setups, with the exception of PHP-FPM where the only change is the number of processes to utilize (~all cores). As for the enabled features, I do not know what you mean by that. Does FrankenPHP have some features enabled that should be disabled in prod? Then it should probably be added to the "deploy in production" part of the docs.

I also tried using the Dockerfile from the Deploy in production part of the docs, and it's still very slow.

FROM dunglas/frankenphp

# Be sure to replace "your-domain-name.example.com" by your domain name
#ENV SERVER_NAME=your-domain-name.example.com
# If you want to disable HTTPS, use this value instead:

# Enable PHP production settings
RUN mv "$PHP_INI_DIR/php.ini-production" "$PHP_INI_DIR/php.ini"

# Copy the PHP files of your project in the public directory
#COPY . /app/public
# If you use Symfony or Laravel, you need to copy the whole project instead:
COPY . /app

> 95% of people won't need to tune this to handle this level of RPS, unless you are at Google level, in which case there are probably other concerns. The bottlenecks of most RPS-driven workloads are in the database, and not the webserver.

@withinboredom It is common for the DB to be the bottleneck, yes, but that is not true in this case where it's less than 300 RPS while eating up nearly all cores. There must be some extreme inefficiencies involved as PHP-FPM manages ~52x the number of requests with a similar amount of CPU usage.

The Docker image doesn't crash or stop answering like the binary, but considering the extreme performance difference there might be something very wrong when using the Docker image too.

If someone could provide a basic non-worker setup for production (Docker or the binary), I'm happy to try it and report back on whether it changes anything. I love the idea of FrankenPHP, but it seems like non-worker mode is very far from production ready.

dunglas commented 3 months ago

> Does FrankenPHP have some features enabled that should be disabled in prod?

If it is exposed publicly on the internet, no. If it is behind a reverse proxy or an ingress, you likely want to disable HTTPS, compression, etc.

> There must be some extreme inefficiencies involved as PHP-FPM manages ~52x the number of requests with a similar amount of CPU usage.

The thing is that this can be caused by many many things (bad configuration, hardware, software, specific code path...). It will be very hard to diagnose without a reproducer or an environment that we can manipulate.

Our own (publicly available) benchmarks and most third-party benchmarks give totally different numbers, and it's almost impossible to tell why it behaves differently for you without more information.

withinboredom commented 3 months ago

Good idea!

The TechEmpower benchmarks might be a good place to start so we'd have a documentation of each effect.

withinboredom commented 3 months ago


> There must be some extreme inefficiencies involved as PHP-FPM manages ~52x the number of requests with a similar amount of CPU usage.

There's no inefficiencies, Caddy is "queueing" the requests until a php thread can process the request. If you don't have enough PHP threads then it will just sit there, mostly enqueueing requests. You can see this in the exceptionally large latency.
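
The link between latency and RPS in these wrk runs can be sketched with Little's law. wrk keeps a fixed number of connections busy, so throughput is roughly the number of connections divided by the mean latency. This is a simplified closed-loop model, not a description of Caddy's internals, and the names below are illustrative.

```python
# Closed-loop load model: wrk keeps `connections` requests in flight, so
# throughput is roughly connections / mean latency (Little's law).
def expected_rps(connections: int, mean_latency_ms: float) -> float:
    return connections * 1000.0 / mean_latency_ms


# (connections, mean latency in ms, measured RPS) from the wrk output in this issue
runs = {
    "Nginx + PHP-FPM": (100, 6.77, 14639.20),
    "FrankenPHP worker": (100, 6.58, 15425.75),
    "Swoole": (100, 3.87, 24372.82),
    "FrankenPHP non-worker": (100, 347.06, 281.77),
}

# Ratio of measured to predicted throughput for each run.
ratios = {
    name: measured / expected_rps(conns, latency)
    for name, (conns, latency, measured) in runs.items()
}

for name, ratio in ratios.items():
    print(f"{name:22} measured / predicted = {ratio:.2f}")
```

The model only says that high latency and low RPS are two views of the same number here. It does not say whether the time goes to waiting in a queue or to work on the CPU.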

alexskra commented 3 months ago

> Our own (publicly available) benchmarks and most third-party benchmarks give totally different numbers, and it's almost impossible to tell why it behaves differently for you without more information.

I'm sorry, but where are the public benchmarks that show better performance for non-worker mode FrankenPHP?

You were referring to TechEmpower benchmarks earlier, and they show the opposite of what you're saying. Take a look here.

Here is a summary for the linked TechEmpower results.


Swoole: 149 004 RPS
Nginx + PHP-FPM: 32 960 RPS
FrankenPHP: 3 814 RPS

Swoole is 39 times faster. Nginx + PHP-FPM is 9 times faster.

Huge difference.

Plain PHP

Swoole: 836 680 RPS
Nginx + PHP-FPM: 177 872 RPS
FrankenPHP: 365 RPS

Swoole is 2 292 times faster. Nginx + PHP-FPM is 487 times faster.

So, no, it's not just for me.

withinboredom commented 3 months ago

So, fun story (running the benchmarks right now), they're so low on TechEmpower because the benchmarks are actually silently failing due to port exhaustion.

It helps to set ulimit -n to a high number...
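
For reference, the per-process limit on open file descriptors (what `ulimit -n` reports in a shell) can be inspected from a script before a load test. Each open socket uses one descriptor. The sketch below only reads the limit, and the `fits` helper with its `reserve` margin is a hypothetical illustration, not part of any benchmark tooling.

```python
import resource  # Unix-only standard library module


def fits(connections: int, soft_limit: int, reserve: int = 64) -> bool:
    """True if `connections` sockets plus a small reserve fit under the soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return connections + reserve <= soft_limit


# Soft limit is what the process is held to; hard limit is the ceiling it may raise to.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")
print("512 connections fit:", fits(512, soft))
```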

alexskra commented 3 months ago

> > There must be some extreme inefficiencies involved as PHP-FPM manages ~52x the number of requests with a similar amount of CPU usage.
>
> There's no inefficiencies, Caddy is "queueing" the requests until a php thread can process the request. If you don't have enough PHP threads then it will just sit there, mostly enqueueing requests. You can see this in the exceptionally large latency.

That does not make sense to me. If it is just sitting there waiting for PHP threads, then it should not consume all of the CPU, right? What is it doing with all that CPU time if it's just waiting? Queuing up to 100 requests isn't exactly a heavy workload at such a low RPS. It has to queue fewer than 300 requests per second.

If I use Nginx + a single PHP-FPM process, then the CPU load will stay low as it's not really doing anything. It's just waiting for PHP-FPM to finish the request, and that process takes one CPU thread. Latency will be high, but CPU usage will be very low. Even with a single PHP-FPM process it's ~3 times faster than what I manage with FrankenPHP.

dunglas commented 3 months ago

@alexskra this page is too old, it was slow because logs were enabled and verbose, fixed in this PR: https://github.com/TechEmpower/FrameworkBenchmarks/pull/9047

Here is a newer result (but again, the benchmark hasn't been audited by us): https://www.techempower.com/benchmarks/#section=test&runid=d3364379-1bf7-465f-bcb1-e9c65b4840f9&hw=ph&test=fortune&f=zik0zj-zik0zj-zik0zj-zik0zj-zik0zj-zik0zj-zik073-zik0zj-zik0zj-zik0zj-zik0zj-zik0zj-zik0zj-zik0zj-b8jj

withinboredom commented 3 months ago

> That does not make sense to me. If it is just sitting there waiting for PHP threads then it should not consume all of the CPU, right?

You can enable profiling and I, for one, would be interested to see what comes out of it.

withinboredom commented 3 months ago

So, there does seem to be something up with non-worker mode. With worker mode, using this worker on the TechEmpower benchmarks:


$fn = static function(): void {
    //error_log(__DIR__ . $_SERVER['REQUEST_URI'] ?? "no file");

    // Snapshot the globals so they can be restored after the included script runs.
    $globals = $GLOBALS;

    if(file_exists(__DIR__ . $_SERVER['REQUEST_URI'])) {
        include __DIR__ . $_SERVER['REQUEST_URI'];

        // Restore the values the globals had before the request.
        foreach($globals as $name => $value) {
            $GLOBALS[$name] = $value;
        }

        $diff = array_diff(array_keys($globals), array_keys($GLOBALS));
        foreach($diff as $name) {
            // body elided in the source
        }
    }
};

error_log("Starting worker!");

while (frankenphp_handle_request($fn)) {}


I see ~16k RPS, which is in line with caddy + fpm (almost exactly, in fact). With worker mode disabled, I see ~2k RPS. Compared to nginx + fpm (40k RPS), it is still far behind. The fact that worker mode is in line with caddy + fpm leads me to believe the bottleneck there is Caddy, not FrankenPHP.

That being said, there are probably some improvements to be made in non-worker mode.

withinboredom commented 3 months ago

Oh man. These TechEmpower benchmarks are shit. They reuse connections instead of opening a new one per request, so really this is benchmarking how fast a server can turn around a socket. Naturally, nginx is going to be really good at this; however, these are useless benchmarks.

This is like measuring how hard 512 users can slam your server, which may be useful if there is a load balancer in front of your server. In other words, for things like Caddy it is absolutely useless, because Caddy is designed to run on the edge, where you want to know how many users you can handle slamming your server, not how hard 512 users can hammer it.

So ... if we don't reuse connections ... the numbers look quite a bit different:

worker mode (tuned to machine): 13,308/s (79% CPU usage)
worker mode (512 workers like fpm): 13,008/s (80% CPU usage)
CGI mode (64 threads): 1,881/s (40% CPU usage)
caddy + fpm: 11,548/s (84% CPU usage)
nginx + fpm: 14,541/s (82% CPU usage)

As you can see, in more "real-world" scenarios, there is almost no difference between any of them; or rather the difference is unlikely to be felt in any serious application.


"tuned to machine": num_workers=4x CPU cores, GOGC=200, GOMAXPROCS=512

edit: clarity

joanhey commented 2 months ago

The first problem that I see is that @alexskra's benchmark doesn't use OPcache. This is a small problem in worker mode (Swoole, FrankenPHP in worker mode, Workerman, ngx-php, ...), but it carries a very big penalty in all non-worker modes (FrankenPHP CGI, FPM, ...).

joanhey commented 2 months ago

@withinboredom when you say "reuse connections" I think that you are talking about TCP connections.

In the "real-world" we use HTTP 1.1, 2 or 3, and by default all of them use persistent connections. Only HTTP 1.0 connections are closed after each request by default, and even there "keep-alive" was added in late 1995 (29 years ago) to avoid closing the connection.

If you want to test with HTTP 1.0 without keep-alive, you can use the old ab from Apache to benchmark. But you will be testing TCP performance, not the frameworks.
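
The difference between reusing and not reusing client connections can be made concrete with a tiny local experiment. This is a minimal sketch using only the Python standard library, with a toy server standing in for the real stacks. `CountingServer` and `count_connections` are names invented for the illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allows persistent (keep-alive) connections

    def do_GET(self):
        body = b"Hi"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass


class CountingServer(ThreadingHTTPServer):
    """HTTP server that counts how many TCP connections it accepts."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.accepted = 0

    def get_request(self):
        conn = super().get_request()
        self.accepted += 1
        return conn


def count_connections(requests: int, reuse: bool) -> int:
    """Send `requests` GETs and return how many TCP connections the server saw."""
    server = CountingServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        conn = http.client.HTTPConnection(host, port)
        for _ in range(requests):
            conn.request("GET", "/api/hi")
            conn.getresponse().read()
            if not reuse:
                conn.close()  # the next request() reconnects automatically
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return server.accepted


if __name__ == "__main__":
    print("keep-alive:", count_connections(20, reuse=True), "connection(s)")
    print("no reuse:  ", count_connections(20, reuse=False), "connection(s)")
```

With reuse the server accepts a single connection for all requests. Without it, every request pays for a new TCP handshake, which is the cost the two benchmarking styles in this thread weigh differently.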

The TechEmpower benchmark also uses persistent connections to the databases in the normal php-fpm setup. When I changed it, we also had some reactions against it, but it's completely correct and the numbers reflect it. And the community understands it. Even Doctrine changed its code to permit persistent connections in the new versions. We can't have good database performance if we initiate a new connection to the database for each request.

Please remember:

a benchmark is not a competition, it is a very good tool that helps us fix our code for better performance.

withinboredom commented 2 months ago

@joanhey I'm not talking about database connections, but TCP/HTTP connections from the client. If you have 100 users, those 100 users do not share a TCP anything. Your server needs to handle 100 separate connections and scale to the physical demands of the connection, network card, and kernel, such that if you have 10,000 users, your server can handle at least 10,000 connections.

If you are behind a load balancer, it can reuse connections (in a pool) such that 1,000 users results in less than 1,000 connections.

When benchmark testing, I typically consider how it is designed to be used. If it will be behind a load balancer, I will reuse connections because that is more representative of its real life access patterns. If it will be on the edge, I am more curious of how it performs not reusing connections. Especially if I am the one paying for the infra.

In any case, it doesn't matter here. There is a performance issue and both the benchmarks illustrate that fact.

dunglas commented 1 month ago

@alexskra thanks for the report! The latest version should improve the situation.