dunglas / frankenphp

🧟 The modern PHP app server
https://frankenphp.dev
MIT License

Document and perform benchmarks #481

Open withinboredom opened 10 months ago

withinboredom commented 10 months ago

This is an issue to perform some "scientific" benchmarks, leveraging 20 years of finagling this stuff. "Scientific" is in quotes because, while I intend this to be a scientific endeavor, there are caveats; read on.

Goals:

  1. Fully tuned php.ini
  2. Fully tuned nginx + fpm
  3. Fully tuned apache
  4. Discover caddy + FrankenPHP's best tuning
    • What Go environment variables make a difference, and how to set them (see the sketch after this list)
    • Compilation options?
    • worker mode vs. cgi mode vs. static binary
  5. Documentation to perform the benchmarks
    • kernel settings/parameters
    • device characteristics (arm/x86/etc, cores, etc)
    • use a rented device that is cheap and always available -- anyone should be able to reproduce the results -- but that also represents realistic production machines.
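
For goal 4, the Go runtime knobs most likely to matter are GOGC, GOMAXPROCS and GOMEMLIMIT. Here is a minimal sketch of how they could be set when launching FrankenPHP; the values are illustrative guesses to sweep, not recommendations:

# Go runtime variables that may be worth sweeping (values below are guesses, not measurements)
#   GOMAXPROCS - number of OS threads running Go code (defaults to the CPU count)
#   GOGC       - GC target percentage; higher values trade memory for fewer GC cycles
#   GOMEMLIMIT - soft memory limit for the Go runtime (Go >= 1.19)
GOMAXPROCS=$(nproc) GOGC=400 GOMEMLIMIT=28GiB frankenphp run -c /etc/caddy/Caddyfile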

Non-goals:

  1. Generic benchmarking suite (I am not building anything)
  2. Framework benchmarks (Not comparing frameworks)

Known Caveats:

  1. Extensions and code will make a difference, but if the documentation is good enough, people can perform their benchmarks with their desired configuration.
  2. The PHP code under test must be chosen carefully to illustrate typical PHP characteristics from the perspective of a SAPI (setting headers, outputting data, JIT'able code, etc.)
  3. From casual testing, we're much more likely to saturate network links long before CPU with FrankenPHP, so we either need high-performing links or underpowered machines. This needs further investigation.

Considerations:

Related:

#444

withinboredom commented 10 months ago

Reporting progress so far:

Here's the terraform to create the testing infrastructure:

terraform {
  required_providers {
    digitalocean = {
      source  = "digitalocean/digitalocean"
      version = "~> 2.0"
    }
  }
}

provider "digitalocean" {}

resource "digitalocean_droplet" "server" {
  image = "ubuntu-23-10-x64"
  name = "bm-server"
  region = "ams3"
  size = "c2-16vcpu-32gb-intel"
  monitoring = true
  ipv6 = true
  tags = ["benchmarks"]
  ssh_keys = [40738898]

  connection {
    host = self.ipv4_address
    user = "root"
    type = "ssh"
    private_key = file("~/.ssh/id_rsa")
    timeout = "2m"
  }

  provisioner "remote-exec" {
    inline = [
        "sleep 10", # wait for post-boot checks
        "apt-get update",
        "DEBIAN_FRONTEND=noninteractive apt-get upgrade -yqq",
        "DEBIAN_FRONTEND=noninteractive apt-get autoremove -yqq",
        "DEBIAN_FRONTEND=noninteractive apt-get -yqq install nginx php-fpm libapache2-mod-php apache2",
        "systemctl disable nginx apache2 php-fpm",
        "systemctl stop nginx apache2 php-fpm"
    ]
  }

  # allow apache/php-fpm to just work
  provisioner "file" {
    source = "benchmark.php"
    destination = "/var/www/html/benchmark.php"
  }

  # allow frankenphp to just work
  provisioner "file" {
    source = "benchmark.php"
    destination = "/app/public/index.php"
  }

  # install the default caddy file
  provisioner "file" {
    source = "../../caddy/frankenphp/Caddyfile"
    destination = "/etc/caddy/Caddyfile"
  }
}

resource "digitalocean_droplet" "client" {
  image = "ubuntu-23-10-x64"
  name = "bm-client"
  region = "ams3"
  size = "c2-16vcpu-32gb-intel"
  monitoring = true
  ipv6 = true
  tags = ["benchmarks"]
  ssh_keys = [40738898]

  connection {
    host = self.ipv4_address
    user = "root"
    type = "ssh"
    private_key = file("~/.ssh/id_rsa")
    timeout = "2m"
  }

  provisioner "remote-exec" {
    inline = [
        "sleep 10", # wait for post-boot checks
        "apt-get update",
        "DEBIAN_FRONTEND=noninteractive apt-get upgrade -yqq",
        "DEBIAN_FRONTEND=noninteractive apt-get autoremove -yqq",
        "gpg -k",
        "gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69",
        "echo \"deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main\" > /etc/apt/sources.list.d/k6.list",
        "apt-get update",
        "DEBIAN_FRONTEND=noninteractive apt-get install -yqq k6"
    ]
  }
}
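
For reference, the plan above applies with the standard Terraform workflow. A sketch, assuming a DigitalOcean API token in the DIGITALOCEAN_TOKEN environment variable and that the ssh_keys ID is swapped for your own:

export DIGITALOCEAN_TOKEN=dop_v1_xxx   # your DigitalOcean API token
terraform init                         # downloads the digitalocean provider (~> 2.0)
terraform apply                        # creates the bm-server and bm-client droplets
terraform destroy                      # tear everything down when finished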

Commands I ran on the bm-server to get it ready:

apt update
apt upgrade
apt install nginx php-fpm libapache2-mod-php apache2 wget btop net-tools
systemctl disable nginx apache2 php8.2-fpm
systemctl stop nginx apache2 php8.2-fpm
mkdir -p /app/public
mkdir -p /etc/caddy
wget https://github.com/dunglas/frankenphp/releases/download/v1.0.3/frankenphp-linux-x86_64
chmod +x frankenphp-linux-x86_64
mv frankenphp-linux-x86_64 /usr/local/bin/frankenphp
reboot
# wait

sysctl net.core.somaxconn=1024
ifconfig eth0 txqueuelen 5000
sysctl net.core.netdev_max_backlog=2000
sysctl net.ipv4.tcp_max_syn_backlog=2048

FRANKENPHP_CONFIG="worker /app/public/index.php 32" GOGC=3200 SERVER_NAME=":81" frankenphp run -c /etc/caddy/Caddyfile > /dev/null 2>&1 &
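
A quick sanity check that the worker answers as expected (hypothetical; 127.0.0.1:81 matches the SERVER_NAME above, and the expected body comes from the benchmark.php shown further down):

# expected response: {"data":"test","cookie":"abc"} with Content-Type: application/json
curl -si -d 'data=test' -b 'cookie=abc' http://127.0.0.1:81/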

And the bm-client:

apt update
apt upgrade
apt install k6
reboot

# wait

sysctl net.ipv4.ip_local_port_range="15000 61000"
sysctl net.ipv4.tcp_fin_timeout=30
sysctl net.ipv4.tcp_tw_recycle=1
sysctl net.ipv4.tcp_tw_reuse=1

I used the following nginx config:

user www-data;
worker_processes auto;
pid /run/nginx.pid;
error_log /var/log/nginx/error.log;
include /etc/nginx/modules-enabled/*.conf;

events {
        worker_connections 15000;
        # multi_accept on;
}

http {

        ##
        # Basic Settings
        ##

        sendfile on;
        tcp_nopush on;
        types_hash_max_size 2048;
        # server_tokens off;

        # server_names_hash_bucket_size 64;
        # server_name_in_redirect off;

        include /etc/nginx/mime.types;
        default_type application/octet-stream;

        ##
        # SSL Settings
        ##

        ssl_protocols TLSv1 TLSv1.1 TLSv1.2 TLSv1.3; # Dropping SSLv3, ref: POODLE
        ssl_prefer_server_ciphers on;

        ##
        # Logging Settings
        ##

        access_log off;

        ##
        # Gzip Settings
        ##

        gzip on;

        # gzip_vary on;
        # gzip_proxied any;
        # gzip_comp_level 6;
        # gzip_buffers 16 8k;
        # gzip_http_version 1.1;
        # gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

        ##
        # Virtual Host Configs
        ##

        include /etc/nginx/conf.d/*.conf;
        include /etc/nginx/sites-enabled/*;
}

#mail {
#       # See sample authentication script at:
#       # http://wiki.nginx.org/ImapAuthenticateWithApachePhpScript
#
#       # auth_http localhost/auth.php;
#       # pop3_capabilities "TOP" "USER";
#       # imap_capabilities "IMAP4rev1" "UIDPLUS";
#
#       server {
#               listen     localhost:110;
#               protocol   pop3;
#               proxy      on;
#       }
#
#       server {
#               listen     localhost:143;
#               protocol   imap;
#               proxy      on;
#       }
#}

and the following site config:

##
# You should look at the following URL's in order to grasp a solid understanding
# of Nginx configuration files in order to fully unleash the power of Nginx.
# https://www.nginx.com/resources/wiki/start/
# https://www.nginx.com/resources/wiki/start/topics/tutorials/config_pitfalls/
# https://wiki.debian.org/Nginx/DirectoryStructure
#
# In most cases, administrators will remove this file from sites-enabled/ and
# leave it as reference inside of sites-available where it will continue to be
# updated by the nginx packaging team.
#
# This file will automatically load configuration files provided by other
# applications, such as Drupal or Wordpress. These applications will be made
# available underneath a path with that package name, such as /drupal8.
#
# Please see /usr/share/doc/nginx-doc/examples/ for more detailed examples.
##

# Default server configuration
#
server {
        listen 80 default_server;
        listen [::]:80 default_server;

        # SSL configuration
        #
        # listen 443 ssl default_server;
        # listen [::]:443 ssl default_server;
        #
        # Note: You should disable gzip for SSL traffic.
        # See: https://bugs.debian.org/773332
        #
        # Read up on ssl_ciphers to ensure a secure configuration.
        # See: https://bugs.debian.org/765782
        #
        # Self signed certs generated by the ssl-cert package
        # Don't use them in a production server!
        #
        # include snippets/snakeoil.conf;

        root /var/www/html;

        # Add index.php to the list if you are using PHP
        index index.html index.htm index.nginx-debian.html;

        server_name _;

        location / {
                # First attempt to serve request as file, then
                # as directory, then fall back to displaying a 404.
                try_files $uri $uri/ =404;
        }

        # pass PHP scripts to FastCGI server
        #
        location ~ \.php$ {
                include snippets/fastcgi-php.conf;
        #
        #       # With php-fpm (or other unix sockets):
                fastcgi_pass unix:/run/php/php8.2-fpm.sock;
        #       # With php-cgi (or other tcp sockets):
        #       fastcgi_pass 127.0.0.1:9000;
        }

        # deny access to .htaccess files, if Apache's document root
        # concurs with nginx's one
        #
        #location ~ /\.ht {
        #       deny all;
        #}
}

# Virtual Host configuration for example.com
#
# You can move that to a different file under sites-available/ and symlink that
# to sites-enabled/ to enable it.
#
#server {
#       listen 80;
#       listen [::]:80;
#
#       server_name example.com;
#
#       root /var/www/example.com;
#       index index.html;
#
#       location / {
#               try_files $uri $uri/ =404;
#       }
#}

with the following fpm config:

; Start a new pool named 'www'.
; the variable $pool can be used in any directive and will be replaced by the
; pool name ('www' here)
[www]

; Per pool prefix
; It only applies on the following directives:
; - 'access.log'
; - 'slowlog'
; - 'listen' (unixsocket)
; - 'chroot'
; - 'chdir'
; - 'php_values'
; - 'php_admin_values'
; When not set, the global prefix (or /usr) applies instead.
; Note: This directive can also be relative to the global prefix.
; Default Value: none
;prefix = /path/to/pools/$pool

; Unix user/group of the child processes. This can be used only if the master
; process running user is root. It is set after the child process is created.
; The user and group can be specified either by their name or by their numeric
; IDs.
; Note: If the user is root, the executable needs to be started with
;       --allow-to-run-as-root option to work.
; Default Values: The user is set to master process running user by default.
;                 If the group is not set, the user's group is used.
user = www-data
group = www-data

; The address on which to accept FastCGI requests.
; Valid syntaxes are:
;   'ip.add.re.ss:port'    - to listen on a TCP socket to a specific IPv4 address on
;                            a specific port;
;   '[ip:6:addr:ess]:port' - to listen on a TCP socket to a specific IPv6 address on
;                            a specific port;
;   'port'                 - to listen on a TCP socket to all addresses
;                            (IPv6 and IPv4-mapped) on a specific port;
;   '/path/to/unix/socket' - to listen on a unix socket.
; Note: This value is mandatory.
listen = /run/php/php8.2-fpm.sock

; Set listen(2) backlog.
; Default Value: 511 (-1 on Linux, FreeBSD and OpenBSD)
;listen.backlog = 511

; Set permissions for unix socket, if one is used. In Linux, read/write
; permissions must be set in order to allow connections from a web server. Many
; BSD-derived systems allow connections regardless of permissions. The owner
; and group can be specified either by name or by their numeric IDs.
; Default Values: Owner is set to the master process running user. If the group
;                 is not set, the owner's group is used. Mode is set to 0660.
listen.owner = www-data
listen.group = www-data
;listen.mode = 0660

; When POSIX Access Control Lists are supported you can set them using
; these options, value is a comma separated list of user/group names.
; When set, listen.owner and listen.group are ignored
;listen.acl_users =
;listen.acl_groups =

; List of addresses (IPv4/IPv6) of FastCGI clients which are allowed to connect.
; Equivalent to the FCGI_WEB_SERVER_ADDRS environment variable in the original
; PHP FCGI (5.2.2+). Makes sense only with a tcp listening socket. Each address
; must be separated by a comma. If this value is left blank, connections will be
; accepted from any ip address.
; Default Value: any
;listen.allowed_clients = 127.0.0.1

; Set the associated the route table (FIB). FreeBSD only
; Default Value: -1
;listen.setfib = 1

; Specify the nice(2) priority to apply to the pool processes (only if set)
; The value can vary from -19 (highest priority) to 20 (lower priority)
; Note: - It will only work if the FPM master process is launched as root
;       - The pool processes will inherit the master process priority
;         unless it specified otherwise
; Default Value: no set
; process.priority = -19

; Set the process dumpable flag (PR_SET_DUMPABLE prctl for Linux or
; PROC_TRACE_CTL procctl for FreeBSD) even if the process user
; or group is different than the master process user. It allows to create process
; core dump and ptrace the process for the pool user.
; Default Value: no
; process.dumpable = yes

; Choose how the process manager will control the number of child processes.
; Possible Values:
;   static  - a fixed number (pm.max_children) of child processes;
;   dynamic - the number of child processes are set dynamically based on the
;             following directives. With this process management, there will be
;             always at least 1 children.
;             pm.max_children      - the maximum number of children that can
;                                    be alive at the same time.
;             pm.start_servers     - the number of children created on startup.
;             pm.min_spare_servers - the minimum number of children in 'idle'
;                                    state (waiting to process). If the number
;                                    of 'idle' processes is less than this
;                                    number then some children will be created.
;             pm.max_spare_servers - the maximum number of children in 'idle'
;                                    state (waiting to process). If the number
;                                    of 'idle' processes is greater than this
;                                    number then some children will be killed.
;             pm.max_spawn_rate    - the maximum number of rate to spawn child
;                                    processes at once.
;  ondemand - no children are created at startup. Children will be forked when
;             new requests will connect. The following parameter are used:
;             pm.max_children           - the maximum number of children that
;                                         can be alive at the same time.
;             pm.process_idle_timeout   - The number of seconds after which
;                                         an idle process will be killed.
; Note: This value is mandatory.
pm = dynamic

; The number of child processes to be created when pm is set to 'static' and the
; maximum number of child processes when pm is set to 'dynamic' or 'ondemand'.
; This value sets the limit on the number of simultaneous requests that will be
; served. Equivalent to the ApacheMaxClients directive with mpm_prefork.
; Equivalent to the PHP_FCGI_CHILDREN environment variable in the original PHP
; CGI. The below defaults are based on a server without much resources. Don't
; forget to tweak pm.* to fit your needs.
; Note: Used when pm is set to 'static', 'dynamic' or 'ondemand'
; Note: This value is mandatory.
pm.max_children = 32

; The number of child processes created on startup.
; Note: Used only when pm is set to 'dynamic'
; Default Value: (min_spare_servers + max_spare_servers) / 2
pm.start_servers = 2

; The desired minimum number of idle server processes.
; Note: Used only when pm is set to 'dynamic'
; Note: Mandatory when pm is set to 'dynamic'
pm.min_spare_servers = 1

; The desired maximum number of idle server processes.
; Note: Used only when pm is set to 'dynamic'
; Note: Mandatory when pm is set to 'dynamic'
pm.max_spare_servers = 3

; The number of rate to spawn child processes at once.
; Note: Used only when pm is set to 'dynamic'
; Note: Mandatory when pm is set to 'dynamic'
; Default Value: 32
;pm.max_spawn_rate = 32

; The number of seconds after which an idle process will be killed.
; Note: Used only when pm is set to 'ondemand'
; Default Value: 10s
;pm.process_idle_timeout = 10s;

; The number of requests each child process should execute before respawning.
; This can be useful to work around memory leaks in 3rd party libraries. For
; endless request processing specify '0'. Equivalent to PHP_FCGI_MAX_REQUESTS.
; Default Value: 0
;pm.max_requests = 500

; The URI to view the FPM status page. If this value is not set, no URI will be
; recognized as a status page. It shows the following information:
;   pool                 - the name of the pool;
;   process manager      - static, dynamic or ondemand;
;   start time           - the date and time FPM has started;
;   start since          - number of seconds since FPM has started;
;   accepted conn        - the number of request accepted by the pool;
;   listen queue         - the number of request in the queue of pending
;                          connections (see backlog in listen(2));
;   max listen queue     - the maximum number of requests in the queue
;                          of pending connections since FPM has started;
;   listen queue len     - the size of the socket queue of pending connections;
;   idle processes       - the number of idle processes;
;   active processes     - the number of active processes;
;   total processes      - the number of idle + active processes;
;   max active processes - the maximum number of active processes since FPM
;                          has started;
;   max children reached - number of times, the process limit has been reached,
;                          when pm tries to start more children (works only for
;                          pm 'dynamic' and 'ondemand');
; Value are updated in real time.
; Example output:
;   pool:                 www
;   process manager:      static
;   start time:           01/Jul/2011:17:53:49 +0200
;   start since:          62636
;   accepted conn:        190460
;   listen queue:         0
;   max listen queue:     1
;   listen queue len:     42
;   idle processes:       4
;   active processes:     11
;   total processes:      15
;   max active processes: 12
;   max children reached: 0
;
; By default the status page output is formatted as text/plain. Passing either
; 'html', 'xml' or 'json' in the query string will return the corresponding
; output syntax. Example:
;   http://www.foo.bar/status
;   http://www.foo.bar/status?json
;   http://www.foo.bar/status?html
;   http://www.foo.bar/status?xml
;
; By default the status page only outputs short status. Passing 'full' in the
; query string will also return status for each pool process.
; Example:
;   http://www.foo.bar/status?full
;   http://www.foo.bar/status?json&full
;   http://www.foo.bar/status?html&full
;   http://www.foo.bar/status?xml&full
; The Full status returns for each process:
;   pid                  - the PID of the process;
;   state                - the state of the process (Idle, Running, ...);
;   start time           - the date and time the process has started;
;   start since          - the number of seconds since the process has started;
;   requests             - the number of requests the process has served;
;   request duration     - the duration in µs of the requests;
;   request method       - the request method (GET, POST, ...);
;   request URI          - the request URI with the query string;
;   content length       - the content length of the request (only with POST);
;   user                 - the user (PHP_AUTH_USER) (or '-' if not set);
;   script               - the main script called (or '-' if not set);
;   last request cpu     - the %cpu the last request consumed
;                          it's always 0 if the process is not in Idle state
;                          because CPU calculation is done when the request
;                          processing has terminated;
;   last request memory  - the max amount of memory the last request consumed
;                          it's always 0 if the process is not in Idle state
;                          because memory calculation is done when the request
;                          processing has terminated;
; If the process is in Idle state, then informations are related to the
; last request the process has served. Otherwise informations are related to
; the current request being served.
; Example output:
;   ************************
;   pid:                  31330
;   state:                Running
;   start time:           01/Jul/2011:17:53:49 +0200
;   start since:          63087
;   requests:             12808
;   request duration:     1250261
;   request method:       GET
;   request URI:          /test_mem.php?N=10000
;   content length:       0
;   user:                 -
;   script:               /home/fat/web/docs/php/test_mem.php
;   last request cpu:     0.00
;   last request memory:  0
;
; Note: There is a real-time FPM status monitoring sample web page available
;       It's available in: /usr/share/php/8.2/fpm/status.html
;
; Note: The value must start with a leading slash (/). The value can be
;       anything, but it may not be a good idea to use the .php extension or it
;       may conflict with a real PHP file.
; Default Value: not set
;pm.status_path = /status

; The address on which to accept FastCGI status request. This creates a new
; invisible pool that can handle requests independently. This is useful
; if the main pool is busy with long running requests because it is still possible
; to get the status before finishing the long running requests.
;
; Valid syntaxes are:
;   'ip.add.re.ss:port'    - to listen on a TCP socket to a specific IPv4 address on
;                            a specific port;
;   '[ip:6:addr:ess]:port' - to listen on a TCP socket to a specific IPv6 address on
;                            a specific port;
;   'port'                 - to listen on a TCP socket to all addresses
;                            (IPv6 and IPv4-mapped) on a specific port;
;   '/path/to/unix/socket' - to listen on a unix socket.
; Default Value: value of the listen option
;pm.status_listen = 127.0.0.1:9001

; The ping URI to call the monitoring page of FPM. If this value is not set, no
; URI will be recognized as a ping page. This could be used to test from outside
; that FPM is alive and responding, or to
; - create a graph of FPM availability (rrd or such);
; - remove a server from a group if it is not responding (load balancing);
; - trigger alerts for the operating team (24/7).
; Note: The value must start with a leading slash (/). The value can be
;       anything, but it may not be a good idea to use the .php extension or it
;       may conflict with a real PHP file.
; Default Value: not set
;ping.path = /ping

; This directive may be used to customize the response of a ping request. The
; response is formatted as text/plain with a 200 response code.
; Default Value: pong
;ping.response = pong

; The access log file
; Default: not set
;access.log = log/$pool.access.log

; The access log format.
; The following syntax is allowed
;  %%: the '%' character
;  %C: %CPU used by the request
;      it can accept the following format:
;      - %{user}C for user CPU only
;      - %{system}C for system CPU only
;      - %{total}C  for user + system CPU (default)
;  %d: time taken to serve the request
;      it can accept the following format:
;      - %{seconds}d (default)
;      - %{milliseconds}d
;      - %{milli}d
;      - %{microseconds}d
;      - %{micro}d
;  %e: an environment variable (same as $_ENV or $_SERVER)
;      it must be associated with embraces to specify the name of the env
;      variable. Some examples:
;      - server specifics like: %{REQUEST_METHOD}e or %{SERVER_PROTOCOL}e
;      - HTTP headers like: %{HTTP_HOST}e or %{HTTP_USER_AGENT}e
;  %f: script filename
;  %l: content-length of the request (for POST request only)
;  %m: request method
;  %M: peak of memory allocated by PHP
;      it can accept the following format:
;      - %{bytes}M (default)
;      - %{kilobytes}M
;      - %{kilo}M
;      - %{megabytes}M
;      - %{mega}M
;  %n: pool name
;  %o: output header
;      it must be associated with embraces to specify the name of the header:
;      - %{Content-Type}o
;      - %{X-Powered-By}o
;      - %{Transfert-Encoding}o
;      - ....
;  %p: PID of the child that serviced the request
;  %P: PID of the parent of the child that serviced the request
;  %q: the query string
;  %Q: the '?' character if query string exists
;  %r: the request URI (without the query string, see %q and %Q)
;  %R: remote IP address
;  %s: status (response code)
;  %t: server time the request was received
;      it can accept a strftime(3) format:
;      %d/%b/%Y:%H:%M:%S %z (default)
;      The strftime(3) format must be encapsulated in a %{<strftime_format>}t tag
;      e.g. for a ISO8601 formatted timestring, use: %{%Y-%m-%dT%H:%M:%S%z}t
;  %T: time the log has been written (the request has finished)
;      it can accept a strftime(3) format:
;      %d/%b/%Y:%H:%M:%S %z (default)
;      The strftime(3) format must be encapsulated in a %{<strftime_format>}t tag
;      e.g. for a ISO8601 formatted timestring, use: %{%Y-%m-%dT%H:%M:%S%z}t
;  %u: remote user
;
; Default: "%R - %u %t \"%m %r\" %s"
;access.format = "%R - %u %t \"%m %r%Q%q\" %s %f %{milli}d %{kilo}M %C%%"

; A list of request_uri values which should be filtered from the access log.
;
; As a security precuation, this setting will be ignored if:
;     - the request method is not GET or HEAD; or
;     - there is a request body; or
;     - there are query parameters; or
;     - the response code is outwith the successful range of 200 to 299
;
; Note: The paths are matched against the output of the access.format tag "%r".
;       On common configurations, this may look more like SCRIPT_NAME than the
;       expected pre-rewrite URI.
;
; Default Value: not set
;access.suppress_path[] = /ping
;access.suppress_path[] = /health_check.php

; The log file for slow requests
; Default Value: not set
; Note: slowlog is mandatory if request_slowlog_timeout is set
;slowlog = log/$pool.log.slow

; The timeout for serving a single request after which a PHP backtrace will be
; dumped to the 'slowlog' file. A value of '0s' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
;request_slowlog_timeout = 0

; Depth of slow log stack trace.
; Default Value: 20
;request_slowlog_trace_depth = 20

; The timeout for serving a single request after which the worker process will
; be killed. This option should be used when the 'max_execution_time' ini option
; does not stop script execution for some reason. A value of '0' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
;request_terminate_timeout = 0

; The timeout set by 'request_terminate_timeout' ini option is not engaged after
; application calls 'fastcgi_finish_request' or when application has finished and
; shutdown functions are being called (registered via register_shutdown_function).
; This option will enable timeout limit to be applied unconditionally
; even in such cases.
; Default Value: no
;request_terminate_timeout_track_finished = no

; Set open file descriptor rlimit.
; Default Value: system defined value
;rlimit_files = 1024

; Set max core size rlimit.
; Possible Values: 'unlimited' or an integer greater or equal to 0
; Default Value: system defined value
;rlimit_core = 0

; Chroot to this directory at the start. This value must be defined as an
; absolute path. When this value is not set, chroot is not used.
; Note: you can prefix with '$prefix' to chroot to the pool prefix or one
; of its subdirectories. If the pool prefix is not set, the global prefix
; will be used instead.
; Note: chrooting is a great security feature and should be used whenever
;       possible. However, all PHP paths will be relative to the chroot
;       (error_log, sessions.save_path, ...).
; Default Value: not set
;chroot =

; Chdir to this directory at the start.
; Note: relative path can be used.
; Default Value: current directory or / when chroot
;chdir = /var/www

; Redirect worker stdout and stderr into main error log. If not set, stdout and
; stderr will be redirected to /dev/null according to FastCGI specs.
; Note: on highloaded environment, this can cause some delay in the page
; process time (several ms).
; Default Value: no
;catch_workers_output = yes

; Decorate worker output with prefix and suffix containing information about
; the child that writes to the log and if stdout or stderr is used as well as
; log level and time. This options is used only if catch_workers_output is yes.
; Settings to "no" will output data as written to the stdout or stderr.
; Default value: yes
;decorate_workers_output = no

; Clear environment in FPM workers
; Prevents arbitrary environment variables from reaching FPM worker processes
; by clearing the environment in workers before env vars specified in this
; pool configuration are added.
; Setting to "no" will make all environment variables available to PHP code
; via getenv(), $_ENV and $_SERVER.
; Default Value: yes
;clear_env = no

; Limits the extensions of the main script FPM will allow to parse. This can
; prevent configuration mistakes on the web server side. You should only limit
; FPM to .php extensions to prevent malicious users to use other extensions to
; execute php code.
; Note: set an empty value to allow all extensions.
; Default Value: .php
;security.limit_extensions = .php .php3 .php4 .php5 .php7

; Pass environment variables like LD_LIBRARY_PATH. All $VARIABLEs are taken from
; the current environment.
; Default Value: clean env
;env[HOSTNAME] = $HOSTNAME
;env[PATH] = /usr/local/bin:/usr/bin:/bin
;env[TMP] = /tmp
;env[TMPDIR] = /tmp
;env[TEMP] = /tmp

; Additional php.ini defines, specific to this pool of workers. These settings
; overwrite the values previously defined in the php.ini. The directives are the
; same as the PHP SAPI:
;   php_value/php_flag             - you can set classic ini defines which can
;                                    be overwritten from PHP call 'ini_set'.
;   php_admin_value/php_admin_flag - these directives won't be overwritten by
;                                     PHP call 'ini_set'
; For php_*flag, valid values are on, off, 1, 0, true, false, yes or no.

; Defining 'extension' will load the corresponding shared extension from
; extension_dir. Defining 'disable_functions' or 'disable_classes' will not
; overwrite previously defined php.ini values, but will append the new value
; instead.

; Note: path INI options can be relative and will be expanded with the prefix
; (pool, global or /usr)

; Default Value: nothing is defined by default except the values in php.ini and
;                specified at startup with the -d argument
;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f www@my.domain.com
;php_flag[display_errors] = off
;php_admin_value[error_log] = /var/log/fpm-php.www.log
;php_admin_flag[log_errors] = on
;php_admin_value[memory_limit] = 32M

I didn't adjust the php.ini, since we aren't exactly benchmarking PHP here, but the SAPI.
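
That said, for goal 1 the opcache/JIT settings are the ones I'd expect to matter most. Here is a sketch of a drop-in that could be swept later; the path assumes Ubuntu's php8.2-fpm layout, and the values are guesses that were NOT applied to the runs reported here:

# hypothetical opcache/JIT overrides to sweep (not used for the results above)
cat > /etc/php/8.2/fpm/conf.d/99-benchmark.ini <<'EOF'
opcache.enable=1
opcache.validate_timestamps=0
opcache.memory_consumption=256
opcache.max_accelerated_files=20000
opcache.jit=tracing
opcache.jit_buffer_size=128M
realpath_cache_size=4096K
realpath_cache_ttl=600
EOF
systemctl restart php8.2-fpm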

Then, with the following PHP file:

<?php

ignore_user_abort(true);

function handle_request(): void
{
        header('Content-Type: application/json');
        $data = $_POST['data'] ?? null;
        $cookie = $_COOKIE['cookie'] ?? null;
        $body = json_encode(['data' => $data, 'cookie' => $cookie]);
        header('Content-Length: ' . strlen($body));
        echo $body;
}

if ($_SERVER['FRANKENPHP_WORKER'] ?? false) {
        while (frankenphp_handle_request(handle_request(...))) {}
        // stop here in worker mode instead of falling through to an extra handle_request()
        return;
}

handle_request();

Finally, I used the default Caddyfile from this repo and the following two load-test files:

// load-test-nginx.js
import http from 'k6/http'
import { check } from 'k6'

export const options = {
  // A number specifying the number of VUs to run concurrently.
  vus: 100,
  // A string specifying the total duration of the test run.
  duration: '30s'

  // The following section contains configuration options for execution of this
  // test script in Grafana Cloud.
  //
  // See https://grafana.com/docs/grafana-cloud/k6/get-started/run-cloud-tests-from-the-cli/
  // to learn about authoring and running k6 test scripts in Grafana k6 Cloud.
  //
  // ext: {
  //   loadimpact: {
  //     // The ID of the project to which the test is assigned in the k6 Cloud UI.
  //     // By default tests are executed in default project.
  //     projectID: "",
  //     // The name of the test in the k6 Cloud UI.
  //     // Test runs with the same name will be grouped.
  //     name: "script.js"
  //   }
  // },

  // Uncomment this section to enable the use of Browser API in your tests.
  //
  // See https://grafana.com/docs/k6/latest/using-k6-browser/running-browser-tests/ to learn more
  // about using Browser API in your test scripts.
  //
  // scenarios: {
  //   // The scenario name appears in the result summary, tags, and so on.
  //   // You can give the scenario any name, as long as each name in the script is unique.
  //   ui: {
  //     // Executor is a mandatory parameter for browser-based tests.
  //     // Shared iterations in this case tells k6 to reuse VUs to execute iterations.
  //     //
  //     // See https://grafana.com/docs/k6/latest/using-k6/scenarios/executors/ for other executor types.
  //     executor: 'shared-iterations',
  //     options: {
  //       browser: {
  //         // This is a mandatory parameter that instructs k6 to launch and
  //         // connect to a chromium-based browser, and use it to run UI-based
  //         // tests.
  //         type: 'chromium',
  //       },
  //     },
  //   },
  // }
}

const payload = 'data=test'

// The function that defines VU logic.
//
// See https://grafana.com/docs/k6/latest/examples/get-started-with-k6/ to learn more
// about authoring k6 scripts.
//
export default function () {
  const res = http.post('http://146.190.235.7/benchmark.php', payload)
  check(res, {
    'is status 200': (r) => r.status === 200,
  })
}
// load-test-franken.js
import http from 'k6/http'
import { check } from 'k6'

export const options = {
  // A number specifying the number of VUs to run concurrently.
  vus: 100,
  // A string specifying the total duration of the test run.
  duration: '30s'

  // The following section contains configuration options for execution of this
  // test script in Grafana Cloud.
  //
  // See https://grafana.com/docs/grafana-cloud/k6/get-started/run-cloud-tests-from-the-cli/
  // to learn about authoring and running k6 test scripts in Grafana k6 Cloud.
  //
  // ext: {
  //   loadimpact: {
  //     // The ID of the project to which the test is assigned in the k6 Cloud UI.
  //     // By default tests are executed in default project.
  //     projectID: "",
  //     // The name of the test in the k6 Cloud UI.
  //     // Test runs with the same name will be grouped.
  //     name: "script.js"
  //   }
  // },

  // Uncomment this section to enable the use of Browser API in your tests.
  //
  // See https://grafana.com/docs/k6/latest/using-k6-browser/running-browser-tests/ to learn more
  // about using Browser API in your test scripts.
  //
  // scenarios: {
  //   // The scenario name appears in the result summary, tags, and so on.
  //   // You can give the scenario any name, as long as each name in the script is unique.
  //   ui: {
  //     // Executor is a mandatory parameter for browser-based tests.
  //     // Shared iterations in this case tells k6 to reuse VUs to execute iterations.
  //     //
  //     // See https://grafana.com/docs/k6/latest/using-k6/scenarios/executors/ for other executor types.
  //     executor: 'shared-iterations',
  //     options: {
  //       browser: {
  //         // This is a mandatory parameter that instructs k6 to launch and
  //         // connect to a chromium-based browser, and use it to run UI-based
  //         // tests.
  //         type: 'chromium',
  //       },
  //     },
  //   },
  // }
}

const payload = 'data=test'

// The function that defines VU logic.
//
// See https://grafana.com/docs/k6/latest/examples/get-started-with-k6/ to learn more
// about authoring k6 scripts.
//
export default function () {
  const res = http.post('http://146.190.235.7:81', payload)
  check(res, {
    'is status 200': (r) => r.status === 200,
  })
}

I'll leave another comment with the results and some preliminary conclusions.

withinboredom commented 10 months ago

Raw results:

k6 run load-test-franken.js -u 1000 --no-connection-reuse

          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: load-test-franken.js
     output: -

  scenarios: (100.00%) 1 scenario, 1000 max VUs, 1m0s max duration (incl. graceful stop):
           * default: 1000 looping VUs for 30s (gracefulStop: 30s)

     ✓ is status 200

     checks.........................: 100.00% ✓ 995134       ✗ 0
     data_received..................: 193 MB  6.4 MB/s
     data_sent......................: 129 MB  4.3 MB/s
     http_req_blocked...............: avg=856.65µs min=137.4µs  med=242.13µs max=65.11ms  p(90)=1.79ms  p(95)=4.24ms
     http_req_connecting............: avg=694.87µs min=107.78µs med=208.11µs max=62.91ms  p(90)=1.48ms  p(95)=3.52ms
     http_req_duration..............: avg=29.03ms  min=502.98µs med=28.68ms  max=104.21ms p(90)=39.49ms p(95)=43.57ms
       { expected_response:true }...: avg=29.03ms  min=502.98µs med=28.68ms  max=104.21ms p(90)=39.49ms p(95)=43.57ms
     http_req_failed................: 0.00%   ✓ 0            ✗ 995134
     http_req_receiving.............: avg=1.26ms   min=12.92µs  med=52.69µs  max=51.64ms  p(90)=5.03ms  p(95)=7.86ms
     http_req_sending...............: avg=742.23µs min=7.87µs   med=29.36µs  max=51.64ms  p(90)=2.23ms  p(95)=4.73ms
     http_req_tls_handshaking.......: avg=0s       min=0s       med=0s       max=0s       p(90)=0s      p(95)=0s
     http_req_waiting...............: avg=27.02ms  min=386.63µs med=27.64ms  max=90.61ms  p(90)=36.63ms p(95)=39.92ms
     http_reqs......................: 995134  33150.822124/s
     iteration_duration.............: avg=30.1ms   min=757.69µs med=29.29ms  max=115.13ms p(90)=40.43ms p(95)=44.84ms
     iterations.....................: 995134  33150.822124/s
     vus............................: 1000    min=1000       max=1000
     vus_max........................: 1000    min=1000       max=1000

running (0m30.0s), 0000/1000 VUs, 995134 complete and 0 interrupted iterations
default ✓ [======================================] 1000 VUs  30s

And nginx:

k6 run load-test.js -u 1000 --no-connection-reuse

          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: load-test.js
     output: -

  scenarios: (100.00%) 1 scenario, 1000 max VUs, 1m0s max duration (incl. graceful stop):
           * default: 1000 looping VUs for 30s (gracefulStop: 30s)

     ✓ is status 200

     checks.........................: 100.00% ✓ 1155748     ✗ 0
     data_received..................: 214 MB  7.1 MB/s
     data_sent......................: 162 MB  5.4 MB/s
     http_req_blocked...............: avg=1.24ms   min=131.19µs med=242.8µs  max=67.3ms   p(90)=3.41ms  p(95)=6.57ms
     http_req_connecting............: avg=916.21µs min=113.08µs med=211.15µs max=67.25ms  p(90)=2.58ms  p(95)=5ms
     http_req_duration..............: avg=24.2ms   min=306.62µs med=22.94ms  max=80.3ms   p(90)=35.46ms p(95)=39.25ms
       { expected_response:true }...: avg=24.2ms   min=306.62µs med=22.94ms  max=80.3ms   p(90)=35.46ms p(95)=39.25ms
     http_req_failed................: 0.00%   ✓ 0           ✗ 1155748
     http_req_receiving.............: avg=1.65ms   min=12.26µs  med=48.06µs  max=58.29ms  p(90)=6.78ms  p(95)=9.81ms
     http_req_sending...............: avg=989.05µs min=8.01µs   med=27.39µs  max=58.41ms  p(90)=3.24ms  p(95)=5.71ms
     http_req_tls_handshaking.......: avg=0s       min=0s       med=0s       max=0s       p(90)=0s      p(95)=0s
     http_req_waiting...............: avg=21.56ms  min=264.85µs med=22.44ms  max=64.54ms  p(90)=30.24ms p(95)=33.25ms
     http_reqs......................: 1155748 38504.81304/s
     iteration_duration.............: avg=25.88ms  min=628.86µs med=23.68ms  max=117.11ms p(90)=37.59ms p(95)=42.4ms
     iterations.....................: 1155748 38504.81304/s
     vus............................: 1000    min=1000      max=1000
     vus_max........................: 1000    min=1000      max=1000

running (0m30.0s), 0000/1000 VUs, 1155748 complete and 0 interrupted iterations
default ✓ [======================================] 1000 VUs  30s

Note: they are pretty much in line with each other; the biggest difference is that caddy/FrankenPHP appears able to take nearly unlimited traffic, while nginx sheds traffic.

k6 run load-test-franken.js -u 10000 --no-connection-reuse

          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: load-test-franken.js
     output: -

  scenarios: (100.00%) 1 scenario, 10000 max VUs, 1m0s max duration (incl. graceful stop):
           * default: 10000 looping VUs for 30s (gracefulStop: 30s)

     ✓ is status 200

     checks.........................: 100.00% ✓ 958586       ✗ 0
     data_received..................: 186 MB  6.0 MB/s
     data_sent......................: 125 MB  4.0 MB/s
     http_req_blocked...............: avg=120.6ms  min=133.53µs med=279.75µs max=4.17s    p(90)=175.66ms p(95)=1.02s
     http_req_connecting............: avg=120.51ms min=116.8µs  med=252.4µs  max=4.16s    p(90)=175.45ms p(95)=1.02s
     http_req_duration..............: avg=188.91ms min=564.86µs med=182.37ms max=1.83s    p(90)=267.46ms p(95)=297ms
       { expected_response:true }...: avg=188.91ms min=564.86µs med=182.37ms max=1.83s    p(90)=267.46ms p(95)=297ms
     http_req_failed................: 0.00%   ✓ 0            ✗ 958586
     http_req_receiving.............: avg=3.28ms   min=14.92µs  med=77.65µs  max=151.64ms p(90)=9.31ms   p(95)=18.75ms
     http_req_sending...............: avg=2.21ms   min=7.68µs   med=56.94µs  max=138.24ms p(90)=6.68ms   p(95)=8.87ms
     http_req_tls_handshaking.......: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_waiting...............: avg=183.41ms min=419.73µs med=179.85ms max=1.83s    p(90)=256.24ms p(95)=286.53ms
     http_reqs......................: 958586  31124.388239/s
     iteration_duration.............: avg=312.35ms min=45.1ms   med=198.63ms max=4.37s    p(90)=543.2ms  p(95)=1.21s
     iterations.....................: 958586  31124.388239/s
     vus............................: 283     min=283        max=10000
     vus_max........................: 10000   min=10000      max=10000

running (0m30.8s), 00000/10000 VUs, 958586 complete and 0 interrupted iterations
default ✓ [======================================] 10000 VUs  30s

and nginx:

k6 run load-test.js -u 10000 --no-connection-reuse

          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: load-test.js
     output: -

  scenarios: (100.00%) 1 scenario, 10000 max VUs, 1m0s max duration (incl. graceful stop):
           * default: 10000 looping VUs for 30s (gracefulStop: 30s)

     ✗ is status 200
      ↳  40% — ✓ 508842 / ✗ 739342

     checks.........................: 40.76%  ✓ 508842       ✗ 739342
     data_received..................: 337 MB  11 MB/s
     data_sent......................: 175 MB  5.8 MB/s
     http_req_blocked...............: avg=149.13ms min=134.39µs med=51.79ms  max=3.22s    p(90)=209.33ms p(95)=1.05s
     http_req_connecting............: avg=148.63ms min=117.69µs med=51.6ms   max=3.22s    p(90)=203.27ms p(95)=1.04s
     http_req_duration..............: avg=80.63ms  min=248.29µs med=67.74ms  max=629.84ms p(90)=180.38ms p(95)=220.82ms
       { expected_response:true }...: avg=92.54ms  min=428.29µs med=50.35ms  max=629.84ms p(90)=226.63ms p(95)=281.72ms
     http_req_failed................: 59.23%  ✓ 739342       ✗ 508842
     http_req_receiving.............: avg=5.63ms   min=14.67µs  med=5.74ms   max=282.07ms p(90)=9.85ms   p(95)=12.7ms
     http_req_sending...............: avg=5.45ms   min=8.16µs   med=5.16ms   max=233.81ms p(90)=8.96ms   p(95)=11.36ms
     http_req_tls_handshaking.......: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_waiting...............: avg=69.55ms  min=167.26µs med=55.82ms  max=625.1ms  p(90)=162.86ms p(95)=207.5ms
     http_reqs......................: 1248184 41478.797471/s
     iteration_duration.............: avg=237.72ms min=548.72µs med=142.65ms max=3.5s     p(90)=519.76ms p(95)=1.13s
     iterations.....................: 1248184 41478.797471/s
     vus............................: 10000   min=10000      max=10000
     vus_max........................: 10000   min=10000      max=10000

running (0m30.1s), 00000/10000 VUs, 1248184 complete and 0 interrupted iterations
default ✓ [======================================] 10000 VUs  30s

Still digging into this (no conclusions can be made yet), but thought I'd report on progress so far.

dunglas commented 10 months ago

Great! This will be super helpful!

Maybe it would be better to compare dynamic builds with dynamic builds. The static build is known to be slower than the dynamic build of PHP (no JIT, more extensions, etc.). Maybe we should use Docker to ease the process?

Also, the default Caddyfile in this repo enables more features than what is enabled in NGINX. For instance, by default logs are on, but off in NGINX (this can make a huge difference). Also, HTTP/2 and HTTP/3 are on for Caddy, but not for NGINX.

Finally, for the worker mode, it could be more interesting to compare a more real-life app, like a Symfony or a Laravel app (for a simple "hello world" script like this, the worker mode is mostly useless).

nickchomey commented 10 months ago

I also wonder if nginx could/should be tweaked - such a high failure rate seems suspicious. Perhaps there's a max duration parameter that could be modified so as to better match what's happening with caddy?

Though, this comprehensive benchmark between Caddy and nginx seems to have concluded something similar - nginx favors latency over completion/no errors.

https://blog.tjll.net/reverse-proxy-hot-dog-eating-contest-caddy-vs-nginx/

You might even consider just starting from the terraform, configs and tests that they used and provide, and adding Frankenphp to it, along with a dynamic web app (which is surely the real goal here) as dunglas suggested above.

It could probably be fairly assumed that Caddy's performance has improved more than nginx's in the time since that test (probably using v2.5.2, so missing progress from v2.6 and 2.7). Though, my guess is that the difference between the servers will be negligible when the webapp (and database etc...) is the bottleneck - though hopefully FrankenPHP shows a meaningful improvement over caddy/nginx+fpm given its direct-connection architecture.

withinboredom commented 10 months ago

by default logs are on, but off in NGINX (this can make a huge difference)

I turned logs off in nginx, but just redirected logs from caddy to /dev/null ... not the same thing by any means, but I'll add that to ensure they are off for both.

maybe could it be more interesting to compare a more real-life app

For sure, though we are getting into 'testing framework' territory and not 'testing sapi' territory. E.g., how fast can the sapi add some headers and output a string (though I would like to exercise the sapi more).

the worker mode is mostly useless

Not really; there is still overhead in switching between Go/C/Go, and I'm mostly curious what it would look like with worker mode disabled, but we can't test that yet.

Perhaps there's a max duration parameter that could be modified so as to better match what's happening with caddy?

nginx isn't timing out, it's just refusing connections. I've fiddled with max_processes and friends, but I haven't worked out how to get it to handle 10k concurrent connections yet. It's probably something dumb.
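
For reference, here's a sketch of the knobs I'd look at to get nginx past ~10k concurrent connections; the directive names are standard nginx/systemd, but the values are guesses and untested here:

# raise the per-process fd limit for the nginx service
mkdir -p /etc/systemd/system/nginx.service.d
cat > /etc/systemd/system/nginx.service.d/limits.conf <<'EOF'
[Service]
LimitNOFILE=65535
EOF
systemctl daemon-reload

# in /etc/nginx/nginx.conf (values are guesses):
#   worker_rlimit_nofile 65535;
#   events { worker_connections 20000; multi_accept on; }
# on the vhost listen line (must not exceed net.core.somaxconn):
#   listen 80 default_server backlog=4096;
# and consider raising listen.backlog in the FPM pool as well
systemctl restart nginx php8.2-fpm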

my guess is that the difference between the servers will be negligible when the webapp (and database etc...) is the bottleneck

This is exactly why I don't want to test frameworks and is a non-goal. I want to test the sapi, not php.

withinboredom commented 10 months ago

To add: there's also an issue of bandwidth. Right now, for these tests, we are doing well over 1 Gbps at times (especially Caddy); sending much bigger bodies would very quickly eat up quite a bit of bandwidth.

In fact, I was surprised by the raw reqs per second here because caddy tended to use more bandwidth. I will investigate this at the packet level, as I also suspect a bug in Caddy (or a library) based on some other benchmarks I took along the way.

That's a good blog post btw @nickchomey, I'll see what I can steal.

withinboredom commented 10 months ago

Yep. I suspect there is a bug... somewhere deep in somewhere.

I'm seeing the server send its FIN packet sometimes hundreds of milliseconds after receiving one from the client -- particularly when under load. There doesn't appear to be any other delay anywhere else in Caddy. I don't see this in nginx. So in Caddy, the connection is "open" for much longer than it should be.
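
For anyone who wants to reproduce the packet-level observation, something along these lines is what I'd capture on the server (a sketch; it assumes the FrankenPHP instance is on port 81 as above, then inspect the FIN timestamps in Wireshark/tshark):

# record connection setup/teardown packets for the benchmark port
tcpdump -i eth0 -nn -w fin-timing.pcap 'tcp port 81 and (tcp[tcpflags] & (tcp-syn|tcp-fin|tcp-rst) != 0)'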

withinboredom commented 8 months ago

It appears that embedded PHP is about twice as slow as non-embedded, which doesn't make sense. They should be about equal. I'm investigating this.

binaryfire commented 8 months ago

@withinboredom By embedded, are you referring to static binary builds?

dunglas commented 8 months ago

@withinboredom this doesn't surprise me much. Musl makes PHP slower and prevents JIT from working.

nickchomey commented 8 months ago

It was my impression that JIT is minimally helpful on most real world web apps. What do the benchmarks consist of?

Is opcache working in the embedded build? Because that tends to make a huge (2x-like) difference.
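
One way to check (a sketch; it assumes the static binary's php-cli subcommand accepts the usual CLI flags, and a CLI result only reflects opcache.enable_cli, so the definitive check is dumping opcache_get_status() from a script served by the SAPI under test):

# rough check from the embedded binary; false/empty values here mean OPcache isn't even loaded
frankenphp php-cli -r 'var_dump(extension_loaded("Zend OPcache"), ini_get("opcache.enable"), ini_get("opcache.jit"));'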