swoole / swoole-src

🚀 Coroutine-based concurrency library for PHP
https://www.swoole.com
Apache License 2.0
18.44k stars 3.16k forks source link

v6多线程版本 WebSocket+http 性能大幅度下降 #5476

Closed LIngMax closed 1 month ago

LIngMax commented 1 month ago
$server = new Swoole\WebSocket\Server("0.0.0.0",4455,SWOOLE_THREAD );#SWOOLE_PROCESS SWOOLE_THREAD
$server->set([
    'reactor_num'           => swoole_cpu_num()*4,#reactor线程数 1-4倍 默认值:CPU 核数
    'worker_num'            => swoole_cpu_num()*4+1,#CPU核数的1-4倍 默认值:CPU 核数 Swoole 6多线程版本
    'hook_flags'            => SWOOLE_HOOK_ALL,#开启协程钩子
    'enable_coroutine'      => true,#启用协程
    'task_enable_coroutine' => true,#Task启用协程
    'reload_async'          => true,#触发重启 当前无任何协程时进程才会退出
    'open_http_protocol'    => true,#HTTP协议解析
    'http_parse_cookie'     => false,#自动解析COOKIE请求
    'http_parse_post'       => false,#自动解析POST请求
    'http_parse_files'      => false,#自动解析上传文件
    // 'single_thread'         => true,#单线程模式 SWOOLE_PROCESS强制 ZTS必须开启  cli是NTS所以不用开启

]);
$server->on('message',function ($server, $frame){});
$server->on('request', function ($request, $response){@$response->end("啊哈哈哈哈");});
// $server->on('close',function ($server, $_fd){}); #问题代码
$server->start();

压测http

多线程监听 close 性能下滑严重 9k qps

多线程不监听 close 17k qps

wsl 测试没复现 都是3k qps

华为云 高主频 linux 主机上复现了

$server = new Swoole\WebSocket\Server("0.0.0.0",4455,SWOOLE_PROCESS);#多进程 模式下   32k qps  

初步猜测是 > 15k qps 的 容易复现 (>2.7到 3GHz > 8核 多核高主频主机)

ab -n 40000 -c 200 "http://127.0.0.1:4455/"

备注 阿里云 c7 16核(vCPU) 32 GiB ecs.c7.4xlarge 抢占型 已复现 监听close导致下滑 1w qps

至于华为云Huawei Cloud EulerOS release 2.0 (West Lake) swoole6多线程跟多进程的差距挺大的 多线程 本身下滑一部分 监听close 再下滑一部分 性能

matyhtf commented 1 month ago

请按照 issue 模版提交。缺失版本信息

matyhtf commented 1 month ago

PHP 8.3.1 , Ubuntu 22 下测试,不存在此问题

LIngMax commented 1 month ago

2.你希望看到什么? // $server->on('close',function ($server, $_fd){}); #问题代码 不大幅度影响 ab压测

3.你看到了什么? 执行 ab -n 40000 -c 500 "http://127.0.0.1:4455/"

监听close事件 Requests per second: 17941.01 [#/sec] (mean) 不监听close事件 Requests per second: 26931.78 [#/sec] (mean)

4.您使用的是哪个版本的Swoole(显示您的php-ri-Swoole)?

https://github.com/jingjingxyk/swoole-cli/releases/tag/swoole-cli-v0.0.9

branch | experiment-zts tag | swoole-cli-v0.0.9 swoole version | v6.0.0-dev php version | 8.1.27 release date | 2024-08-20

5.您使用的机器环境是什么(显示您的uname-aphp-vgcc-v)?

阿里云 广州 抢占式 16核(vCPU) 32 GiB 规格 ecs.hfc7.4xlarge 50mb/s宽带 这些系统都复现过
Ubuntu 24.04 LTS Ubuntu 20 LTS

华为云 Huawei Cloud EulerOS release 2.0 (West Lake) Linux api 5.10.0-182.0.0.95.r1941_123.hce2.x86_64 #1 SMP Fri Jun 28 09:41:47 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

总结 qps > 2w的机器都复现了 PHP版本 8.1.27

https://github.com/jingjingxyk/swoole-cli/releases/tag/swoole-cli-v0.0.9

NathanFreeman commented 1 month ago

试试用wrk做压测工具,ab是单线程的

LIngMax commented 1 month ago

试试用wrk做压测工具,ab是单线程的

差距更大了 阿里云 64核 128g 测试结果

多进程55w qps 多线程4w qps 取消close没有效果

多进程 Requests/sec: 557031.73 root@iZj6c1cbgskh81iw2pyj40Z:~/wrk# ./wrk -t12 -c400 -d5s "http://127.0.0.1:4455/" Running 5s test @ http://127.0.0.1:4455/ 12 threads and 400 connections Thread Stats Avg Stdev Max +/- Stdev Latency 717.91us 509.37us 24.97ms 81.95% Req/Sec 46.81k 4.18k 70.07k 75.57% 2841269 requests in 5.10s, 455.22MB read Requests/sec: 557031.73 Transfer/sec: 89.25MB

多线程 Requests/sec: 40573.53 root@iZj6c1cbgskh81iw2pyj40Z:~/wrk# ./wrk -t12 -c400 -d5s "http://127.0.0.1:4455/" Running 5s test @ http://127.0.0.1:4455/ 12 threads and 400 connections Thread Stats Avg Stdev Max +/- Stdev Latency 13.48ms 28.68ms 427.39ms 97.33% Req/Sec 3.48k 0.86k 12.52k 94.46% 206915 requests in 5.10s, 33.15MB read Requests/sec: 40573.53 Transfer/sec: 6.50MB

LIngMax commented 1 month ago

我自己写的多线程http空服务都有40w

多线程http代码


  $ags = Swoole\Thread::getArguments();
  if(!$ags){
          $map = new Swoole\Thread\Map;
          $arr = []; 
          for ($i=1; $i <=swoole_cpu_num()*4+1; $i++) { 
              $arr[$i] = new Swoole\Thread(__DIR__."/0.php",$map,$i);
          }
          sleep(3000);

  }

  $map = $ags[0];

  // 创建一个 TCP 套接字
  $socket = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
  // 设置 SO_LINGER 选项

  setOp($socket);
  // 绑定和监听
  socket_bind($socket, '0.0.0.0', 4455);
  socket_listen($socket,102400);
  $clients = [ spl_object_hash($socket) => $socket]; // 存储客户端套接字

  while (true) {
          $readSockets = $clients;
          $writeSockets = null;
          $exceptSockets = $clients;
          if (socket_select($readSockets, $writeSockets, $exceptSockets,null) === false) {
              echo "Socket select error: " . socket_strerror(socket_last_error()) . "\n";
              break;
          }
          // 检查是否有新的连接
          if (in_array($socket,$readSockets, true)) {
              $clientSocket = socket_accept($socket);
              if ($clientSocket !== false) {
                  $clients[spl_object_hash($clientSocket)] = $clientSocket;
                  setOp($clientSocket);
              }
          }
          // if(mt_rand(1,5000)===1)echo "clients: ".count($clients)  ."    readSockets: " .count($readSockets)."\n";
          // 处理现有连接的数据
          foreach ($readSockets as $key => $client) {
              if ($socket === $client) continue;
              // socket_set_block($client);
              // 接收数据
              $data = socket_read($client, 1024);
              if ($data === false) {
                  echo "读取数据失败\n";
                  continue;
              }
              $response = "HTTP/1.1 200 OK\r\n";
              $response .= "Content-Type: text/plain\r\n";
              $response .= "Content-Length: 1\r\n";
              $response .= "Connection: close\r\n";
              $response .= "\r\n";
              $response .= "1";
              // 发送响应
              $ret = @socket_write($client, $response, strlen($response));
              if ($ret === false) {
                  echo "发送失败\n";
              }
              // 关闭客户端连接
              @socket_close($client);
              unset($clients[spl_object_hash($client)]);
              // $map['queue_ret']->push(sendEncode('httpRet',$cid,$response));
              // if($data)$map['queue']->push(sendEncode('httpReq',spl_object_hash($client),$data));
      }

      foreach ($exceptSockets as $key => $client) {
            if ($socket === $client) continue;
            @socket_close($client);
            unset($clients[spl_object_hash($client)]);
      }

      // echo "0线程正在运行\n";
      // do {
      //     $recv = $map['queue_ret']->pop();
      //     if($recv === null ||!$recv)break;
      //     recvDecode($recv);
      // } while (1);
   }

function setOp($clientSocket){
    $lingerOption = [
        'l_onoff' => 1,      // 启用 SO_LINGER
        'l_linger' => 2      // 最大等待 5 秒
    ];
    socket_set_nonblock($clientSocket); // 设置新连接为非阻塞

    socket_set_option($clientSocket, SOL_SOCKET, SO_REUSEADDR, 1);#多进程多线程共用1个端口
    socket_set_option($clientSocket, SOL_SOCKET, SO_REUSEPORT, 1);#多进程多线程共用1个端口

    if(defined('SO_RCVBUF'))socket_set_option($clientSocket, SOL_SOCKET, SO_RCVBUF, 4*1024);## 接收缓冲区大小
    if(defined('SO_SNDBUF'))socket_set_option($clientSocket, SOL_SOCKET, SO_SNDBUF, 4*1024);## 发送缓冲区大小

    if(defined('TCP_NODELAY'))socket_set_option($clientSocket, SOL_TCP, TCP_NODELAY, 0);## 关闭Nagle算法
    if(defined('SO_KEEPALIVE'))socket_set_option($clientSocket, SOL_SOCKET, SO_KEEPALIVE, 1);## 开启keepalive
    if(defined('TCP_KEEPCNT'))socket_set_option($clientSocket, SOL_TCP, TCP_KEEPCNT, 10);## 保持连接10次
    if(defined('TCP_KEEPIDLE'))socket_set_option($clientSocket, SOL_TCP, TCP_KEEPIDLE, 10);## 保持连接10秒
    if(defined('TCP_KEEPINTVL'))socket_set_option($clientSocket, SOL_TCP, TCP_KEEPINTVL, 10);## 保持连接间隔10秒
    if(defined('SO_LINGER'))socket_set_option($clientSocket, SOL_SOCKET, SO_LINGER, $lingerOption);## 关闭连接 发送完数据

}
matyhtf commented 1 month ago
  1. 在正式发版前勿使用 swoole-cli ,直接使用 github master 分支代码
  2. 确保提交的测试代码是完整的,与你本地测试用的代码完全一致。不要精简,包括 echo 等任何操作,或许你删减的几行代码是关键的细节。
  3. 可增加 -k (KeepAlive) 测试,作为对比
  4. 代码段请使用 markdown code 格式提交,语言为 php ,参考上下文中我进行的编辑操作
LIngMax commented 1 month ago
  1. 在正式发版前勿使用 swoole-cli ,直接使用 github master 分支代码
  2. 确保提交的测试代码是完整的,与你本地测试用的代码完全一致。不要精简,包括 echo 等任何操作,或许你删减的几行代码是关键的细节。
  3. 可增加 -k (KeepAlive) 测试,作为对比
  4. 代码段请使用 markdown code 格式提交,语言为 php ,参考上下文中我进行的编辑操作

自己编译 php8.3.4+swoole6-dev(当天最新) 性能问题没有出现了
wrk -t12 -c400 -d5s "http://127.0.0.1:4455/"

阿里云 8h8g 突发性能机器 SWOOLE_PROCESS 6w /rps SWOOLE_THREAD 17w/rps

阿里云 64h128g SWOOLE_PROCESS 96w /rps SWOOLE_THREAD 129w/rps

其中64核高配机器报了一个错误 php: /root/swoole-src/src/server/master.cc:1364: int swoole::Server::send_to_connection(swoole::SendData*): Assertionfd % reactor_num == reactor->id' failed.

Aborted `

` 'reactor_num' => swoole_cpu_num()*4,#reactor线程数 1-4倍 默认值:CPU 核数

'worker_num'            => swoole_cpu_num()*4+1,#CPU核数的1-4倍 默认值:CPU 核数 Swoole 6多线程版本

`

改成 ` 'reactor_num' => swoole_cpu_num()*4,#reactor线程数 1-4倍 默认值:CPU 核数

'worker_num'            => swoole_cpu_num()*4,#CPU核数的1-4倍 默认值:CPU 核数 Swoole 6多线程版本

`

把reactor_num 和worker_num 改成一样就没报错了

至于性能问题是旧swoole代码 或者php8.1* 引起暂时未知