swoole / swoole-src

🚀 Coroutine-based concurrency library for PHP
https://www.swoole.com
Apache License 2.0

\Swoole\Http\Server reports a deadlock in a non-coroutine environment ('enable_coroutine' => false) #4272

Closed dongzitai closed 3 years ago

dongzitai commented 3 years ago

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a simple script for reproducing the error.

```php
<?php

$map = [];

$serv = new \Swoole\Http\Server("127.0.0.1", 9502, SWOOLE_BASE, SWOOLE_SOCK_TCP);

$serv->set(array(
    'worker_num' => 1,
    'daemonize' => false,
    'backlog' => 128,
    'enable_coroutine' => false
));

$serv->on('Start', 'start');
$serv->on('WorkerStart', 'workerStart');
$serv->on('request', 'request');
$serv->start();

function start(Swoole\Server $server)
{
    echo 'server started' . PHP_EOL;
}

function workerStart(Swoole\Server $server, int $workerId)
{
    echo 'worker started ' . $workerId . PHP_EOL;
    global $map;
    for ($i = 0; $i < 2; $i++) {
        $process = new \Swoole\Process(function (\Swoole\Process $process) {
            $i = 0;
            while (true) {
                sleep(1);
                echo 'PID: ' . $process->pid . ' count: ' . ++$i . PHP_EOL;
            }
        });
        $process->start();

        $processName = 'process'.$i;

        $map[$processName] = $process->pid;

        // A waiting coroutine is created inside the loop, before the next fork.
        Swoole\Coroutine::create(function () use ($process) {
            $status = \Swoole\Coroutine\System::waitPid($process->pid);
            var_dump($status);
        });
    }
}

function request($request, $response)
{
    $response->end("Hello Swoole. #" . rand(1000, 9999));
}
```


2. What did you expect to see?
No `deadlock` should be reported. What causes the `deadlock` here?

3. What did you see instead?
A deadlock message:

===================================================================
[FATAL ERROR]: all coroutines (count: 1) are asleep - deadlock!

[Coroutine-1]

0 Swoole\Coroutine\System::waitPid() called at [/Users/dongzt/swoole-pro/amqp-test/dongzt/swoole-http-server.php:44]

If the `workerStart` callback is written a different way, the deadlock message does not appear. For example:
```php
function workerStart(Swoole\Server $server, int $workerId)
{
    echo 'worker started ' . $workerId . PHP_EOL;
    global $map;
    for ($i = 0; $i < 2; $i++) {
        $process = new \Swoole\Process(function (\Swoole\Process $process) {
            $i = 0;
            while (true) {
                sleep(1);
                echo 'PID: ' . $process->pid . ' count: ' . ++$i . PHP_EOL;
            }
        });
        $process->start();

        $processName = 'process'.$i;

        $map[$processName] = $process->pid;

    }

    foreach ($map as $processName => $processId) {
        Swoole\Coroutine::create(function () use ($processId) {
            $status = \Swoole\Coroutine\System::waitPid($processId);
            var_dump($status);
        });
    }
}
```

  4. What version of Swoole are you using (show your php --ri swoole)?
swoole

Swoole => enabled
Author => Swoole Team <team@swoole.com>
Version => 4.6.3
Built => Feb 25 2021 23:29:51
coroutine => enabled with boost asm context
kqueue => enabled
rwlock => enabled
sockets => enabled
openssl => OpenSSL 1.1.1j  16 Feb 2021
dtls => enabled
http2 => enabled
json => enabled
curl-native => enabled
pcre => enabled
zlib => 1.2.11
brotli => E16777225/D16777225
mysqlnd => enabled
async_redis => enabled

Directive => Local Value => Master Value
swoole.enable_coroutine => On => On
swoole.enable_library => On => On
swoole.enable_preemptive_scheduler => Off => Off
swoole.display_errors => On => On
swoole.use_shortname => Off => Off
swoole.unixsock_buffer_size => 262144 => 262144
  5. What is your machine environment used (show your uname -a & php -v & gcc -v) ?

    Darwin dongztdeMacBook-Pro.local 20.3.0 Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64 x86_64

    PHP 7.4.19 (cli) (built: May 13 2021 06:28:47) ( NTS )
    Copyright (c) The PHP Group
    Zend Engine v3.4.0, Copyright (c) Zend Technologies
        with Yasd v0.3.9-alpha, Our Copyright, by codinghuang
        with Zend OPcache v7.4.19, Copyright (c), by Zend Technologies

    Apple clang version 12.0.0 (clang-1200.0.32.29)
    Target: x86_64-apple-darwin20.3.0
    Thread model: posix
    InstalledDir: /Library/Developer/CommandLineTools/usr/bin

sy-records commented 3 years ago

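Your `enable_coroutine` setting only turns off the coroutines in event callbacks. Your `waitPid` call, however, runs in a new coroutine that you created yourself, which is why this error appears. It has nothing to do with running in a non-coroutine environment.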

huanghantao commented 3 years ago

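When the Swoole core cleans up the Reactor (you can think of it as the event manager), it invokes a deadlock-detection callback. If any coroutine has not exited at that point (for example, the coroutine here that is suspended by the `waitPid` call), this deadlock error is reported. Two points are key here:

  1. When does your program clean up the Reactor? Currently, when `Process::start` is called to fork a child process, the child cleans up the Reactor it "inherited" from the parent.
  2. When is the deadlock-detection function registered? Currently, it is registered when a coroutine is created.

So your program is equivalent to: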

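```php
<?php

use Swoole\Coroutine;

// This coroutine suspends itself and is never resumed.
Swoole\Coroutine::create(function () {
    Coroutine::yield();
});

$process = new \Swoole\Process(function (\Swoole\Process $process) {
    $i = 0;
    while (true) {
        sleep(1);
        echo 'PID: ' . $process->pid . ' count: ' . ++$i . PHP_EOL;
    }
});

// start() forks; the child cleans up the inherited Reactor,
// which triggers the deadlock check while the coroutine is still suspended.
$process->start();
$process->wait();
```
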
dongzitai commented 3 years ago

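In the scenario I described above, does a newly created child process inherit the suspended coroutine? When I used `Coroutine::list()` together with `Coroutine::getBackTrace($cid)` to collect coroutine information inside the new child process, I found the information for the coroutine that was suspended in the parent process:

```php
Swoole\Coroutine::create(function () use ($process) {
    $status = \Swoole\Coroutine\System::waitPid($process->pid);
    var_dump($status);
});
```
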
huanghantao commented 3 years ago

Yes. For example, the number of coroutines in the parent process is also inherited by the child process.

dongzitai commented 3 years ago

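Is it only the coroutine count and stack information that are inherited? A running coroutine would not be inherited, would it? What I mean is: could the parent and the child end up running two identical coroutines at the same time?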

huanghantao commented 3 years ago

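How should I put it... this is a rather nasty question. Run this script and see for yourself:

```php
<?php

use Swoole\Coroutine;

$cid1 = Swoole\Coroutine::create(function () {
    Coroutine::yield();
    var_dump("process " . posix_getpid() . " coroutine " . Coroutine::getCid());
});

$process = new \Swoole\Process(function (\Swoole\Process $process) use ($cid1) {
    // Resume the inherited copy of the coroutine inside the child.
    Coroutine::resume($cid1);
});

$process->start();

// Resume the original coroutine in the parent.
Coroutine::resume($cid1);

$process->wait();
```

You can resume the suspended coroutine in both the child and the parent, but strictly speaking they are no longer the same coroutine. To understand this, you have to think about coroutines in terms of memory.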
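
To make the memory point concrete, here is a minimal sketch that uses an ordinary variable in place of a coroutine. It assumes only that `Process::start` forks the current process, as described earlier in this thread; `$counter` is a name made up for the illustration.

```php
<?php

use Swoole\Process;

// Hypothetical state that exists before the fork.
$counter = 1;

$process = new Process(function (Process $process) use (&$counter) {
    // After the fork the child works on its own copy of the parent's memory,
    // so this write is invisible to the parent.
    $counter = 100;
    echo 'child sees ' . $counter . PHP_EOL;
});

$process->start();

// Reap the child so the parent prints after it.
Process::wait(true);

echo 'parent still sees ' . $counter . PHP_EOL; // prints "parent still sees 1"
```

A suspended coroutine is also just state held in the process's memory, so the same reasoning would explain why the child ends up with its own copy of it rather than sharing the parent's.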

dongzitai commented 3 years ago

So in the scenario I just described, the suspended coroutine that the child process inherits, `\Swoole\Coroutine\System::waitPid($process->pid);`, will never be released?

matyhtf commented 3 years ago

The problem here is forking while coroutines exist. If everything is synchronous and blocking, I recommend using `Process::wait()` directly.
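
A minimal sketch of that suggestion, assuming a standalone script in which no coroutine is ever created. The `$workers` map and the one-second `sleep` are made up for illustration, and unlike the endless loops in the original report, these children exit so that there is something to reap.

```php
<?php

use Swoole\Process;

// Hypothetical bookkeeping: child PID => label.
$workers = [];

for ($i = 0; $i < 2; $i++) {
    $process = new Process(function (Process $process) {
        // Plain blocking work; no coroutine is created anywhere.
        sleep(1);
        echo 'child ' . $process->pid . ' finished' . PHP_EOL;
    });

    // start() forks the child and returns its PID in the parent.
    $pid = $process->start();
    if ($pid === false) {
        exit('failed to start child process' . PHP_EOL);
    }
    $workers[$pid] = 'process' . $i;
}

// Block until every child has exited and been reaped.
while ($workers) {
    $status = Process::wait(true);
    if ($status === false) {
        break; // no child left to wait for
    }
    echo $workers[$status['pid']] . ' exited with code ' . $status['code'] . PHP_EOL;
    unset($workers[$status['pid']]);
}
```

Because no coroutine exists when `start()` forks, the child has no suspended coroutine to inherit, so the deadlock check discussed above should have nothing to report.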

dongzitai commented 3 years ago

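Suppose I manage processes inside the worker process (using `\Swoole\Process`). In my tests, even without calling `Process::wait()` or `\Swoole\Coroutine\System::waitPid`, the child processes did not turn into zombies. Is that because the worker keeps running and never exits, or is there some other reason? In the context of a running worker, would not calling `Process::wait()` or `\Swoole\Coroutine\System::waitPid` cause any problems?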

sy-records commented 3 years ago

@dongzitai You can add me on WeChat (85464277) to discuss this.