Closed ozgurhangisi closed 5 years ago
Hmm... My Swoole server has been running for the same length of time, precisely to test such things, but there were no problems.
Thanks for that, I will do a longer run :+1:
btw: https://wiki.swoole.com/wiki/page/p-LinuxSignal.html https://wiki.swoole.com/wiki/page/172.html
Thanks for the link. According to the link you sent, the error seems to be related to file descriptors. It generally happens at night, between 03:00 and 06:00, when we don't have much traffic. I checked the open file count on the Swoole server with the lsof | wc -l command. At peak time it's around 2000, but when we had the problem it was 1413. I have 3 Swoole servers. All 3 get (almost) equal traffic from the load balancer, but some of them stop after 2 days and some after 1 week. The error is not consistent, and everything looks normal on the server.
We have ~60 nginx php-fpm servers and we are trying to convert them to Swoole, but we couldn't solve this problem or find the reason for it.
By the way, this error only happens on the Swoole servers. If you want me to check something on the broken server, I can send the information you need.
If possible, please use this issue template: https://github.com/swoole/swoole-src/issues/2000. It is meant for cases like this, to help you provide the necessary information and get faster support from the Swoole team :)
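A note on the measurement: lsof | wc -l counts entries for every process on the machine, so it may hide what a single Swoole process is doing. A minimal sketch for checking one process against its own limit, assuming a Linux host with /proc; `fd_count`, `fd_limit` and the `PROC_ROOT` override are hypothetical names used only for illustration:

```shell
# Linux only: reads the per-process entries under /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"   # overridable, mainly so the helpers can be tested

# Print the number of file descriptors currently open in process $1.
fd_count() {
  ls "${PROC_ROOT}/$1/fd" 2>/dev/null | wc -l
}

# Print the soft "Max open files" limit of process $1.
fd_limit() {
  awk '/^Max open files/ { print $4 }' "${PROC_ROOT}/$1/limits"
}
```

Calling `fd_count <pid>` and `fd_limit <pid>` with the PID of the Swoole manager or worker (reading another user's process may need root) shows how close that process is to its limit.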
Signal 9 is SIGKILL. It's not a bug; it's probably a mistake on your side. Does any other process kill your Swoole manager process? Check that; it may be an error in your code.
Hi,
The code does not send any Linux signals. It just connects to memcached, redis, db, etc. and sends the results.
It's a standard AWS server; I only installed php72 and some extensions. I will check whether any other process sends SIGKILL to Swoole.
Thanks.
Can we catch it? Maybe, this would help?
@flddr
There are two signals which cannot be intercepted and handled: SIGKILL and SIGSTOP. https://en.wikipedia.org/wiki/Signal_(IPC)#SIGKILL
If you could catch it and ignore it, you might never be able to kill the process.
I checked my code again and there is no signal command. Here is how I start the Swoole server. Maybe you can see something that I should fix:
    $this->wisObjects['webserver']['obj'] = new swoole_http_server('127.0.0.1', 80);
    $this->wisObjects['webserver']['obj']->set(['log_file' => '/var/log/wisswoole']);
    $this->wisObjects['webserver']['obj']->set(['worker_num' => 1]);
    $this->wisObjects['webserver']['obj']->set(['open_tcp_nodelay' => true]);
    $this->wisObjects['webserver']['obj']->set(['daemonize' => 2]);
    swoole_async_set([
        "enable_reuse_port" => true,
    ]);
    $this->wisObjects['webserver']['obj']->on('request', [$this, 'httpRequest']);
    $this->wisObjects['webserver']['obj']->on('workerstart', [$this, 'httpWorkerStart']);
    $this->wisObjects['webserver']['obj']->on('workerstop', [$this, 'httpWorkerStop']);
    $this->wisObjects['webserver']['obj']->start();
I will install the same code on a machine with 2 GB RAM and 2 CPUs. It currently runs on a machine with 1 GB RAM and 1 CPU. Our system gets a lot of traffic, so maybe the RAM or CPU is not enough for the processes.
I will let you know in a week whether it's related to the server's RAM or CPU.
Swoole is a lifesaver for companies that work with PHP and get thousands of requests per second. Our performance test results on the Swoole server are very good and I really want to use it. Sorry for bothering you so much, and thanks for the fast responses. We love Swoole :)
@ozgurhangisi @flddr Sorry, I had not noticed that there is also signal 11. Are you on the latest Swoole version? As @flddr says, read https://github.com/swoole/swoole-src/issues/2000#issuecomment-423807053 and trace your core file or try valgrind. Because the English documentation is insufficient, I know many of you are using the asynchronous APIs, but they are not the most advanced approach: the asynchronous API is slightly less stable than coroutines. I just wanted to let you know, as I guess this is a possible cause.
The Swoole version is 4.2.1. Yes, there are many signal=11 entries in the log file, but it starts with signal=9 and continues with signal=11 errors. I see the same thing in the log files on all 3 Swoole servers.
I installed Swoole from the yum Remi package, but I will try the steps in the link.
@ozgurhangisi I guess signal 9 is sent by the OS because of resource limits, while signal 11 is a segfault, which twose will be interested in so it can be handled correctly. So if you follow #2000 to get a core dump, they can look into this particular case :+1:
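One way to test the guess that the OS sent the SIGKILL because memory ran out is to search the kernel log for out-of-memory killer messages. A minimal sketch, assuming a Linux host where the kernel log is readable; `oom_kills` is a hypothetical helper name, and the exact message wording varies between kernel versions, so the pattern is deliberately loose:

```shell
# Filter kernel log lines on stdin down to those that look like OOM-killer activity.
oom_kills() {
  grep -i -E 'out of memory|oom-killer|killed process'
}

# Typical use (may need root):
#   dmesg -T | oom_kills
```

If the output names a php process at the time of the signal=9 entry in the Swoole log, that would point to memory pressure rather than another process sending the signal.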
Hi,
Good news: it's been 6 days and the Swoole servers are working well on the 2 GB RAM machines. As flddr said, it may be related to server resources. I want to wait 1 more week, and I will let you know whether the Swoole servers keep running correctly or stop working.
Thanks, Ozgur.
@ozgurhangisi That's great :+1: If it's possible and OK with you, at some point you could capture a core dump on the machine with fewer resources, to help the Swoole team find the bug behind
WARNING swManager_check_exit_status: worker#0 abnormal exit, status=0, signal=11
because this is a segfault :)
Have a look here if you want to help: https://github.com/swoole/swoole-src/issues/2002#issuecomment-423932192
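For anyone following the core dump suggestion: a core file is only written if the core size limit of the shell that starts the server is non-zero. A rough sketch, assuming a POSIX-like shell on Linux; `core_dumps_enabled` is a hypothetical helper, and `server.php` and `/path/to/core` are placeholders, not names from this thread:

```shell
# Succeeds if the current shell's core-size limit allows core files to be written.
core_dumps_enabled() {
  [ "$(ulimit -c)" != "0" ]
}

# Before starting the server from this shell:
#   ulimit -c unlimited                  # raise the limit for this shell and its children
#   core_dumps_enabled && php server.php
# After a signal=11 crash, if a core file was written:
#   cat /proc/sys/kernel/core_pattern    # shows where the kernel puts core files
#   gdb php /path/to/core                # then type `bt` at the gdb prompt for a backtrace
```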
I just wanted to add some extra information for people who have the same problem, because it took me 4 months to solve it. :( I updated to PHP 7.2.12 and the problem was solved. It's probably not related to Swoole. I believe the problem may be related to https://bugs.php.net/bug.php?id=76846
Thanks for your help guys.
Please answer these questions before submitting your issue. Thanks!
The Swoole server works well for a while, then suddenly stops responding, and I don't know why. I checked CPU usage, memory, open file descriptor count, request count, etc., and nothing seems to be wrong. I get the errors below:
[2018-10-02 01:03:38 $1972.0] WARNING swManager_check_exit_status: worker#0 abnormal exit, status=0, signal=9
After this error I get many "WARNING swManager_check_exit_status: worker#0 abnormal exit, status=0, signal=11" errors for around 3 hours. The last message is "WARNING swServer_signal_hanlder: Fatal Error: manager process exit. status=0, signal=9".
When I restart the server everything is normal for 3-4 days and it happens again.
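To see how often each signal appears, the log can be summarised by signal number. A minimal sketch, assuming the log format quoted above; `signal_summary` is a hypothetical helper name, and /var/log/wisswoole is the log_file path from the server settings posted in this thread:

```shell
# Print one line per signal number found in log file $1, as "signal=N count".
signal_summary() {
  grep -o 'signal=[0-9][0-9]*' "$1" | sort | uniq -c | awk '{ print $2, $1 }'
}

# Typical use:
#   signal_summary /var/log/wisswoole
```

A few signal=9 lines followed by a long run of signal=11 lines would match the pattern described here.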
I expect swoole server to run properly.
Error messages in the log file.
php --ri swoole
swoole support => enabled
Version => 4.2.1
Author => Swoole Group[email: team@swoole.com]
coroutine => enabled
trace-log => enabled
epoll => enabled
eventfd => enabled
signalfd => enabled
cpu affinity => enabled
spinlock => enabled
rwlock => enabled
sockets => enabled
openssl => enabled
http2 => enabled
pcre => enabled
zlib => enabled
mutex_timedlock => enabled
pthread_barrier => enabled
futex => enabled
mysqlnd => enabled
redis client => enabled

Directive => Local Value => Master Value
swoole.enable_coroutine => On => On
swoole.aio_thread_num => 2 => 2
swoole.display_errors => On => On
swoole.use_namespace => On => On
swoole.use_shortname => On => On
swoole.fast_serialize => Off => Off
swoole.unixsock_buffer_size => 8388608 => 8388608
AWS Linux t2.micro with 1 GB RAM.