Closed mrAndersen closed 10 months ago
@mrAndersen When the server runs in SWOOLE_BASE mode and a worker process crashes with a fatal error, the connection num becomes inaccurate. This has been fixed in the latest code:
https://github.com/swoole/swoole-src/commit/a71a03455cdefd803c8fc40b8688998c5edfb6f0
@matyhtf Maybe this is not a cosmetic defect but a real connection leak (zombies)?
In my case, this is exactly what happened, and I also checked it in Swoole 5.1.0-dev.
The only thing I was able to find is this: in base mode, if a message in the server buffer (for example, a websocket message) has not been completely sent to the client, and at that moment the connection is closed by neither the client nor the server, then the server calls the close callback but does not close the socket. Instead, it puts the unsent data in a queue that waits indefinitely to be sent. This happens at this point in the code: https://github.com/swoole/swoole-src/blob/2edc4d99dc35f083751f669fb12d335ec9e3b301/src/server/base.cc#L155-L163
While tracing the path to this point, I also noticed that this function is called along the way: https://github.com/swoole/swoole-src/blob/2edc4d99dc35f083751f669fb12d335ec9e3b301/src/server/worker.cc#L158
which is the function in: https://github.com/swoole/swoole-src/blob/2edc4d99dc35f083751f669fb12d335ec9e3b301/src/server/base.cc#L87
I don’t know whether it is a bug that false is passed in place of the int flags variable, but replacing false with a specific value could disable the infinite wait for sending after an unknown connection break and thereby eliminate the socket leak. In general, I still don’t understand why a buffer for a deferred-close connection is needed, perhaps to serve HTTP connection-closed requests, but after all, control over the socket is lost.

In my case, the server setting 'heartbeat_idle_time' => 60 helped eliminate the connection leak: it times out the zombie connections after they are closed :) But I believe this is not an ideal way to combat these zombie sockets, since a legitimately inactive connection will also be disconnected after a while.
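
For anyone who wants to try the same workaround, here is a minimal sketch of where that setting goes. Only 'heartbeat_idle_time' => 60 comes from my setup as described above; the host, the port, the echo handler, and the 'heartbeat_check_interval' value are assumptions added for illustration, so adjust them to your server.

```php
<?php
// Sketch only: host, port, and the echo handler are placeholders, not from the original report.
$server = new Swoole\WebSocket\Server('0.0.0.0', 9501, SWOOLE_BASE);

$server->set([
    // Assumed value: how often (in seconds) the server scans for idle connections.
    'heartbeat_check_interval' => 30,
    // From the comment above: close connections that have sent no data for 60 seconds.
    'heartbeat_idle_time' => 60,
]);

// A WebSocket server needs a message callback; this one simply echoes the frame back.
$server->on('message', function (Swoole\WebSocket\Server $server, Swoole\WebSocket\Frame $frame) {
    $server->push($frame->fd, $frame->data);
});

$server->start();
```

Keep in mind the trade-off mentioned above: clients that are merely quiet for longer than the idle time will be dropped too, so they would need to send something periodically (a ping, for example) to stay connected.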
@mrAndersen @XDRiVE888 Hi. I have opened a PR for it, see https://github.com/swoole/swoole-src/pull/5149
@NathanFreeman Thank you, I tested it myself and this problem really disappeared, at least for me)
I am getting these stats in production: ![image](https://github.com/swoole/swoole-src/assets/2115147/7a966b32-01d0-4f19-b9c6-fde093377821)
bff_swoole_connection_num is the connection_num metric from the HTTP server's stats() method, and the other metrics are named similarly. Swoole version: 5.0.3. I am also getting this error:
[FATAL ERROR]: all coroutines (count: 1) are asleep - deadlock!
The server was started with the following parameters: