Closed: ven0ms99 closed this issue 7 months ago
I have the same problem.
I restart the process every 6 hours, but it's not a real solution, and sometimes CPU usage goes really high.
Not a real answer, but does changing to the Redis channel manager give the same results? Are there really free file handles available? Check https://serverfault.com/a/448011 to monitor this; maybe you'll get a hint there.
Some more insights: it is definitely related to the amount of usage. When I have many users on the service, it becomes unavailable much faster. I also went back to stable 1.x, because 2.*@dev didn't actually help at all.
> Not a real answer, but does changing to the Redis channel manager give the same results? Are there really free file handles available? Check https://serverfault.com/a/448011 to monitor this; maybe you'll get a hint there.
I'll look into that and get back to you, thanks.
> I restart the process every 6 hours, but it's not a real solution, and sometimes CPU usage goes really high.
I used to do that. Now I have a cronjob doing a broadcast() every 10s, and whenever I get the error `BroadcastException: Failed to connect to Pusher` I immediately restart the service. Given the number of users I have at the moment, it doesn't last 6 hours anymore; sometimes less than 1 hour. But it sounds like you do have the same issue indeed.
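The probe-and-restart workaround described above can be sketched as a small shell script. The probe command and the supervisor program name below are assumptions for illustration, not from this thread; wire in whatever check surfaces the BroadcastException on your setup:

```shell
#!/bin/sh
# Sketch of the restart-on-failure workaround described above.
# The probe and restart commands are assumptions, not from this thread.

check_and_restart() {
    probe=$1     # command that exits 0 while the websocket server is healthy
    restart=$2   # command that restarts the worker
    if $probe >/dev/null 2>&1; then
        echo "healthy"
    else
        $restart >/dev/null 2>&1
        echo "restarted"
    fi
}

# Example wiring (plain cron fires at most once a minute, not every 10s):
# * * * * * /usr/local/bin/ws-health.sh
# check_and_restart "php artisan ws:ping" "supervisorctl restart websockets"
```

Restarting on failure treats the symptom, not the fd leak itself, but it keeps the window of unavailability short.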
Any updates?
@Hillcow Have you solved this problem? I've run into the same issue too, and I've already set the max open files limit to 50000.
In my test, using the command `ls -1 /proc/{PID}/fd | wc -l`,
the number of open files grew exponentially as users increased over time;
it showed 40000 open files in just an hour!
And of course that led to high CPU usage in a matter of time. The CPU usage and the open file count never go down until the worker (in supervisor) is restarted.
I inspected the open files every few minutes, and broadcasts start to fail with the "Failed to connect to Pusher" error when the open file count reaches > 4000.
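The fd check above can be wrapped in a small monitoring loop. The core is the same `ls -1 /proc/{PID}/fd | wc -l` from the comment; the pgrep pattern and the 60-second interval are assumptions:

```shell
#!/bin/sh
# Count open file descriptors of a process via /proc, as in the comment above.

count_fds() {
    # $1: PID of the process to inspect
    ls -1 "/proc/$1/fd" 2>/dev/null | wc -l
}

# Periodic logging sketch (pgrep pattern and interval are assumptions):
# PID=$(pgrep -of 'websockets:serve')
# while sleep 60; do
#     echo "$(date) fds=$(count_fds "$PID")"
# done

count_fds $$   # demo: fd count of this shell itself
```

Logging the count alongside a timestamp makes it easy to correlate the fd growth with the moment broadcasts start failing.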
I've had this issue for years now in production. The past couple of days I've been really trying to get to the bottom of it.
Problem: After running websockets:serve (via supervisor or manually, it doesn't matter) for a few hours, when I use broadcast() my server throws this error: Illuminate\Broadcasting\BroadcastException: Failed to connect to Pusher.

The fact is that after running for a few hours, users cannot connect to the service at all, or they don't receive messages (because of the given exception), all while the websockets debug dashboard still shows some subscriptions coming in, but at a very slow rate (if I were to restart websockets:serve there would be a lot more going on). However, most of the time it is not even possible to connect to the debug dashboard ("Channels current state is unavailable"). The command is still running, and thus supervisor does not restart the process.

This runs on a powerful, state-of-the-art dedicated web server, with at most around 500 concurrent connections, I'd guess (how can I monitor the number of concurrent connections?). On laravel-websockets 1.x I noticed the behaviour starting when the process reaches around 128-140 MB, but that must not necessarily be related. On 2.*@dev it usually starts around 300-450 MB.
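On the question of monitoring concurrent connections: one option is to count established TCP connections on the port the server listens on. Port 6001 is the laravel-websockets default and is an assumption here:

```shell
#!/bin/sh
# Count established TCP connections on a given local port with ss (iproute2).
# Port 6001 (the laravel-websockets default) is an assumption.

count_connections() {
    # -H: no header, -t: TCP, -n: numeric; filter on the local (source) port
    ss -Htn state established "( sport = :$1 )" | wc -l
}

# count_connections 6001
```

Sampling this alongside the fd count and process memory gives a rough picture of whether the failures track connection volume.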
What I have already tried
My setup
open file limit set in limits.d to 64000.
.env:
composer.json:
broadcasting.php:
websockets.php:
/config/flare.php:
bootstrap.js:
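For reference, the limits.d entry mentioned under "My setup" would look something like this; the file name and user are illustrative, and the limit must apply to the user the worker actually runs as (supervisord additionally has its own minfds setting):

```
# /etc/security/limits.d/websockets.conf  (path and user are illustrative)
www-data  soft  nofile  64000
www-data  hard  nofile  64000
```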
The error my server throws after some time:
Each signed in user subscribes to a number of channels for private messages, replies, etc.
The solution I have used for years now, restarting websockets:serve every hour, is no longer workable for me, because I'd like to implement a chat service and can't really keep restarting the process this often without annoying a lot of users. Also, do you have any idea why the process keeps getting bigger and bigger? It looks like a memory leak to me.
I have spent days on this issue now and have reached a point where I don't know what else to try. I'd appreciate any tips or advice.
Edit: might this actually be related to
QUEUE_CONNECTION=sync
? The thing is: the messages don't slow down gradually over time, but all of a sudden. Also: I have tried a different queue connection, and then the messages were heavily delayed in general.
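For context on that edit: with QUEUE_CONNECTION=sync, queued broadcast events run inline in the request that fires them, while a real queue driver hands them off to worker processes, which must be running for messages to go out (the delay observed above is typical when workers are missing or overloaded). A .env fragment, with redis as one possible driver (an assumption, not from this thread):

```
# .env
# sync: broadcasts run inline in the web request
# redis (or database, sqs, ...): broadcasts are handled by queue workers,
# started e.g. with `php artisan queue:work`
QUEUE_CONNECTION=redis
```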