rmmoul opened this issue 9 years ago
@i3bitcoin how many connections did you achieve?
@jupitern
More than 3k connections right now. It's the only solution that worked for me.
I believe it's limited only by rlimit.
@i3bitcoin is there any way I can contact you personally about setting up HHVM with Ratchet chat? I'm stuck with the same 1024 connection limit.
@josephmiller2000, this post may possibly help you: https://github.com/ratchetphp/Ratchet/issues/328#issuecomment-484266082
The main quick fix is to use any other event loop implementation instead of the default React\EventLoop\StreamSelectLoop.
@inri13666 Well, I'm using the "event.so" extension and tested with this method:
https://github.com/ratchetphp/Ratchet/issues/300#issuecomment-318351931
Ev is not detected by PHP, so right now I'm using "event.so" instead of StreamSelectLoop.
I've increased all the server-side limits and php-fpm limits, but still can't get past 1024 connections at peak time.
The sockets sit in CLOSE_WAIT when users are connected to the chat.
So I planned to move to HHVM instead of plain PHP.
The easiest solution to this is to install ext-uv and make sure you're running the latest react/event-loop, which has support for it. (And use the Factory::create() method to get your event loop, of course.)
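For reference, a minimal bootstrap along those lines could look roughly like this (a sketch only: it assumes a reasonably recent react/socket, and MyChat stands in for your own MessageComponentInterface implementation):

<?php
// server.php -- hand Ratchet an explicit ReactPHP event loop.
// Factory::create() returns an extension-backed loop (ext-uv, ext-ev, ext-event, ...)
// when one is loaded and only falls back to StreamSelectLoop otherwise.
use Ratchet\Http\HttpServer;
use Ratchet\Server\IoServer;
use Ratchet\WebSocket\WsServer;

require __DIR__ . '/vendor/autoload.php';

$loop   = React\EventLoop\Factory::create();
$socket = new React\Socket\Server('0.0.0.0:8080', $loop); // plain TCP; terminate SSL on the proxy

$server = new IoServer(
    new HttpServer(new WsServer(new MyChat())),
    $socket,
    $loop
);

$server->run();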
OK, could you please share the result of
php -r "require_once 'vendor/autoload.php'; var_dump(\React\EventLoop\Factory::create());"
For my configuration it's:
D:\_dev\sites\private\event-loop>php -r "require_once 'vendor/autoload.php'; var_dump(\React\EventLoop\Factory::create());"
Command line code:1:
class React\EventLoop\ExtEventLoop#3 (14) {
  private $eventBase =>
  class EventBase#4 (0) {
  }
  private $futureTickQueue =>
  class React\EventLoop\Tick\FutureTickQueue#5 (1) {
    private $queue =>
    class SplQueue#6 (2) {
      private $flags =>
      int(4)
      private $dllist =>
      array(0) {
        ...
      }
    }
  }
...
Here you go @inri13666
root@vps652855:# php -r "require_once 'vendor/autoload.php'; var_dump(\React\EventLoop\Factory::create());"
object(React\EventLoop\ExtEventLoop)#3 (11) {
  ["eventBase":"React\EventLoop\ExtEventLoop":private]=>
  object(EventBase)#2 (0) {
  }
  ["nextTickQueue":"React\EventLoop\ExtEventLoop":private]=>
  object(React\EventLoop\Tick\NextTickQueue)#4 (2) {
    ["eventLoop":"React\EventLoop\Tick\NextTickQueue":private]=>
    *RECURSION*
    ["queue":"React\EventLoop\Tick\NextTickQueue":private]=>
    object(SplQueue)#5 (2) {
      ["flags":"SplDoublyLinkedList":private]=>
      int(4)
      ["dllist":"SplDoublyLinkedList":private]=>
      array(0) {
      }
    }
  }
  ["futureTickQueue":"React\EventLoop\ExtEventLoop":private]=>
  object(React\EventLoop\Tick\FutureTickQueue)#6 (2) {
    ["eventLoop":"React\EventLoop\Tick\FutureTickQueue":private]=>
    *RECURSION*
    ["queue":"React\EventLoop\Tick\FutureTickQueue":private]=>
    object(SplQueue)#7 (2) {
      ["flags":"SplDoublyLinkedList":private]=>
      int(4)
      ["dllist":"SplDoublyLinkedList":private]=>
      array(0) {
      }
    }
  }
  ["timerCallback":"React\EventLoop\ExtEventLoop":private]=>
  object(Closure)#9 (2) {
    ["this"]=>
    *RECURSION*
    ["parameter"]=>
    array(3) {
      ["$_"]=>
      string(10) "<required>"
      ["$__"]=>
      string(10) "<required>"
      ["$timer"]=>
      string(10) "<required>"
    }
  }
  ["timerEvents":"React\EventLoop\ExtEventLoop":private]=>
  object(SplObjectStorage)#8 (1) {
    ["storage":"SplObjectStorage":private]=>
    array(0) {
    }
  }
  ["streamCallback":"React\EventLoop\ExtEventLoop":private]=>
  object(Closure)#10 (2) {
    ["this"]=>
    *RECURSION*
    ["parameter"]=>
    array(2) {
      ["$stream"]=>
      string(10) "<required>"
      ["$flags"]=>
      string(10) "<required>"
    }
  }
  ["streamEvents":"React\EventLoop\ExtEventLoop":private]=>
  array(0) {
  }
  ["streamFlags":"React\EventLoop\ExtEventLoop":private]=>
  array(0) {
  }
  ["readListeners":"React\EventLoop\ExtEventLoop":private]=>
  array(0) {
  }
  ["writeListeners":"React\EventLoop\ExtEventLoop":private]=>
  array(0) {
  }
  ["running":"React\EventLoop\ExtEventLoop":private]=>
  NULL
}
@josephmiller2000, I'm running the socket server behind nginx:
worker_processes auto;
worker_rlimit_nofile 40000; # Important

events {
    worker_connections 40000; # Important
    multi_accept on;          # Important
    use epoll;                # Important
}

server {
    server_name _;
    listen 8000 default_server;
    listen [::]:8000 default_server;
    root /home/site/wwwroot/web;
    error_log /home/LogFiles/nginx-error.log;
    access_log /home/LogFiles/nginx-access.log;

    location ~ ^/ws(/|$)$ {
        proxy_pass http://127.0.0.1:8080;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
        proxy_buffer_size 128k;
        proxy_buffers 4 256k;
        proxy_busy_buffers_size 256k;
    }
}
Anyway, I figured out how to install ext-uv and got everything up and working.
This is the maximum number of connections I can get, whatever event loop I use. I've increased the server limits and done everything on my side. The script is even using ZeroMQ now.
With CentOS and 2 GB of RAM, uv installed, and a Node socket client sending connections from another machine at my company, we are reaching 20k connections. We just don't get more because all the RAM is in use.
node client => https://github.com/jupitern/node-socket-client
Just got hit with this and was eventually able to work around it. Wanted to share what all I went through in case it helps someone else down the line, because it took me two frustrating days with angry clients to resolve completely. For reference, we're running Ratchet with an Apache 2.4 reverse proxy on PHP 7.0, all running on Ubuntu 16.04. The Ratchet script is kept running by a supervisor task, ensuring that it restarts if it ever crashes. The Ratchet script is pretty straightforward; it interacts with an API on connection or when receiving certain messages, and contains a timer to hit the API for some data to send to specific clients (maintained by a user -> client map). Ratchet was maxing out at around 500 connections when we started.
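For context, the shape of that script is roughly the following. This is an illustrative sketch rather than our actual code; the auth message format, the userMap, and pushToUser() are made up for the example:

<?php
// MyChat.php -- illustrative component. Keeps every connection in an
// SplObjectStorage plus a user => connection map so a periodic timer in the
// bootstrap can push API data to specific clients.
use Ratchet\ConnectionInterface;
use Ratchet\MessageComponentInterface;

class MyChat implements MessageComponentInterface
{
    public $clients;          // all connected clients
    private $userMap = [];    // user ID => ConnectionInterface

    public function __construct()
    {
        $this->clients = new \SplObjectStorage();
    }

    public function onOpen(ConnectionInterface $conn)
    {
        $this->clients->attach($conn);
    }

    public function onMessage(ConnectionInterface $from, $msg)
    {
        $data = json_decode($msg, true);
        // Hypothetical auth message that ties this socket to a user ID.
        if (isset($data['type'], $data['userId']) && $data['type'] === 'auth') {
            $this->userMap[$data['userId']] = $from;
        }
    }

    public function onClose(ConnectionInterface $conn)
    {
        $this->clients->detach($conn);
        if (($userId = array_search($conn, $this->userMap, true)) !== false) {
            unset($this->userMap[$userId]);
        }
    }

    public function onError(ConnectionInterface $conn, \Exception $e)
    {
        $conn->close();
    }

    // Called from a periodic timer in the bootstrap to send API data to one user.
    public function pushToUser($userId, array $payload)
    {
        if (isset($this->userMap[$userId])) {
            $this->userMap[$userId]->send(json_encode($payload));
        }
    }
}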
First thing we noticed was Apache redlining both cores of the server. Ideally we'd move to better server software like nginx, but our app currently prevents that. We also have to use a reverse proxy for SSL. We tried to use the underlying React library to run a WSS server directly without needing Apache/nginx, but weren't able to get it working correctly.
Bumping the server up to 4 cores gave enough resources to run Apache comfortably. From there we noticed that we'd still get 500 errors periodically, and some investigation into Apache revealed that it was tuned poorly and would cap out at a few hundred concurrent connections. Since each WebSocket counts as a connection, these would quickly eat up the available threads and prevent Apache from serving other traffic (other PHP scripts and static content). We were already using mpm_event, and updated our config to the following:
<IfModule mpm_event_module>
    StartServers            10
    MinSpareThreads         25
    MaxSpareThreads         750
    ThreadLimit             1000
    ThreadsPerChild         750
    # MaxRequestWorkers aka MaxClients => ServerLimit * ThreadsPerChild
    MaxRequestWorkers       15000
    MaxConnectionsPerChild  0
    ServerLimit             20
    ThreadStackSize         524288
</IfModule>
Stress testing the server after this showed we could comfortably maintain thousands of requests a minute without any issue, which is well over what we needed to serve.
From there, we noticed that while Apache was running fine, the Ratchet script was now redlining with only a few hundred connections. Various searching led to the well-documented StreamSelectLoop issue. We ruled out LibEvent due to using PHP 7.0, and weren't able to get LibUv to install without errors, so settled on LibEv with the following:
sudo pecl install ev
echo 'extension=ev.so' | sudo tee /etc/php/7.0/mods-available/ev.ini
sudo phpenmod ev
sudo service php7.0-fpm restart # Not needed as the Ratchet script is CLI, but better to see if this causes FPM issues now than later
Running a second instance of the Ratchet script that would initialize and then execute die(get_class($server->loop)); verified that the server was no longer running with a StreamSelectLoop and was instead using an ExtEvLoop. We restarted the Ratchet script and let clients begin to auto-reconnect (our client-side script will retry with an increasing delay per attempt), figuring we could watch the script as they reconnected for any performance issues. Everything ran fine, with the Ratchet script taking no more than 25% of a core, until about 15 minutes later when it began to redline again. At this point, attempting to open a new connection would hang for a few minutes before failing.
We attempted to connect directly to the Ratchet script from the server itself (i.e., bypassing Apache) to see if we could connect:
curl \
-o - \
--http1.1 \
--include \
--no-buffer \
--header "Connection: Upgrade" \
--header "Upgrade: websocket" \
--header "Host: localhost:8080" \
--header "Origin: http://localhost:8080" \
--header "Sec-WebSocket-Key: SGVsbG8sIHdvcmxkIQ==" \
--header "Sec-WebSocket-Version: 13" \
http://localhost:8080/
This would also hang and then fail. When we restarted the Ratchet script, we could use the above to connect immediately, but once it started redlining we could not. This indicated that Apache was fine, and the limit was on the Ratchet script.
We updated the script to output the number of connected clients on tick (a rough sketch of that logging is at the end of this comment) and restarted, which would get to 1017 and no higher. This was conspicuously close to 1024, so we assumed it was some form of system limit. We checked the overall system limits using ulimit -a and saw no issues. However, we checked the actual process limits with the following:
ps aux | grep RatchetScript.php # Record PID from this command
cat /proc/<PID>/limits
and saw that the process was limited to 1024 (soft) / 4096 (hard) max open files. Updating this with
prlimit --pid <PID> --nofile=500000:5000000
and checking the log verified that once these limits were raised, we were able to handle an additional several thousand connections, after which we could still connect via a browser to our app or via the cURL request above with no issue.
We figured this was a user-limit issue (the script does not run as the web user) and updated /etc/security/limits.conf with the user running the script and restarted, but saw that the limits were reset. We also attempted to run sudo su - ratchetuser -c 'ulimit -a' to see if that needed to be updated for the user, but those limits also appeared fine. After some further digging, we came across an article saying the 1024 / 4096 limit is enforced by supervisor, after which we updated /etc/supervisor/supervisord.conf with the following:
[supervisord]
....
minfds=500000
Restarting verified that the limits were maintained on the Ratchet script. The Ratchet script is now handling ~2,500 connections and using about 10% of one core, with small spikes here and there (mainly on client connection, as we have to decrypt connection data).
I imagine that the redlining occurs when Ratchet basically deadlocks waiting on a file handle that can't be created, but I haven't been able to verify this yet. It would explain why performance drops so drastically until those connections are able to be properly created and maintained.
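For anyone who wants to reproduce the per-tick connection logging mentioned above, here's a rough sketch (assuming a component like the MyChat placeholder used earlier in this thread, which keeps its connections in a public SplObjectStorage):

<?php
// count-on-tick.php -- the usual bootstrap plus a periodic timer that logs how
// many clients are connected; this is what made the ~1017 connection ceiling visible.
require __DIR__ . '/vendor/autoload.php';

$loop   = React\EventLoop\Factory::create();
$chat   = new MyChat();
$socket = new React\Socket\Server('0.0.0.0:8080', $loop);

$server = new Ratchet\Server\IoServer(
    new Ratchet\Http\HttpServer(new Ratchet\WebSocket\WsServer($chat)),
    $socket,
    $loop
);

// Print the current connection count every 5 seconds.
$loop->addPeriodicTimer(5, function () use ($chat) {
    echo date('c') . ' connected clients: ' . count($chat->clients) . PHP_EOL;
});

$server->run();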
I had an experience which may help somebody. My server stopped responding after one hour, when the number of concurrent socket connections reached about 700. After trying all of the possible solutions, I realized that I had a ProxyPass in Apache which redirects port 443 (SSL) to 8080 (my socket port). Finally, I increased the ServerLimit in my Apache prefork configuration from 700 to 1700 and the problem was solved, at least temporarily. This shows that if you use Apache's ProxyPass (or another web server's equivalent), Apache itself gets busy, as it sits between the client and the WebSocket server.
I had that problem with ReactPHP; the root cause runs deeper. It lies in how PHP services socket events. The code was rewritten in C++ using epoll instead of select. stackoverflow
I started a project using Ratchet, and wanted to test the number of connections that could be handled at one time on our server (DigitalOcean, Ubuntu 14.04, 2 cores, 4 GB RAM, running PHP 5.6.7 and Apache 2.4.7).
I followed some of the suggestions on the deploy page (http://socketo.me/docs/deploy) to help increase the number of connections that could be handled, and managed to get the ulimit and related settings to raise the number of open files to 10,000.
I started running tests today using thor (https://github.com/observing/thor):
I got a php error when the number of connections exceeded 1024:
I was actually using php 5.5.9 at the time, so I followed some old instructions from http://ubuntuforums.org/archive/index.php/t-2130554.html and increased the FD_SETSIZE value to 10000 in the following two files and then downloaded and compiled php 5.6.7.
That coupled with using this command to run the server through supervisor:
Seems to have allowed the number of connections to go beyond 1024, but now it causes a buffer overflow within php, showing this error in the log file before restarting the process:
I'm curious how other users are getting beyond 1024 concurrent connections, whether some of you have never hit this limit at all (could you share your environment details), or made certain changes to get beyond it (could you share what changes you've made)?