Sorry, I was using the wrong profile...
Hi Jonathan! I have set up the database to control online visibility between users as we discussed, and now I'm getting these errors when I run the app...
1439928974153 - Origin: Worker (PID 5220)
[Notice] Error: Socket Error: Client pong timed out
at Emitter.SCSocket._onSCClose (X:\node_modules\socketcluster\lib\server\scsocket.js:173:17)
at null.
and after some time
1439929225921 - Origin: Broker (PID 4936)
[Error] Error: This socket is closed.
    at Socket._writeGeneric (net.js:656:19)
    at Socket._write (net.js:709:8)
    at doWrite (_stream_writable.js:301:12)
    at clearBuffer (_stream_writable.js:411:7)
    at onwrite (_stream_writable.js:344:7)
    at Socket.WritableState.onwrite (_stream_writable.js:105:5)
    at WriteWrap.afterWrite (net.js:787:12)
1439929228725 - Worker 2 exited - Exit code: 0
Any tips?
@waltergms Did this issue happen in production?
Notices like the first one can happen in production if a client's computer goes to sleep, for example (their connection could time out without closing properly). In that case this is normal, and you should expect some of these while running in production if you have lots of users. If this issue happened during testing, I would like to know what steps you used so that I can reproduce it.
The second one (Error) means that you are trying to write to a socket (which you still hold a reference to) after it has been disconnected. If you listen to socket.on('disconnect', handler) and make sure that you don't write to the socket again after this, then you shouldn't get the error. If you do, then that's a bug that needs to be fixed.
If you want to be really safe, you can check if (socket.state == socket.OPEN) before emitting events to a socket.
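A minimal sketch of that defensive check (the shim object below is an assumption standing in for a real SC server socket):

```javascript
// Defensive emit helper based on the advice above: only write to the
// socket when its state is OPEN.
function safeEmit(socket, event, data) {
  if (socket.state === socket.OPEN) {
    socket.emit(event, data);
    return true;
  }
  return false; // socket already closed; skip the write
}

// Shim standing in for a real SC socket (for illustration only).
const shim = {
  OPEN: 'open',
  state: 'closed',
  emit(event, data) { console.log('emitted', event); }
};

console.log(safeEmit(shim, 'hello', {})); // false: socket is closed
shim.state = shim.OPEN;
console.log(safeEmit(shim, 'hello', {})); // emits, then returns true
```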
Let me know if this helps.
I will make some changes to guard against writing to sockets that have already been closed, but I suspect these closed sockets are still receiving events published via "socket.global.publish", because I always use that method to send a message to a specific user or to a group (room) of users. On the client side, in the "disconnect" event handler I loop over the channels the user is subscribed to and remove them with "channel[x].unsubscribe(userRooms[y])", where channel[x] is the channel instance returned by socket.subscribe(userRooms[x]). While we're at it, one question: is there another way to send an event to a particular user without using global.publish?
@waltergms Yes, you can keep track of socketIds and users (e.g. keep track of them using an Object), but this approach is likely to become a problem as soon as you have to add a second worker or host - I would generally recommend against it but it depends on the use case.
You're not using socket.emit() from the server side at all? If not, then I will investigate socket.global.publish to try to identify if there is an issue there.
I'm using socket.emit only in cases where I have to return something to the same client that called the server, e.g. "make a socket.emit to the server to get some info for myself". For all communication between user and user, or server and users, I'm using global.publish.
Interesting. In theory it may be possible that a client could disconnect just before the server sends a response to the event from the client.
@waltergms I tried different things like turning off my connection (and intentionally causing pong timeout) before doing a publish and also just before responding to a client event but I wasn't able to reproduce the second error.
If you're getting the second error in production, it shouldn't affect your users - Basically, what is happening is that the SC server is trying to send a message to a socket which has already been closed.
If this error is triggered by SC internally (for example by global.publish(...)), then that's something I would like to fix - Can you find a way to reproduce this issue in a simple environment or it only happens in production?
Also, what is your client socketCluster.version? And your server version?
This happens in production only... I'm making some code changes; hopefully I will get a better understanding of what is causing the problems. About the versions, I believe I'm using the latest: "socketcluster": "2.3.x", "socketcluster-client": "2.3.x"
Do you have any code running in the broker.js file?
No, just the default code:
module.exports.run = function (broker) {
  console.log(' >> Broker PID:', process.pid);
};
In my test environment I'm getting disconnected after a few seconds... :|
Did you change the pingInterval or pingTimeout options?
Yes...
var socketCluster = new SocketCluster({
  workers: Number(argv.w) || 4,
  brokers: Number(argv.b) || 1,
  port: Number(argv.p) || 8090,
  appName: argv.n || null,
  workerController: __dirname + '/worker.js',
  brokerController: __dirname + '/broker.js',
  socketChannelLimit: 50,
  rebootWorkerOnCrash: argv['auto-reboot'] != false,
  pingTimeout: 5000,
  pingInterval: 2000
});
That looks OK
Using those ping options, do I need to put in my code a scheme for pinging the client from time to time?
var interval = setInterval(function () {
socket.emit('rand', {
rand: Math.floor(Math.random() * 5)
});
}, 1000);
You don't have to ping explicitly - SC handles the pinging for you.
What could be causing the client disconnection?
So long as the pingInterval is less than half of the pingTimeout it should be fine.
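That rule of thumb can be expressed as a simple guard (the function is illustrative, not part of SC):

```javascript
// Sanity check for the ping options: at least two pings should fit inside
// the pingTimeout window, so a single dropped pong doesn't disconnect a
// healthy client.
function validatePingOptions(opts) {
  return opts.pingInterval < opts.pingTimeout / 2;
}

console.log(validatePingOptions({ pingInterval: 2000, pingTimeout: 5000 })); // true
console.log(validatePingOptions({ pingInterval: 3000, pingTimeout: 5000 })); // false
```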
I noticed in your original error that it's coming from a broker process... Also, it looks like it's causing the worker to restart. I think this is different from the issue I was thinking about.
Looks like the disconnection problem has been fixed... :+1:
What did you do?
Yes... the worker is restarting... but I haven't implemented anything on the brokers...
I just restarted the server... in my environment...
Maybe a socket was trying to reconnect while you were restarting it in your test env... Not sure.
Regarding the original issue (in production) - You should check CPU and memory consumption of all processes to see if maybe there is a memory leak or another issue which is causing the OS to kill a process.
I need to go now. Let me know if you can reproduce it or if you find the cause.
1440015569109 - Origin: Worker (PID 7580)
[Error] Error: Failed to unbind socket from io cluster - nData Error - The unsubscribe action timed out
    at X:\node_modules\socketcluster\lib\server\scserver.js:235:32
    at bound (domain.js:254:14)
    at runBound (domain.js:267:12)
    at X:\node_modules\socketcluster\node_modules\iocluster\index.js:608:5
    at X:\node_modules\socketcluster\node_modules\iocluster\node_modules\async\lib\async.js:428:21
    at X:\node_modules\socketcluster\node_modules\iocluster\index.js:725:19
    at EventEmitter.
The scenario is: 24k channels, 1.3k users online.
Eventually the client receives a socket hang-up; I don't know why the connection to the server is being lost...
Uncaught Error: Socket hung up [ socketcluster.js:685 ]
@waltergms How many broker processes do you have and how busy are they? Can you check using the top command? You should try to keep them below 60% CPU usage in case of spikes.
That looks like something which might happen if a broker gets so busy that it can't keep up relaying channel data between your workers.
Your app seems to make heavy use of channels, so you should try to boost the number of brokers. When you launch your instance, you can pass the -b flag, e.g. node server -b 2 (this will launch 2 brokers). The more brokers you have, the less work each one will have to do.
I will try to simulate a similar scenario (with lots of channels) later today to see if I can reproduce the issue.
@waltergms OK, I was able to reproduce the exact same errors by blocking the nData server (broker process) for 10 seconds using a while loop while a client is trying to unsubscribe. This suggests that it's probably due to a broker maxing out on CPU usage.
Good morning Jonathan, I'm going on vacation starting today; my colleague will make the necessary adjustments, continue testing, and contact you today. Thank you very much for your attention, I hope we can get this working 100%.
Good morning Jonathan, I am a colleague of Walter's. I made several adjustments and the server is behaving correctly now; the only problem I still have is memory consumption. I believe that forcing garbage collection would solve it, but when I try to use "global.gc()", an exception is thrown, and in a seemingly random way.
How do I use a garbage collector with socketCluster?
@welberty You need to provide a special --expose-gc command line argument to worker processes in order to be able to invoke global.gc().
This wasn't possible before so I fixed this in version 2.3.13. I created a new issue in case anyone else is wondering about the same thing: https://github.com/SocketCluster/socketcluster/issues/105
Workers and brokers will now inherit any CLI execArgv from the master process so you can also have --nouse-idle-notification
and/or --harmony
flags or any other valid flag.
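A hedged sketch of guarding the call so a worker doesn't crash when the flag wasn't passed through (tryForceGc is a hypothetical helper, not an SC API):

```javascript
// global.gc is only defined when Node was started with --expose-gc
// (inherited by workers via execArgv, per the fix above).
function tryForceGc() {
  if (typeof global.gc === 'function') {
    global.gc();
    return true;
  }
  return false; // flag not passed; skip instead of throwing
}

console.log('forced GC:', tryForceGc());
```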
Thanks for reporting this issue.
Thanks for your attention. With the socketCluster update, global.gc() worked fine.
I have another problem. I use a table in the database to store the socket IDs of users, so that I know which users are online. In the socket's disconnect event I remove the entry from this table and mark the user as offline. It turns out that due to some errors the socket's disconnect event is not raised; below is one of the recurring errors:
Tue Sep 01 2015 11:38:31 GMT-0300 (E. South America Standard Time): Error: Socket Error: Socket hung up
at Emitter.SCSocket._onSCClose (Y:\node_modules\socketcluster\lib\server\scsocket.js:173:17)
at WebSocket.
@welberty This error is normal and you should expect it to happen fairly often if your system has lots of users - It's a socket error so it won't crash the worker (just log it, but don't worry too much about it unless you get a really massive number of these - Then it might mean that the internet connection between your server and your clients is bad for example).
It just means that a socket was closed suddenly without first sending a close control frame (maybe the user's computer/browser/internet crashed/froze suddenly). You should still get a 'disconnect' event on the server's socket object though.
To reproduce the 'Socket hung up' error, I have the client running in Node.js in the command line and I connect it to a SC server. Then I just kill the client abruptly using Ctrl+C - This triggers a 'Socket hung up' error on the server as expected because the client did not send a close packet. In my case, I did get a 'disconnect' event on the server socket object.
Can you find out what is triggering this error in your case and double check that the 'disconnect' event is not triggered on the socket object on the server?
Note that sometimes, if the hung up is due to a bad internet connection, the 'disconnect' event might trigger only after a ping timeout (20 seconds) is reached but the hung-up event should appear at the same time on the server.
Hi team, I am new to socketcluster. Server SC version: 3.0.0. Client SC version: 3.1.1.
I got the issue below:
1452426873250 - Origin: Worker (PID 2641)
[Notice] Error: Socket Error: Client pong timed out
at Emitter.SCSocket._onSCClose (/Users/safi/Projects/Server/hm-chat/node_modules/socketcluster-server/scsocket.js:174:17)
at null.
Kindly suggest what I should do.
@safiresh Try to use a client and server with the same version number.
...