Closed dimon777 closed 12 years ago
You didn't explain what happens behind the scenes when we see this message. In fact on a client I don't see any errors. For all connection attempts I'm getting: "NetConnection.Connect.Success" status code in the NetStatusEvent.NET_STATUS event listener. How am I going to detect if my connection is bad then?
Thanks, D.
No. When this error happens, you receive nothing on the client side (no error, and no NetConnection.Connect.Success event). Then, after 2 minutes, you receive a NetConnection.Connect.Failed event, but waiting that long is not acceptable. So, as explained in the wiki (FAQ), you have to reattempt the connection after a short timeout if nothing happens (after 10 sec, for example). And as written in the wiki, it's a very rare situation. There is currently no other solution.
Ok, I got it. The nature of this issue is still unclear to me. Anyway... I've tried your suggestion to reconnect after a 10 sec delay and it only works partially. In this example I'm opening 450 sessions to Cumulus. Every time, I get 20-23 failed connections. So, after 10 sec, I clear them, create new ones and reconnect them. This works only partially: I can reconnect about 5-6 connections out of 22. The rest still give the Decrypt error and I'm not able to clear them all. This means it's an issue without a workaround. Can you confirm this?
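The reattempt-after-timeout idea described above could be sketched per connection roughly as follows. This is only an illustration under stated assumptions: `WatchedConnection` is a hypothetical helper class (it is not part of Cumulus or of the Flash API), and the 10 sec default simply mirrors the value suggested in the comment. It arms a one-shot `Timer` when connecting and starts over if no `NetConnection.Connect.Success` arrives before the timer fires.

```actionscript
package {
    import flash.events.NetStatusEvent;
    import flash.events.TimerEvent;
    import flash.net.NetConnection;
    import flash.utils.Timer;

    /** Hypothetical helper, not part of Cumulus or the Flash API. */
    public class WatchedConnection {
        private var url:String;
        private var timeoutMs:int;
        private var nc:NetConnection;
        private var watchdog:Timer;

        public function WatchedConnection(url:String, timeoutMs:int = 10000) {
            this.url = url;
            this.timeoutMs = timeoutMs;
        }

        /** Starts (or restarts) the connection attempt and arms the watchdog. */
        public function connect():void {
            dispose();
            nc = new NetConnection();
            nc.addEventListener(NetStatusEvent.NET_STATUS, onStatus);
            nc.connect(url);
            watchdog = new Timer(timeoutMs, 1); // one-shot timer
            watchdog.addEventListener(TimerEvent.TIMER, onTimeout);
            watchdog.start();
        }

        private function onStatus(e:NetStatusEvent):void {
            if (e.info.code == "NetConnection.Connect.Success") {
                stopWatchdog(); // connected in time, nothing more to do
            } else if (e.info.code == "NetConnection.Connect.Failed") {
                connect(); // explicit failure: try again
            }
        }

        private function onTimeout(e:TimerEvent):void {
            connect(); // no status event arrived within timeoutMs
        }

        private function stopWatchdog():void {
            if (watchdog) {
                watchdog.stop();
                watchdog.removeEventListener(TimerEvent.TIMER, onTimeout);
                watchdog = null;
            }
        }

        /** Drops the previous attempt so its late events are ignored. */
        private function dispose():void {
            stopWatchdog();
            if (nc) {
                nc.removeEventListener(NetStatusEvent.NET_STATUS, onStatus);
                nc.close();
                nc = null;
            }
        }
    }
}
```

Usage would be along the lines of `new WatchedConnection(rtmfpURL).connect();`. Keeping one timer per connection, rather than one global check, means a connection is only abandoned after it has had its own full timeout.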
Thanks, D.
Here is the relevant code. (Can be simplified by removing all the listeners)
    private var arr:Array = new Array(450);

    /** Driving function */
    public function bulkConnect():void {
        for (var i:int = 0; i < arr.length; i++) {
            makeConnect(i);
        }
        flash.utils.setTimeout(checkBulkConnect, 10000);
    }

    /** Makes the i-th connection to Cumulus */
    private function makeConnect(i:int):void {
        // Tear down any previous connection stored in this slot
        var cc:NetConnection = (arr[i] as NetConnection);
        if (cc) {
            cc.removeEventListener(AsyncErrorEvent.ASYNC_ERROR, netConnectionAsyncError);
            cc.removeEventListener(NetStatusEvent.NET_STATUS, netConnectionStatus);
            cc.removeEventListener(IOErrorEvent.IO_ERROR, netConnectionError);
            cc.removeEventListener(SecurityErrorEvent.SECURITY_ERROR, netConnectionSecurity);
            cc.close();
            cc = null;
        }
        var ob:Object = new Object();
        ob.id = i;
        var nc:NetConnection = new NetConnection();
        nc.addEventListener(AsyncErrorEvent.ASYNC_ERROR, netConnectionAsyncError);
        nc.addEventListener(NetStatusEvent.NET_STATUS, netConnectionStatus);
        nc.addEventListener(IOErrorEvent.IO_ERROR, netConnectionError);
        nc.addEventListener(SecurityErrorEvent.SECURITY_ERROR, netConnectionSecurity);
        nc.client = ob;
        nc.connect(rtmfpURL);
        arr[i] = nc;
        LOG.debug("Created NetConnection: {0}", i);
    }

    /** Checks connections for success; reconnects failed ones and schedules another check */
    private function checkBulkConnect():void {
        for (var j:int = 0; j < arr.length; j++) {
            var nc:NetConnection = (arr[j] as NetConnection);
            if (!nc.connected) {
                LOG.error("Connection status FAILURE for nc: {0}, reconnecting...", nc.client.id);
                makeConnect(j);
            }
        }
        // Connections recreated above are not connected yet, so they are counted here
        var failCnt:int = 0;
        for (var v:int = 0; v < arr.length; v++) {
            if (!(arr[v] as NetConnection).connected) {
                failCnt++;
            }
        }
        if (failCnt > 0) {
            LOG.error("Failed connection count: {0}", failCnt);
            flash.utils.setTimeout(checkBulkConnect, 10000);
        }
    }
This test cannot work like that, dimon.
You are creating a denial-of-service attack without managing the sending/receiving buffers of your server.
First thing to do: set a large value for your "udpBufferSize" server parameter (see the "Configuration" part of the https://github.com/OpenRTMFP/Cumulus/wiki/Installation page). I think that for 450 instantaneous connections 400 KB would be enough (udpBufferSize=400000).
Second thing to do: beware of the execution context! Your machine has to support the load of 450 connections, on the client side and on the server side. I don't know whether the server and the clients are running on the same machine during your test, or whether your machine has high performance (in a real case, clients and the server are on different machines, of course). If it is a little underpowered, increase the 10 sec timeout to 20 sec and retest. Otherwise you risk closing good connections that have simply not finished connecting yet.
I have tested your code with these corrections, and I confirm that on average we get 0.5% failures.
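For reference, the udpBufferSize change suggested above would look something like the fragment below. This is a sketch, assuming the server reads its settings from a CumulusServer.ini file as the wiki's "Configuration" section describes; the file location and any other keys depend on your installation and are not shown here.

```ini
; CumulusServer.ini (assumed configuration file, see the wiki's "Configuration" section)
; Size in bytes of the UDP sending/receiving buffers; 400000 is the value suggested above
udpBufferSize = 400000
```

The server presumably needs a restart for the new value to take effect.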
I'm running Cumulus on a standalone CentOS Linux box, and my client on a 4 GB Intel Core i5 laptop. So I don't think I'm saturating any resources yet. Flooding is not where the issue is. For example, here is another version of the code, which uses a timer to spawn connections. The result is the same whether I wait 50, 100, 250 or 500 ms between connections. There are always failures, in the range of 16-23 connections. I've also tried changing the buffer to 400K or 500K. I've tried changing the timeout for reconnections to 15 sec, 20 sec, 30 sec and 40 sec. Always the same consistent result: only 2-3 failed connections are able to recover, and the others are not. The concern is not even the failures per se, but rather that it's not possible to recover from them. No matter how many times I retry a failed connection, I'm not able to make it reconnect! So 16 failures per 450 connections is more like 4% failures, not 0.5%. I can't see how this can be solved as of now. If you have an example which shows that it's possible to recover from failures, please post it. Here is my modified test:
    private var arr:Array = new Array(450);
    private var counter:int = 0; // Connection counter
    private var t:Timer = new Timer(500, arr.length); // Timer to create connections
    private var checkTimer:Timer = new Timer(15000, 0); // Timer to check connection status

    private function onTimer(e:TimerEvent):void {
        makeConnect(counter++);
    }

    /** Driving function */
    public function bulkConnect2():void {
        t.addEventListener(TimerEvent.TIMER, onTimer);
        t.start();
        checkTimer.addEventListener(TimerEvent.TIMER, onCheck);
        checkTimer.start();
    }

    private function onCheck(e:TimerEvent):void {
        checkBulkConnect();
    }

    /** Makes the i-th connection to Cumulus */
    private function makeConnect(i:int):void {
        // Tear down any previous connection stored in this slot
        var cc:NetConnection = (arr[i] as NetConnection);
        if (cc) {
            cc.removeEventListener(AsyncErrorEvent.ASYNC_ERROR, netConnectionAsyncError);
            cc.removeEventListener(NetStatusEvent.NET_STATUS, netConnectionStatus);
            cc.removeEventListener(IOErrorEvent.IO_ERROR, netConnectionError);
            cc.removeEventListener(SecurityErrorEvent.SECURITY_ERROR, netConnectionSecurity);
            cc.close();
            cc = null;
        }
        var ob:Object = new Object();
        ob.id = i;
        var nc:NetConnection = new NetConnection();
        nc.addEventListener(AsyncErrorEvent.ASYNC_ERROR, netConnectionAsyncError);
        nc.addEventListener(NetStatusEvent.NET_STATUS, netConnectionStatus);
        nc.addEventListener(IOErrorEvent.IO_ERROR, netConnectionError);
        nc.addEventListener(SecurityErrorEvent.SECURITY_ERROR, netConnectionSecurity);
        nc.client = ob;
        nc.connect(rtmfpURL);
        arr[i] = nc;
        LOG.debug("Created NetConnection: {0}", i);
    }

    /** Checks connections for success and reconnects the failed ones */
    private function checkBulkConnect():void {
        var failCnt:int = 0;
        for (var j:int = 0; j < arr.length; j++) {
            var nc:NetConnection = (arr[j] as NetConnection);
            // Slots are filled in order, so the first empty one marks the end
            if (!nc)
                break;
            if (!nc.connected) {
                failCnt++;
                LOG.error("Connection status FAILURE for nc: {0}, reconnecting...", nc.client.id);
                makeConnect(j);
            }
        }
        if (failCnt > 0) {
            LOG.error("Failed connection count: {0}", failCnt);
        }
    }
You are asking for too much of my time; I cannot keep up, sorry (maybe another user will answer). I spent an hour yesterday testing it every way I could. Your test was conceptually wrong: without changes I got the same result as you, and after the corrections and configuration changes I got the right result. As soon as I have some free time I will look at it again, but not for some weeks.
Do you think that the second test concept is wrong too?
When running a simple load test of Cumulus with 3000 simultaneous connections, I'm seeing these messages in the log:
07/02 23:15:20.2 ERROR RTMFPServer(1115679040) Session.cpp[50] Decrypt error on session 334
07/02 23:15:28.8 ERROR RTMFPServer(1115679040) Session.cpp[50] Decrypt error on session 132
07/02 23:15:30.0 ERROR RTMFPServer(1115679040) Session.cpp[50] Decrypt error on session 247
07/02 23:15:30.2 ERROR RTMFPServer(1115679040) Session.cpp[50] Decrypt error on session 334
Is this a problem?