GalFe / lidgren-network-gen3

Automatically exported from code.google.com/p/lidgren-network-gen3

Many Clients make server unstable! #46

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Many clients (>100) make the server unstable: some clients lose their 
connection, and the server shows high CPU usage, almost 100%.

Original issue reported on code.google.com by isSin...@gmail.com on 26 Sep 2010 at 1:25

GoogleCodeExporter commented 9 years ago
Can you be more specific, perhaps using profiling? The ManyClients sample does 
not show high CPU usage with lots of clients (despite running all clients and 
the server on the same computer).

Original comment by lidg...@gmail.com on 26 Sep 2010 at 2:07

GoogleCodeExporter commented 9 years ago
Uh, in my test case the server and the clients run on different computers. When 
the client count reached 150, the server (release mode, dual-core 2.0 GHz CPU) 
used 50% CPU, and after a few seconds some clients began to lose their 
connection. In the end only 49-50 clients stayed alive. This is my case.

Original comment by isSin...@gmail.com on 26 Sep 2010 at 2:15

GoogleCodeExporter commented 9 years ago
Are you using simulated loss; if so, how much? Are you using throttling? Is the 
line being saturated so that the number of resends explodes? Is it the network 
thread that uses 50% CPU, or your app thread?

Original comment by lidg...@gmail.com on 26 Sep 2010 at 2:32

GoogleCodeExporter commented 9 years ago
I forgot to say: I use the ManyServer demo with all parameters at their defaults.
One-way latency 0 to 0 ms, loss 0%, duplicates 0%, throttle 262144 bytes/second.
Ping frequency 6000 ms.
I also removed all the "text out" source code, for example:

case NetIncomingMessageType.Data:
    // dstr must stay declared because it is written to outMsg below;
    // only the UI text output is disabled for this test.
    string dstr = "Data from " + NetUtility.ToHexString(inc.SenderConnection.RemoteUniqueIdentifier) + ": " + inc.ReadString();
    //NativeMethods.AppendText(MainForm.richTextBox1, dstr);

    // Rebroadcast the received text to every connection.
    NetOutgoingMessage outMsg = Server.CreateMessage();
    outMsg.Write(dstr);
    Server.SendMessage(outMsg, Server.Connections, NetDeliveryMethod.ReliableOrdered, 0);
    break;

I watched the CPU usage in WinXP's process manager: the ManyServer process uses about 50%.

Original comment by isSin...@gmail.com on 26 Sep 2010 at 2:44

GoogleCodeExporter commented 9 years ago
Can you verify this is still the case with the new update?

Original comment by lidg...@gmail.com on 19 Oct 2010 at 7:07

GoogleCodeExporter commented 9 years ago
Thought I'd do my own stats on the performance of revision 141.
Just to set the scene, the test conditions were:
PC: Intel i7 960 CPU at 3.2 GHz, 12 GB RAM.
Used the barebones client/server samples,
where I set up a separate thread for
each client in the client sample.
Turned off Verbose on both client and server.
Set maxconnections on the server to 500,
although it still connected 400 clients when using its default!

I was alarmed that the CPU % on the server was more than doubling
for every additional hundred Concurrent Client Units (CCU).
I put this down to some aspect of the server not being well
optimised for processing multiple connections.

At the rate at which CPU % was increasing, the server would
reach 50% CPU at 1000 CCU!

Process   CCU   Memory (KB)   CPU (%)

Client    200   58500         3.88
Server    200   25300         0.36

Client    300   71156         4.43
Server    300   25716         1.60

Client    400   82644         8.0
Server    400   25356         5.8

Client    500   101000        38.0
Server    500   26380         10.5

Original comment by mark.js...@googlemail.com on 2 Nov 2010 at 7:07
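
The growth claim above can be checked against the server rows of the table. What follows is a minimal sketch in Python, not part of the original comment: the name `step_ratios` is illustrative, and the only inputs are the server CPU readings as reported. It divides each reading by the previous one to show the growth factor per additional 100 CCU.

```python
# Server CPU (%) per CCU count, copied from the table in the comment above.
server_cpu = {200: 0.36, 300: 1.60, 400: 5.8, 500: 10.5}


def step_ratios(samples):
    """Return (lower_ccu, higher_ccu, growth_factor) for consecutive CCU counts."""
    ccus = sorted(samples)
    return [(lo, hi, samples[hi] / samples[lo]) for lo, hi in zip(ccus, ccus[1:])]


for lo, hi, ratio in step_ratios(server_cpu):
    print(f"{lo} -> {hi} CCU: CPU grows by a factor of {ratio:.2f}")
```

With these figures the first two steps more than double the server CPU, while the last step (400 to 500 CCU) is slightly under a doubling, so the growth factor is not constant across the measured range.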

GoogleCodeExporter commented 9 years ago
Are you running in RELEASE? DEBUG has a lot of string handling, which 
kills performance. Also, how are you testing the library: using the ManyClients 
app? If so, updating the UI for 500 windows is going to take its toll too...
That said, the code probably contains a few rough edges after the update; I'll 
try to profile it asap.

Original comment by lidg...@gmail.com on 3 Nov 2010 at 9:48

GoogleCodeExporter commented 9 years ago
I added some optimizations in rev 142 (and ManySample in 143) you might want to 
try. It won't make a world of difference, but should be a slight improvement.

Original comment by lidg...@gmail.com on 3 Nov 2010 at 10:41

GoogleCodeExporter commented 9 years ago
Oh... I just remembered... The ManyServer resends everything it receives to all 
connections, meaning that when 100 clients are connected the server 
pushes out 10000 (100x100) messages per second (until recently it was 20000; 
I have since changed the client to send once every second instead of twice). At 500 
clients the server pushes out 250000 messages per second. This would explain the high 
CPU usage. I'm going to change the sample so it can be tested with a larger 
number of clients.

Original comment by lidg...@gmail.com on 3 Nov 2010 at 11:18
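
The fan-out arithmetic in the comment above can be sketched as follows. This is plain Python, not Lidgren API: `broadcast_rate` is a hypothetical helper, and it assumes each client sends a fixed number of messages per second and that the server forwards every received message to every connection, as the 100x100 figure implies.

```python
def broadcast_rate(clients: int, sends_per_client_per_second: int = 1) -> int:
    """Outgoing messages per second when each received message goes to every connection."""
    incoming_per_second = clients * sends_per_client_per_second
    # Each incoming message is sent once per connected client.
    return incoming_per_second * clients


for n in (100, 500):
    print(n, "clients:", broadcast_rate(n), "messages/second")

# Earlier sample behaviour: clients sent twice per second.
print(100, "clients at 2 sends/second:", broadcast_rate(100, 2), "messages/second")
```

The quadratic term is why the load grows so sharply: going from 100 to 500 clients multiplies the outgoing rate by 25, not by 5.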

GoogleCodeExporter commented 9 years ago
This change should reduce CPU usage by several orders of magnitude; changing 
status to Fixed. 

Original comment by lidg...@gmail.com on 3 Nov 2010 at 11:21

GoogleCodeExporter commented 9 years ago
Sussed what the issue is!

I've been running the server through the Redgate ANTS profiler all morning and it 
has shown where the problem is. Increasing the number of channels has an exponential 
impact on the server CPU load. If you run with 10 channels, the server 
hardly uses any CPU.

Readings are taken on the Lidgren network thread (excluding the multi-client 
usage), using Release builds.

Client CCU  = 500   
Channels    = 99
Lidgren Network Thread CPU (%) = 25%
ANTS Profiler Results
       NetPeer.NetworkLoop = 0.016%
       NetPeer.HeartBeat   = 10.591%
       NetConnection.HeartBeat = 77.383%

Client CCU  = 500   
Channels    = 10 (NetConstants.cs change)
Lidgren Network Thread CPU (%) = 0.02%
ANTS Profiler Results
       NetPeer.NetworkLoop = 0.016%
       NetPeer.HeartBeat   = 1.845% (about 5.7 times smaller after the channel reduction)
       NetConnection.HeartBeat = 87.006%

Original comment by mark.js...@googlemail.com on 3 Nov 2010 at 11:37

GoogleCodeExporter commented 9 years ago
This is unlikely to be the case. The sample only uses one channel, and null 
channels are just skipped over. Did you sync in between? I recently fixed the 
sample (see above).

Original comment by lidg...@gmail.com on 3 Nov 2010 at 12:18