Implement sliding window scale logic.

lzaiats commented 7 years ago

I have been testing LiteNetLib, Lidgren-gen3 and Lidgren-old (the one from Google Code), and Lidgren-old is beating the competitors by far... In my test I have 750 clients connected to a single server instance, receiving ~15 reliable msgs/s and ~100 unreliable msgs/s, and Lidgren-old can handle all this data perfectly! Using the same "game code" with Lidgren-gen3 or LiteNetLib, the test "fails" between 100 and 120 clients... Maybe I am missing something in the LiteNetLib API that would give better performance... Do you have any idea what I am doing wrong with your lib? If you need, I can share the code :)

My test is using my area of interest management layer, so it can efficiently send to clients only data important to them. Only using this technique I was able to achieve 750 clients on a single server :) Every client is also sending ~15 unreliable msgs/s (movement messages)
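An area-of-interest filter of the kind described here can be sketched as follows. The thread does not show the actual layer, so this is only a minimal Python illustration that assumes a plain distance check; `Entity`, `recipients_for` and the radius value are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Entity:
    entity_id: int
    x: float
    y: float


def recipients_for(source, clients, radius):
    """Return the ids of clients close enough to `source` to need its update."""
    return [
        c.entity_id
        for c in clients
        if c.entity_id != source.entity_id
        and math.hypot(c.x - source.x, c.y - source.y) <= radius
    ]


# Hypothetical positions: client 2 is 5 units away, client 3 is far off.
clients = [Entity(1, 0, 0), Entity(2, 3, 4), Entity(3, 100, 100)]
nearby = recipients_for(clients[0], clients, radius=10.0)
```

A real server would more likely use a spatial grid than compare every pair, but the effect is the same: each update goes only to the clients that can see it.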

Thanks and keep the awesome work!

RevenantX commented 7 years ago

@lzaiats Hi!

If you need I can share the code :)

Can you show minimal sample code that works slower than Lidgren-old?

lzaiats commented 7 years ago

Basically, the problem is UDP congestion... Lidgren-old appears to drop fewer packets than LiteNetLib, for example... I will put together a small sample with LiteNetLib, Lidgren-gen3 and Lidgren-old so you can see the "thing" on your local machine ;)

BTW, I've implemented a custom NetSerializer that uses MessagePack (the optimized one from neuecc) for serialization and it works great! Maybe it can be useful to other LiteNetLib users ;)

lzaiats commented 7 years ago

Hi @RevenantX !

Here is the simple benchmark I made... It sends reliable and unreliable messages and counts the sent and received messages... LiteNetLib's reliable performance is not good compared to Lidgren-old! On my "game project", after some time the clients simply stop receiving reliable instantiate messages and the game stops working... Another important point is memory consumption... The Lidgren-old test makes almost no memory allocations at all, while the LiteNetLib test uses > 2 GB...

Here are the results:

Testing Lidgren-old...
Processing...
CLIENT SENT -> Reliable: 261800, Unreliable: 748000
SERVER RECEIVED -> Reliable: 261800, Unreliable: 748000

Testing LiteNetLib...
Processing...
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
SERVER RECEIVED -> Reliable: 51215, Unreliable: 750000

UDPLibBench.zip

RevenantX commented 7 years ago

@lzaiats thank you! I will investigate this. Something is wrong with the send speed.

RevenantX commented 7 years ago

@lzaiats the high memory consumption is caused by packet pooling. I think I must add some limits to pooled packets.
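One possible shape for such a limit is a pool that stops retaining buffers past a cap. This is a minimal Python sketch, not LiteNetLib code; `BoundedPacketPool`, `max_pooled` and the 1400-byte packet size are assumptions made for illustration.

```python
class BoundedPacketPool:
    """Reuses byte buffers, but never keeps more than `max_pooled` idle ones."""

    def __init__(self, packet_size, max_pooled):
        self._packet_size = packet_size
        self._max_pooled = max_pooled
        self._free = []

    def rent(self):
        # Reuse an idle buffer if one exists, otherwise allocate a new one.
        return self._free.pop() if self._free else bytearray(self._packet_size)

    def give_back(self, packet):
        # Past the cap, drop the buffer and let the garbage collector reclaim it.
        if len(self._free) < self._max_pooled:
            self._free.append(packet)

    @property
    def idle_count(self):
        return len(self._free)


pool = BoundedPacketPool(packet_size=1400, max_pooled=2)
burst = [pool.rent() for _ in range(5)]  # a burst allocates 5 buffers
for packet in burst:
    pool.give_back(packet)               # only 2 of them stay pooled
```

Without the cap, a single burst would leave every buffer it ever allocated sitting in the pool, which is one way a benchmark like this could end up holding a lot of memory.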

mgamache commented 6 years ago

By starting a second client on the same machine I was able to double the bandwidth transferred on the same connection, so this suggests a rate limit inside each client. Would a bounty help you find the time to work on this issue?

RevenantX commented 6 years ago

@mgamache I know why this problem exists) As I said before, it is because I haven't implemented the "sliding window scale" logic. It will take some time to implement.

Would a bounty help you find the time to work on this issue?

Donations always help to find time to work on the library as a whole :) But I can't promise a fast fix (implementation) right now, because I'm working on two projects. Soon (end of December) I will have some time for library features and improvements.
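To illustrate why a fixed window can act as a per-client rate limit, here is a small Python model. It is a sketch under stated assumptions, not the library's algorithm: no packet loss, every packet sent in one update cycle is acknowledged before the next, and "window scale" is read as doubling the window after each fully acknowledged cycle. `delivered_after` and the 700-cycle figure are hypothetical; 262500 and 64 are the message count and window size mentioned in this thread.

```python
def delivered_after(ticks, total, window, max_window=None):
    """Count reliable packets delivered after `ticks` update cycles.

    Each cycle can carry at most one window of packets. If `max_window`
    is given, the window doubles after every cycle in which a full
    window was sent and acknowledged, up to that cap.
    """
    delivered = 0
    for _ in range(ticks):
        sent = min(window, total - delivered)
        delivered += sent
        if max_window is not None and sent == window:
            window = min(window * 2, max_window)
    return delivered


fixed = delivered_after(ticks=700, total=262_500, window=64)
scaled = delivered_after(ticks=700, total=262_500, window=64, max_window=1024)
```

In this model the fixed window delivers at most `window * ticks` packets however fast the link is, while the growing window finishes the whole batch. A real implementation would also have to shrink the window when packets are lost.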

Saishy commented 6 years ago

Hey, remember me?

So was going to ask if you want to share profits on releasing https://github.com/Saishy/TinyBirdNet-Unity at the asset store. I was intending it to be free for free projects and paid for commercial ones. (still open source tho)

RevenantX commented 6 years ago

@lzaiats hi! Try the latest version from GitHub. I heavily optimized the sending code. Your problem (with 750 clients) should be fixed (but not the benchmark).

RevenantX commented 6 years ago

@Saishy

Hey, remember me?

Yes)

So was going to ask if you want to share profits on releasing https://github.com/Saishy/TinyBirdNet-Unity at the asset store. I was intending it to be free for free projects and paid for commercial ones. (still open source tho)

You can donate any amount that you think is necessary)

lzaiats commented 6 years ago

@RevenantX Heeeey! Awesome news!

I ran the tests changing the values of the NetConstants.DefaultWindowSize and here are the results:

---- WindowSize = 16
Testing LiteNetLib...
Processing...
DataSize: 43537500b, 42517kb, 41mb
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
SERVER RECEIVED -> Reliable: 11984, Unreliable: 749242

---- WindowSize = 32
Testing LiteNetLib...
Processing...
DataSize: 43537500b, 42517kb, 41mb
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
SERVER RECEIVED -> Reliable: 23968, Unreliable: 749236

---- WindowSize = 64
Testing LiteNetLib...
Processing...
DataSize: 43537500b, 42517kb, 41mb
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
SERVER RECEIVED -> Reliable: 46080, Unreliable: 749177

---- WindowSize = 128
Testing LiteNetLib...
Processing...
DataSize: 43537500b, 42517kb, 41mb
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
SERVER RECEIVED -> Reliable: 88832, Unreliable: 749097

---- WindowSize = 256
Testing LiteNetLib...
Processing...
DataSize: 43537500b, 42517kb, 41mb
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
SERVER RECEIVED -> Reliable: 137728, Unreliable: 749429

---- WindowSize = 512
Testing LiteNetLib...
Processing...
DataSize: 43537500b, 42517kb, 41mb
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
SERVER RECEIVED -> Reliable: 219674, Unreliable: 749907

---- WindowSize = 1024
Testing LiteNetLib...
Processing...
DataSize: 43537500b, 42517kb, 41mb
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
SERVER RECEIVED -> Reliable: 262150, Unreliable: 750000

IMPRESSIVE!

I'll do some more testing using the real 750 clients and share the results with you! Thank you!
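The window-size results in this comment can be checked with a little arithmetic. The Python snippet below only divides the reported reliable counts by the window size; the figures are copied from the comment, and the interpretation that follows is an assumption rather than something confirmed in the thread.

```python
# Reliable messages received by the server for each
# NetConstants.DefaultWindowSize value reported in the comment.
results = {
    16: 11984,
    32: 23968,
    64: 46080,
    128: 88832,
    256: 137728,
    512: 219674,
    1024: 262150,
}

# Messages delivered per unit of window size. If each update cycle carries at
# most one window, this approximates how many full windows got through.
per_window = {w: received / w for w, received in results.items()}

for w, ratio in per_window.items():
    print(f"window {w:>4}: {ratio:7.1f} windows' worth delivered")
```

For window sizes 16 and 32 the ratio is exactly 749, then it falls (720 at 64, 694 at 128, 538 at 256). That pattern would be consistent with small windows being limited by how many windows fit into the test run, with some other limit taking over as the window grows.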

RevenantX commented 6 years ago

@lzaiats don't increase it manually. It should be increased automatically (later). My current fix should improve the performance of a real game server (with many clients!), not a file transfer to 1 peer.

lzaiats commented 6 years ago

@RevenantX ;) Actually I only increased the value by hand to understand the maximum throughput the lib could achieve... On the game (production) I will not touch this number ;)

One thing that is still confusing me a little is that in all circumstances the bytes sent by the client and the bytes received by the server are almost the same... But the number of messages is completely different... Have a look:

Testing LiteNetLib...
Processing...
DataSize: 43537500b, 42517kb, 41mb
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
CLIENT STATS:
BytesReceived: 36632
PacketsReceived: 1599
BytesSent: 41320993
PacketsSent: 839281
PacketLoss: 2797
PacketLossPercent: 0

SERVER RECEIVED -> Reliable: 89536, Unreliable: 749593
SERVER STATS:
BytesReceived: 41335938
PacketsReceived: 839586
BytesSent: 36632
PacketsSent: 1599
PacketLoss: 0
PacketLossPercent: 0

RevenantX commented 6 years ago

@lzaiats after some time all reliable messages will be received. Add more time before showing the statistics) They are just in the queue.
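One way to "add more time" without guessing a fixed delay is to poll until the received count reaches what was sent, or a timeout passes. This is a minimal Python sketch; `wait_for_count` and its parameters are hypothetical and not part of the benchmark in the thread.

```python
import time


def wait_for_count(get_count, expected, timeout_s, poll_s=0.01,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll `get_count()` until it reaches `expected` or `timeout_s` elapses.

    Returns the last observed count, so the caller can tell a drained queue
    from a timeout.
    """
    deadline = clock() + timeout_s
    count = get_count()
    while count < expected and clock() < deadline:
        sleep(poll_s)
        count = get_count()
    return count


# Usage with a fake receiver that gets 100 more messages per poll interval.
received = [0]


def fake_sleep(_seconds):
    received[0] += 100


final = wait_for_count(lambda: received[0], expected=500,
                       timeout_s=1.0, sleep=fake_sleep)
```

Printing the statistics only after this returns separates messages that are still queued from messages that were really lost.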

RevenantX commented 6 years ago

@lzaiats the real number of packets sent is PacketsSent: 839281

RevenantX commented 6 years ago

@lzaiats

I'll do some more testing using the real 750 clients and share the results with you! Thank you!

Any news?)

DanisJoyal commented 6 years ago

@lzaiats Hi,

Did you try with a lower DefaultUpdateTime? Like 5 ms. Could that help? Maybe the thread shouldn't stop until there is nothing else to process, and only wait after that.

nxrighthere commented 6 years ago

@DanisJoyal It doesn't help, I already tried.

DanisJoyal commented 6 years ago

@nxrighthere Maybe a 1 ms sleep is too much. Maybe it needs to update non-stop until it's done.

lzaiats commented 6 years ago

@DanisJoyal I tried a lot of configurations and I had no luck!

@nxrighthere Nice to see your benchmark suite! I also made one using exactly the same libs you did and the results are just the same! The ENet wrapper always gives the best result! If you want to see good results with another lib, try this OLD Lidgren version: https://code.google.com/archive/p/lidgren-network/downloads , the one from Jan 22, 2010... It has almost the same API as Lidgren-v3 but will show you results comparable to ENet ;)

@RevenantX I stopped benchmarking for now, maybe we can continue the investigation using @nxrighthere test suite now, since it's more complete and modular than mine!

DanisJoyal commented 6 years ago

@lzaiats Hi,

I just ran your UDPLibBench.zip. I had to adapt it to my latest version, on my fork. So far, I've got this:

Testing LiteNetLib...
Processing...
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
PacketLossPercent: 0(672)
Bytes Received/Sent: 57281/50915910
Packets Received/Sent: 4215/1007890
SERVER RECEIVED -> Reliable: 262500, Unreliable: 750000
PacketLossPercent: 0(0)
Bytes Received/Sent: 51156422/57568
Packets Received/Sent: 1012621/4241

No issue, except a little packet loss on the client side.

I'm running 2 client threads at the same time against the same server to see if it lasts.

Cheers

DanisJoyal commented 6 years ago

Oops! The result just came out of the oven :)

Testing LiteNetLib...
Processing...
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
PacketLossPercent: 0(1120)
Bytes Received/Sent: 58801/51260444
Packets Received/Sent: 4343/1014618
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
PacketLossPercent: 0(1078)
Bytes Received/Sent: 58805/51279396
Packets Received/Sent: 4344/1014994
SERVER RECEIVED -> Reliable: 525000, Unreliable: 1500000
PacketLossPercent: 0(0)
Bytes Received/Sent: 102643216/117768
Packets Received/Sent: 2031599/8704

DefaultWindowSize = 64.

Adding: my CPU was at 100% on an i5-2700k with 2 clients.

lzaiats commented 6 years ago

@DanisJoyal Wow, nice results! Using 1ms of sleep or no sleep at all?

DanisJoyal commented 6 years ago

@lzaiats

I left UpdateTime = 15ms, but my KcpChannel overwrites this value to 10ms. I also added a "skip sleep thread" that skips the sleep if a packet was sent during the Update. Maybe I can try without it.
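The "skip sleep" idea described here can be sketched as an update loop that sleeps only after an idle pass. This is a Python illustration, not code from the fork; `run_update_loop`, `should_stop` and the injectable `sleep` are hypothetical names.

```python
import time


def run_update_loop(update, update_time_s, should_stop, sleep=time.sleep):
    """Call `update()` repeatedly, sleeping only after an idle pass.

    `update` returns True if it sent at least one packet. In that case the
    loop runs again immediately, since more work is probably queued.
    """
    while not should_stop():
        sent_something = update()
        if not sent_something:
            sleep(update_time_s)


# Usage with a fake workload: three busy passes, then two idle passes.
work = [True, True, True, False, False]
sleeps = []
run_update_loop(
    update=lambda: work.pop(0),
    update_time_s=0.015,
    should_stop=lambda: not work,
    sleep=sleeps.append,  # record the requested sleeps instead of waiting
)
```

The trade-off is CPU time: if `update` reports progress on every pass, this loop never sleeps and keeps one core busy.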

DanisJoyal commented 6 years ago

Wow! OK!

UpdateTime = 1ms without my "skip sleep thread":

Testing LiteNetLib...
Processing...
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
PacketLossPercent: 0(1210)
Bytes Received/Sent: 12226/38550732
Packets Received/Sent: 429/770182
CLIENT SENT -> Reliable: 262500, Unreliable: 750000
PacketLossPercent: 0(1268)
Bytes Received/Sent: 12406/38563786
Packets Received/Sent: 444/770435
SERVER RECEIVED -> Reliable: 51521, Unreliable: 1435184
PacketLossPercent: 0(0)
Bytes Received/Sent: 75644322/40118
Packets Received/Sent: 1509862/1437

Something is strange about the packet loss and everything! The server received just 51521 of 525000 reliable messages, yet only 1210 + 1268 packets were reported lost?!

And my CPU was at 70-75% instead.

DanisJoyal commented 6 years ago

@lzaiats, @nxrighthere

I ran it multiple times, with different settings. With the "skip sleep thread", it works perfectly.

If I use both UpdateTime = 1ms and the "skip sleep thread", I'm getting more packet loss from the clients, like 27000 for both instead of 1000. But the receive counts are good even with those settings.

I'm done here :) Have fun.

nxrighthere commented 6 years ago

@DanisJoyal I tried to run the test with your fork. Everything goes well, as long as only 1 client is connected to the server. As soon as 2 or more clients are connected and start sending the messages, CPU usage increases to 80%-100% and my system becomes unstable.

DanisJoyal commented 6 years ago

@nxrighthere Well, I'm running a performance profiler to optimize. It seems that SendNextPacket on the reliable channel is taking 50% of the time vs 8% for unreliable. It looks like a lock on the _pendingPacket list.

DanisJoyal commented 6 years ago

Well, I guess that I'm wrong. ReliableChannel.SendNextPackets is always returning true, so the CPU is choking and slowing down the clients. It's supposed to take 10 sec to run. It takes 2 min instead.

Sorry guys

DanisJoyal commented 6 years ago

@nxrighthere Hi, can you try again with my latest fixes? It's working on my side without using 100% CPU. In fact, it hovers around 50% now. Thx

nxrighthere commented 6 years ago

@DanisJoyal It works better, but CPU usage is still very high compared to the master branch (fork vs master):

64 clients: 32% vs 2%

500 clients: 70% (160 were connected, overload) vs 60% (500 were connected, success)

DanisJoyal commented 6 years ago

@nxrighthere OK, but are you sure that the reliable channel is working perfectly? I added a sequence check to the data sent. It failed. It sent the right number of messages, but not in the right order. Maybe it's just my fork... I'm a little bit confused, because the way you test it influences the result.
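A sequence check of this kind can be sketched as follows. The thread does not show the actual check, so this Python version is only an assumption about its general shape: each payload starts with a counter, and the receiver records every arrival that is not the next expected value. `make_payload` and `OrderChecker` are hypothetical names.

```python
import struct


def make_payload(sequence, body=b""):
    """Prefix the message body with a 4-byte little-endian sequence number."""
    return struct.pack("<I", sequence) + body


class OrderChecker:
    """Records every message whose sequence number is not the expected one."""

    def __init__(self):
        self.next_expected = 0
        self.received = 0
        self.mismatches = []  # (expected, actual) pairs

    def on_receive(self, payload):
        (sequence,) = struct.unpack_from("<I", payload)
        self.received += 1
        if sequence != self.next_expected:
            self.mismatches.append((self.next_expected, sequence))
        # Continue from what actually arrived, so one gap is reported once.
        self.next_expected = sequence + 1


checker = OrderChecker()
for seq in (0, 1, 3, 2, 4):  # right count, wrong order
    checker.on_receive(make_payload(seq, b"move"))
```

A check like this is only meaningful on a channel that promises ordering. A reliable but unordered channel may deliver every message and still fail it.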

nxrighthere commented 6 years ago

With your fork, neither the reliable nor the unreliable channel can handle it. Here are the results of the master branch.

DanisJoyal commented 6 years ago

So, the master branch is working perfectly without any issue?

nxrighthere commented 6 years ago

If no more than 500 clients are connected and send messages - yes.

DanisJoyal commented 6 years ago

@nxrighthere The real question is: how are you testing 500 clients? My fixes are based on UDPLibBench.zip, and I'm not sure that it's a good way to test the lib. Can you share your code?

Thanks a lot.

nxrighthere commented 6 years ago

Sure, here it is.

RevenantX commented 6 years ago

@DanisJoyal @nxrighthere I found some places that I can optimize, so I will upload an improved version soon. But I see that most of the CPU usage is in socket.send :)

NVentimiglia commented 6 years ago

I noticed this CPU use as well.

Right now on my stress tool, WebSockets is outperforming UDP on more than 200 connections.

RevenantX commented 6 years ago

@nxrighthere @NVentimiglia this bench is not fully correct. The server can handle many connections if those 500 or more clients run on a separate machine) The CPU usage here is just because the logic of 500 clients is slowing down the whole system.

DanisJoyal commented 6 years ago

@RevenantX

You are right, but it shows how many resources the library is taking. For example, if Enet or Lidgren can complete the test but LiteNetLib can't, there is an issue in the LiteNetLib runtime. It's just an example. No?

So far, Enet is unbeatable. But in pure C#, LiteNetLib is the best in the tests I did.

NVentimiglia commented 6 years ago

01-07-2018

-MOCK-
Connected : 100
Requests Per Second : 1,652,450.17
Latency : 0.00 ms

-UDP-
Connected : 8
Requests Per Second : 2,856.17
Latency : 2.27 ms

Connected : 100
Requests Per Second : 978.02
Latency : 101.06 ms

-WS-
Connected : 100
Requests Per Second : 3,660.91
Latency : 59.17 ms

Side note: I have a high-level library with an agnostic transport layer. I can scale WS and mock connections to 5k connections without much slowdown. I have tried my UDP connection using my own async threading and the internal queue, with little variation between the two. My implementation is pretty close to the example implementation. The only real difference is that I serialize and deserialize. I don't think it's my code, as the mock and WS go through this same system. Moreover, I have replicated this problem without my high-level library. The test is a simple echo test.

RevenantX commented 6 years ago

@DanisJoyal Lidgren and LiteNetLib passed and failed the same tests. The larger memory consumption is because of the larger pool (you can check this by setting PoolSize to lower values). Enet can handle it because of congestion control, and it is a "C" library, which is obviously faster :)

nxrighthere commented 6 years ago

Everything is fine with the benchmark, the problem is in the library. @DanisJoyal has successfully fixed some things and here's the result with 2000 clients.

DanisJoyal commented 6 years ago

@RevenantX

I did some optimization. The _peers array was mostly what was blocking the flow. I just opened a pull request for it. On my PC, I can run almost 2000 clients at the same time.

[Screenshots: master2000, optimized2000]

But it should take 2 mins to complete the test, so it's still taking too many resources.

RevenantX commented 6 years ago

@DanisJoyal @nxrighthere I merged the changes into a separate branch and tested. On my local machine I don't see any improvements O_o. Do you use the release version of the library and bench, or debug?

DanisJoyal commented 6 years ago

@RevenantX

You need to push it to the limit. If you're telling me that both fail at the same point, OK... it's strange, because I ran tests for 4-5 hrs and I saw an improvement.

nxrighthere commented 6 years ago

@RevenantX Can you please tell me your CPU model?

RevenantX commented 6 years ago

@DanisJoyal no, there are no failures. But I tested 200 clients and got 70-100% CPU usage just from socket.send() (i5-4690K 4.2 GHz) in both tests.

RevenantX commented 6 years ago

@nxrighthere i5-4690K 4.2 GHz (slightly overclocked)

RevenantX / LiteNetLib

Implement sliding window scale logic. #110

01-07-2018