100% CPU Utilisation in Akka.IO server.

lewishazell commented 8 years ago

I'm new to the code so I hope I'm not missing something here...

This issue was reported before and closed, but is not fixed. I have 100% CPU usage on one core when running an Akka.IO TCP server. I've looked into the issue and found that the problem is that C# provides no way to interrupt/cancel Socket.Select() and Akka.NET is running in a continuous loop because the listening socket (correctly) never leaves the _read list.

Here is the Scala code, vs the C# code for SelectionHandler.Execute().

The important part is the selector.wakeUp() method. In scala, the selector waits forever, until either something happens on any of the sockets or wakeUp() is called within execute() to make the thread process the action that was pushed to the queue instead.

It seems there is no equivalent wakeUp() method in C#. One way I can think is to set no timeout (pass -1 to microSeconds argument in Socket.Select()) and connect an "interrupt socket" to the port after binding and, when we wish to exit Socket.Select(), send an empty byte array through this "interrupt socket". I don't know how this would affect performance, however. It's a hack, and potentially too slow for Akka.

Another, perhaps more performant, option is to redesign the socket server so it provides the same functionality without being exactly the same under the hood. Although, perhaps using SocketAsyncEventArgs, we can wrap this in a "Selector" class that behaves like java.nio.channels.Selector to keep the code looking similar to the Scala? I'm not sure if this is possible, though, so would need to be looked in to.

Any thoughts?

Aaronontheweb commented 8 years ago

@Lewis-H Hi Lewis, going to cc some of the folks working on Akka.Streams here: @Horusiath @Silv3rcircl3 @cconstantin

Akka.IO needs to be rearchitected, for the reasons you point out. It really should be using SocketAsyncEventArgs so it can take advantage of io completion ports and be asynchronous. It currently does not today. There will be an effort to redo that in the not too distant future, since Akka.Streams.IO depends on it.

lewishazell commented 8 years ago

@Aaronontheweb Brilliant, at least I'm not overlooking something. I wondered what the SocketAsyncEventArgsPool was doing there unused! I'll keep up with the development on that and try to help out where possible, even if it's just testing.

Thanks.

Aaronontheweb commented 8 years ago

@Lewis-H if you can change out the underlying socket implementation to use SocketAsyncEventArgs under the hood, that's a pull request we'd happily accept if you're interested in contributing. Otherwise we're going to end up having to do that ourselves, but it's a lower priority than some other bits being worked on at the moment (the 1.1 release is mostly focused on Akka.Remote, Cluster, and releasing the beta for Akka.Streams.)

lewishazell commented 8 years ago

@Aaronontheweb Okay, since it's low priority at the moment but will be needed, I'll start looking in to it. I really like the library so I'm happy to contribute.

marcpiechura commented 8 years ago

@Lewis-H regarding to Akka.Streams, we have ported all code for IO but most of the specs are failing and therefore we marked them as skipped, you can find them here. So if you have updated the Akka.IO implementation you could try to enable them again to see if they pass.

lewishazell commented 8 years ago

Thanks for that @Silv3rcircl3, I'll try that out when it's done.

So the a big problem here is that SocketAsyncEventArgs makes calls to unmanaged code which means the bytes get pinned to memory. The GC can't touch them, which will cause memory fragmentation if the code creates new bytes as they're needed and I think they cannot be collected even if they're not referenced anymore. I'm not sure if SocketAsyncEventArgs.Dispose() would unpin the bytes; it'd cause an OutOfMemoryException eventually, if not. On the upside, I believe we can create instances of SocketAsyncEventArgs when there aren't enough in the pool and dispose of them after.

I've made a mock up echo server to test some ideas (sorry for its crudeness). I allocate 1MiB to the buffer pool at the beginning and use 1KiB sections of that at a time for each operation, pulling the section from the ConcurrentStack. When waiting for a read, I give the operation no bytes so there isn't 1KiB being held indefinitely when waiting for a read operation. Also, the SocketAsyncEventArgs would obviously be pooled in the real version.

I realise this would be diverging from akka a bit (if that's a problem?) and would require extra configuration options for the buffer pool size/buffer size, but it's the best I can think of given the limitations. On the other hand, the configuration options would provide fine tuning of the socket server for the needs of the application.

So, shall I implement with these ideas (obviously with neater code)? Any other considerations I need to make?

Horusiath commented 8 years ago

@Lewis-H we already use byte buffers in form of ByteString class - it's implementation is close to the original residing in JVM, but I think we can change it if necessary, as long as this won't completely break existing API. Other known implementations I'm aware of:

Event Store buffer manager
DotNetty ByteBuffer
RampUp Buffer - this one is fast but very low level, and requires considerable amount of "mechanical sympathy" ;)

Another suggestions concerning implementation: I've heard a lot of good things about TCP connection in EventStore. It's open sourced, so you could take a look at it. I know that they are pooling their SocketAsyncEventArgs.

marcpiechura commented 8 years ago

@Lewis-H any updates on this one?

fergusn commented 8 years ago

I've done some work on the redesign using SocketAsyncEventArgs and the initial results look very promising. There is a substantial improvement in performance, GC pressure and CPU utilization.

This will also remove the 65k connection limit inherited from Socket.Select - which is important in my use case where I will have a couple of million concurrent long lived connections.

I will complete the redesign and post more results soon. @akkadotnet/core. Should this work go onto a separate branch?

Horusiath commented 8 years ago

@willieferguson that's great to hear :) I'd love to see this new design of yours. It's hard to say if it needs to be published into separate branch - under normal circumstances I would say, that if it passes specs, it's good to go on the dev. However I feel we don't have GC/CPU/Memory utilization as part of the test suite, so this may be too risky atm.

Btw. is there any chance, that your new implementation passes Akka Streams TCP spec? @Silv3rcircl3 was working on it, but it wasn't possible to make it pass without deeper refactoring into guts of Akka.IO itself.

Danthar commented 8 years ago

Since its a redesign, thats potentially not shippable at this time. I vote for a seperate branch to house the akka.io changes. That way we can pull it into dev on our terms, without having it being an obstacle in the release schedule.

fergusn commented 8 years ago

@Horusiath I've looked at the TcpSpec failures and they were cause by a couple of bugs in the spec and Akka IO - see willieferguson/akka.net@cb57f2630030b863cddce287e15b235e63fdeb17. The spec now pass for the current Akka IO design as well as the new one.

I need to change the UDP connection to the new design and will then push the new design for review.

alexvaluyskiy commented 8 years ago

@willieferguson Do you have a plan to make a pull request with your changes to Akka.IO? It is very important not only for Akka.IO itself, but for Akka Netcore as well

fergusn commented 8 years ago

@Horusiath You can look at the new design at willieferguson@02fddf9e0ddcb1915f14154d90bc25f553f60ba1.

It uses SocketAsyncEventArgs, and completed events are send to the connection actor as messages.

There are a couple of outstanding items:

If the actor is stopped and the socket is disposed, 'oustanding' events will be send to deadletters. The managers therefore subscribe to deadletters to returned any leaked SAEA's to the pool. I can't think of another solution.
The SAEA pool currently pre-allocate all instances. I will look at other implementations, but I assume the trade of here is larger initial memory footprint vs fragmented pinned memory.
More integration testes are required.
Need to complete benchmarks of old vs new design.

Let me know if I should create a PR.

Horusiath commented 8 years ago

@willieferguson I've created a branch for your changes: https://github.com/akkadotnet/akka.net/tree/akka-io . Can you send a PR with your changes there? This way others will be able to collaborate on upgrades with you.

marcpiechura commented 8 years ago

PR was merged in the IO-branch, we should create a new ticket if necessary in the future.

akkadotnet / akka.net

100% CPU Utilisation in Akka.IO server. #2108