apple / swift-nio

Event-driven network application framework for high performance protocol servers & clients, non-blocking.
https://swiftpackageindex.com/apple/swift-nio/documentation
Apache License 2.0

Think about adding a poll-loop mode where Selectable EventLoop can be driven from other threads without taking them over #2074

Open weissi opened 2 years ago

weissi commented 2 years ago

Recently, two projects (PureSwift/Socket and swhitty/FlyingFox) have caught my attention: both run BSD sockets code inside Swift Concurrency's threads, which isn't really possible (without polling) until Custom Executors arrive.

Let's first establish what exactly these libraries are doing and why: Without blocking one (or more) of Swift Concurrency's threads (potentially) forever (which should be avoided at all cost) you cannot do any I/O in there unless you implement a "poll loop". A poll loop schedules a "poll" every so often that quickly checks all sockets for available events [returning control to Swift Concurrency's default executor when no work has to be done]. For example, you could schedule a poll of the sockets every 10 milliseconds. If there are available events (such as a socket being readable/writable/...), these events get handled and control is returned to Swift Concurrency's default executor.
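To make the idea concrete, here is a minimal sketch of such a poll loop. It is not what either of the linked libraries or SwiftNIO actually does; the names `pollOnce` and `pollLoop` are hypothetical, and only readability is checked. It assumes a POSIX platform where `poll(2)` is available through Darwin or Glibc.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

/// Hypothetical helper: checks the given file descriptors for readability
/// without blocking (timeout 0) and returns the ones that are readable.
func pollOnce(_ fds: [Int32]) -> [Int32] {
    var pollFDs = fds.map { pollfd(fd: $0, events: Int16(POLLIN), revents: 0) }
    let ready = poll(&pollFDs, nfds_t(pollFDs.count), 0)
    guard ready > 0 else { return [] }
    return pollFDs.filter { $0.revents & Int16(POLLIN) != 0 }.map { $0.fd }
}

/// Hypothetical poll loop: polls, handles whatever is ready, then suspends
/// for `interval` milliseconds so the executor's thread can run other work.
func pollLoop(
    _ fds: [Int32],
    everyMilliseconds interval: UInt64,
    handle: @Sendable (Int32) -> Void
) async {
    while !Task.isCancelled {
        for fd in pollOnce(fds) {
            handle(fd)
        }
        // Suspending here is what returns control to Swift Concurrency.
        try? await Task.sleep(nanoseconds: interval * 1_000_000)
    }
}
```

With `everyMilliseconds: 10` this matches the example above: the sockets are checked roughly every 10 milliseconds, whether or not anything happened in between.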

So far, this sounds kinda good: We can share a thread with Swift Concurrency, we can do I/O and we don't have to switch threads. So why does SwiftNIO (currently) not offer to run in this mode? Because there are a number of severe issues:

In other words: The better the latency is the worse the idle energy consumption is.
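A rough back-of-the-envelope model of this trade-off, as a sketch: it assumes events arrive at uniformly random times relative to the poll timer (so the average wait is half the interval) and that every poll costs one wakeup even when idle. The type `PollTradeOff` is hypothetical and the numbers are illustrative, not measurements.

```swift
/// Hypothetical model of a fixed-interval poll loop.
struct PollTradeOff {
    let intervalMilliseconds: Double

    /// An event that arrives right after a poll waits a full interval.
    var worstCaseLatencyMilliseconds: Double { intervalMilliseconds }

    /// Assumes uniformly distributed arrival times between two polls.
    var averageLatencyMilliseconds: Double { intervalMilliseconds / 2 }

    /// Wakeups per second spent polling even if no socket has any events.
    var idleWakeupsPerSecond: Double { 1000 / intervalMilliseconds }
}

for interval in [10.0, 1.0, 0.1] {
    let model = PollTradeOff(intervalMilliseconds: interval)
    print(interval, model.averageLatencyMilliseconds, model.idleWakeupsPerSecond)
}
```

Under these assumptions, shrinking the interval from 10 ms to 0.1 ms cuts the average added latency from 5 ms to 0.05 ms, but raises the idle wakeups from 100 to 10,000 per second.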

Personally, I have not come across a scenario where implementing such a poll loop is compelling, which is why I never really thought that SwiftNIO should add a mode for running in a poll loop (which wouldn't be hard to implement).

Why do I think that a poll loop isn't compelling? SwiftNIO interops just fine with Swift Concurrency if given its own thread pool where it does I/O. Yes, to interact with Swift Concurrency code it will need to thread switch from and to the Swift Concurrency threads (from NIO's I/O threads, usually called the EventLoop). And yes, switching threads adds latency. But: As discussed above, a poll loop also adds (much more) latency and additionally burns energy when idle. To be competitive in latency with a traditional SwiftNIO setup where NIO has its own threads and you switch back and forth, you'd need to schedule the poll loop timer so fast that you'd be burning a lot of energy for no good reason.

So why am I even filing this issue? Maybe I missed something and there is in fact a scenario where such a poll loop is compelling. If that's the case, please comment. If it's convincing, I think SwiftNIO should just add a mode where it can run as a poll loop scheduled in somebody else's thread; the implementation shouldn't be hard at all.

hassila commented 2 years ago

Just a related thought (not advocating for a poll loop mode though)

For a complex server app with full usage of Concurrency and wanting to adhere to the runtime contract of never blocking a thread from the concurrent thread pool, I think it'd make sense to limit the default number of threads used by the I/O subsystem to 1 for most servers - conceptually I think one would really want to take one of the concurrent thread pool threads for this and pin it for this specific task (otherwise I'd like to create a dedicated thread for this and pin it to a CPU and ensure that the concurrent thread pool would be 1 smaller... So same effect).

This thread would be responsible for reaping asynchronous IO completion events and then async schedule processing of them on the normal concurrent thread pool shared across the whole server application. This thread should be pinned to this specific task and simply interact with the kernel and dispatch work async on the concurrent thread pool. Something needs to act as a bridge to the outside world if we want to communicate in a reasonable way.

For any server application that wants to communicate with the outside world, you basically either need to offer a thread or do polling for this (and I can't see the use in polling myself, but I come from the viewpoint of having plenty of resources available; it seems the linked projects come from the other end of the spectrum, with resource-constrained environments).

As discussed in separate issues (e.g. https://github.com/apple/swift-nio/issues/1805 and https://github.com/apple/swift-nio/issues/1890), we need to look at overall NIO architecture as well as buffer memory ownership regardless for proper io_uring support (and not bad for legacy IO subsystems as well as kqueue, although I really hope the kernel team would consider an asynchronous sys call facility similar to io_uring in the future for macOS).

It fundamentally comes back to your comments there about a holistic view of a hypothetical "nio 3.0" and the very real opportunity to take a fresh look at network IO in view of Concurrency.

(sorry, a bit off topic, but I think it has great potential and I would be happy to participate in such a revamp - I think a proper io_uring enabled Swift with Concurrency (and optimised proper concurrent dispatch queue support) would give a fantastic building block for heavy duty server applications)

Lukasa commented 2 years ago

Broadly I agree with this. Even more broadly I've long advocated that the overwhelming majority of Swift on Server applications should only use one NIO thread unless they have compelling evidence that NIO, specifically, cannot keep up. NIO is very fast, and it is not doing important business work, so letting it take over your entire machine is really not the best use of your performance. NIO also self-optimizes in the only-one-event-loop-thread case, where it will elide all thread hopping and lock contention will drop to almost zero.

I remain optimistic that we don't need a NIO3 to address most of the issues required for proper io_uring support: I think they all exist within the already-existing abstraction, so that shouldn't be a huge issue.

I think my TL;DR here is: we're basically already where we need to be, if that's the model we want to land on. For my part I think that it is, and hopefully we'll increasingly add more examples of what it's like to have a NIO server consuming one thread alongside the concurrency thread pool.

weissi commented 1 year ago

[...] Without blocking one (or more) of Swift Concurrency's threads (potentially) forever (which should be avoided at all cost) you cannot do any I/O in there unless you implement a "poll loop".

Just to be sure this isn't misunderstood: after re-reading this almost a year later, I think what I wrote back then is incomplete.

There is an option that works without a poll loop: In theory we could try to do at least write I/Os straight from the Swift Concurrency pools. But to retain correct ordering w.r.t. the order of writes & closes [both absolutely required] we would need to add a lock or some other form of exclusivity mechanism to every file descriptor (Channel). If (using the lock) we can establish that there are no queued writes, we could just issue the non-blocking write call straight from the concurrency pools. And if it works, that's great and if it doesn't then we can still enqueue the write.
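The scheme described above could be sketched roughly as follows. This is not SwiftNIO code: `LockedChannelSketch` is a hypothetical type, and the non-blocking write syscall is replaced by an injected closure (assumed to return the number of bytes written, or -1 if the write would block) so the ordering logic can be seen in isolation.

```swift
import Foundation

/// Hypothetical per-file-descriptor state guarded by a lock.
final class LockedChannelSketch {
    private let lock = NSLock()
    private var pendingWrites: [[UInt8]] = []
    // Stand-in for a non-blocking write(2): bytes written, or -1 if it would block.
    private let nonBlockingWrite: ([UInt8]) -> Int

    init(nonBlockingWrite: @escaping ([UInt8]) -> Int) {
        self.nonBlockingWrite = nonBlockingWrite
    }

    /// Returns true if the bytes were written directly, false if they
    /// (or the unwritten remainder) were enqueued behind earlier writes.
    func write(_ bytes: [UInt8]) -> Bool {
        lock.lock()
        defer { lock.unlock() }
        // Writing directly is only safe if nothing is queued, otherwise
        // this write could overtake earlier ones.
        if pendingWrites.isEmpty {
            let written = nonBlockingWrite(bytes)
            if written == bytes.count { return true }
            pendingWrites.append(Array(bytes[max(written, 0)...]))
            return false
        }
        pendingWrites.append(bytes)
        return false
    }

    var queuedWriteCount: Int {
        lock.lock()
        defer { lock.unlock() }
        return pendingWrites.count
    }
}
```

Note that every caller, whichever thread it runs on, has to take the lock on every write just to learn whether the fast path is available.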

That said, whilst this is in theory possible, I don't think it makes any sense either. This would require all threads (including the Channel's own EventLoop) to acquire this lock all the time (to write, enqueue writes, close, ...). And having this lock invalidates pretty much all the reasons why SwiftNIO is fast. SwiftNIO is fast because it can do I/O etc. without having to take any locks: it knows that all I/O associated with a given Channel is already on the correct EventLoop and that nothing else can invalidate/change the Channel whilst the code is running. That means SwiftNIO will scale linearly with the number of cores, which is really important, especially in server environments where huge amounts of unrelated concurrency are the normal mode of operation.