Open ysbaddaden opened 1 month ago
Let's note that with many event loops we could still do the same (enqueue the fd in all event loops), but then every event loop would be notified about the fd's readiness, whether it cares about it or not, and not just once (thanks to edge triggering) but every time a fiber reaches EAGAIN, which would repeatedly interrupt the other event loops.
We'd also have to iterate all event loops on each fd/socket open/close, keep a map of fd => node in each event loop, and so on.
We identified different issues with a single evloop per process:
We tried to have a dedicated thread running the evloop to circumvent issue 1, but we then fall into issue 2, where all enqueues are always cross-context, and performance degrades.
@straight-shoota proposed to still have multiple evloops, and to keep the `fd` in the evloop it is initially added to. Enqueues should usually be local, though sometimes they might target another context. Said differently, an evloop would take ownership of an `fd`.
A downside: if a server running in one context accepts client connections, does some user/vhost authentication and so on, then passes the client socket to another context to handle the actual communication, then all enqueues from the evloop would be cross-context, which would hinder performance.
To allow this scenario, we proposed to transfer the ownership of an `fd` to another evloop when a fiber would block in another context: the `fd` shall be removed from the previous evloop and added to the new one.
An advantage is that the `fd` will follow the context it's running in, so enqueues from the evloop should usually be local.
A drawback is that the `fd` may keep jumping from one evloop to another, which would bring us back to the add/remove scheme we currently have with libevent. We expect this scenario to be infrequent, though: usually an `fd` should be owned by a single fiber, and only sometimes transferred to another fiber in another context.
An example scenario that wouldn't fare well is printing messages to a socket from different contexts: the `fd` would keep jumping across contexts, and performance would suffer. But writing to an IO from different threads is unsafe anyway (even with `sync` you'll end up with intertwined messages), so you need something to protect writes (e.g. a mutex), in which case you should use a Channel and spawn a dedicated Fiber to do the writes; this is exactly how `Log` works. Then the `fd` will stop jumping from one evloop to another.
I implemented this, codenamed the "lifetime" evloop, against current crystal (without execution contexts). The initial results were attractive, and the semi-final results are impressive:
With the basic HTTP::Server bench, tested with a `wrk` client:
preview_mt/libevent mt:2 => 131k req/s
preview_mt/epoll mt:2 => 158k req/s (+20%)
preview_mt/libevent mt:4 => 177k req/s
preview_mt/epoll mt:4 => 214k req/s (+20%)
NOTE: `wrk` starts N connections and keeps them open to run as many HTTP requests as possible, which is the ideal scenario.
I presume there'll be some EC impls with different event loops kept around to ensure that the design doesn't preclude per-EC event loops or global event loops?
Instead of one EL per thread (`preview_mt`) and instead of one EL per EC (#7), we could have one single EL for the whole process!

We currently follow the libevent logic (even in #29) where we repeatedly add and remove a fd from the system epoll or kqueue instance, which means that whenever a fiber would block trying to read (or write) we add the fd to the polling instance, suspend, then on wakeup we remove it, read until we would block again, and repeat (add -> wait -> del).
With a single event loop instance, we could add the fd when an `IO::FileDescriptor` or `Socket` is created (registering for both read/write), then remove it when we close it. That's it.

Advantages:

- the `EventQueue::Node` can go right into `IO::FileDescriptor` & `Socket`;
- no need for the `EventQueue` to map fd => Node: the Node can directly link the IO object;

Disadvantages:
- … (don't share a `fd` across contexts, prefer to move it);

To limit the "one context processes events for another context" problem, there could be a dedicated evloop thread always waiting on the event loop. Ready fibers would then always be globally enqueued, which might defeat the stealing logic (we can grab more fibers out of the global queue instead of dividing).