weissi opened 3 years ago
Hmm, so while I agree with you that we should document things, I think `EventLoopFuture` does not actually express any "happens before" relationship w.r.t. its callbacks. `EventLoopFuture`'s happens-before relationship is derived solely from the `EventLoop`. This can be observed with `EmbeddedEventLoop`, and indeed if any of the above event loops were replaced with an `EmbeddedEventLoop` then the safety guarantees of the program evaporate.
To this end, I think we need to explore two related concepts.

What the memory order is for a thread-safe `EventLoop` implementation: that is, the appropriate barriers and edges that an implementation must provide. Concretely, I believe there are only three actual guarantees required here:

- `EventLoop.{submit,execute,scheduleTask}` happens before the closure passed to that function begins executing (this implies a full barrier associated with these operations).

From these guarantees the `EventLoopFuture` guarantee is built: an `EventLoopFuture` callback happens before (i.e. returns before) any call to an `EventLoopFuture` callback enqueued later by the same thread. This is achieved by the fact that `EventLoopFuture` callbacks are added by way of submitting work to the owning event loop.

I believe that this also derives an important property of event loop task execution, which is that it is linearizable. Enqueuing a task on an event loop happens before the execution of the task body. The execution of a task happens before the execution of any other task submitted to the loop. As a result, tasks submitted to an event loop form a history that meets the requirements of strict serializability: each task submitted to an `EventLoop` task queue completes before the next one begins, and submissions that have a happens-before relationship to each other persist that happens-before relationship into the task queue (assuming deadlines are not a factor).
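To sketch how these edges compose in practice, here is a hypothetical snippet (my own illustration, not from the issue; it assumes the swift-nio package and its standard `MultiThreadedEventLoopGroup`/`execute`/`submit` APIs): two tasks submitted in program order to the same loop, plus a `wait()`, let plain unsynchronised state flow through safely.

```swift
import NIO  // requires the swift-nio package

let group = MultiThreadedEventLoopGroup(numberOfThreads: 1)
let loop = group.next()

var log: [String] = []  // no lock: protected only by the ordering rules above

loop.execute {
    log.append("task 1")  // submission happens before this body runs
}
let done = loop.submit {
    log.append("task 2")  // task 1's body returns before this one begins
}

try done.wait()           // wait() orders the calling thread after the work
print(log)                // ["task 1", "task 2"], if the submission-order rule holds
try group.syncShutdownGracefully()
```

Note that the final `print` is safe to run without synchronisation only because `wait()` provides the last happens-before edge back to the calling thread.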
There is some more detail here to flesh out around the role deadlines play (concretely, that tasks submitted with equal deadlines but without a "happens before" relationship between them are unordered, that `execute`/`submit` both take a deadline of `now()`, etc.), but this is roughly the right shape.
I suggest noting things this way because it allows us to more concretely talk about `EmbeddedEventLoop`. `EmbeddedEventLoop` explicitly breaks the rules of event loops: it does not provide our happens-before relationships at all. `execute`/`submit` and friends meet part 1 and part 2, but they miserably fail at part 3: calling `execute` from multiple threads allows the closure bodies to execute simultaneously.
We allow this for `EmbeddedEventLoop` because its purpose is as a verified mock, not as a complete loop, but it's important that we acknowledge the constraint. We should also consider whether it's worth making `EmbeddedEventLoop` thread-safe but controllable, to avoid this risk, though that's a harder thing to do.
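A sketch of the constraint being described (my illustration, assuming the standard `EmbeddedEventLoop` from swift-nio): all interaction must stay on one thread, with `run()` draining the queued work inline.

```swift
import NIO  // requires the swift-nio package

let loop = EmbeddedEventLoop()
var order: [Int] = []

// Fine: all interaction stays on the single test thread.
loop.execute { order.append(1) }
loop.execute { order.append(2) }
loop.run()  // drains the queued tasks inline, in submission order

// NOT fine: touching the loop from another thread races with the above,
// because EmbeddedEventLoop provides no happens-before edges at all:
// DispatchQueue.global().async { loop.execute { order.append(3) } }  // data race!
```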
A quick note: the above sketch is the guarantee we should be requiring loops to make, but currently it's not the one they actually make. In particular, we have a slightly weaker version of rule 2 for our existing loops, as two tasks submitted with the same deadline are unordered with respect to each other (i.e. they may not execute in program order). That's covered by #1541, and we should fix it as part of this work, or @aphyr will laugh at us.
An ELF does guarantee ordering and therefore happens before:
The above example would work just fine with `EmbeddedEventLoop`; it'd just never thread-switch because it's all embedded into the main thread.
Of course, for everything to make sense, we also require the EL to actually be thread-safe (which `EmbeddedEventLoop` isn't, so you mustn't thread-switch if you use it).
So yeah, your first guarantee (about ELs) sounds about right. We should use the same language as the Swift memory model/Swift Concurrency so as not to confuse matters.
For the second guarantee, we should remove the word "thread". We should require in the `EventLoop` protocol that any given EL can obviously only run one thing at a time, although for me that's kinda implied already. Nevertheless, explicit over implicit.
You're right that the ELF does provide a happens-before relationship to its callbacks that is conceptually somewhat separate from the happens-before relationship provided by event loop submission, good spot. That's probably rule 4 then.
Agree that the second guarantee shouldn't use the word "thread". Here's a proposed rewording, which also subsumes rule 3:

> a call to any closure A passed to any of the above submission functions returns before any call to a closure B submitted later in program execution (that is, where the submission of closure A happens before the submission of closure B) begins executing, if the deadlines are the same
> We should require in the `EventLoop` protocol that any given EL can obviously only run one thing at a time although for me that's kinda implied already.
I believe this is what rule 3 tried to encapsulate, but as I re-read it I realise that it didn't. So here's a new version of rule 3:

> all calls to work items enqueued on an event loop (including I/O operations) must form a strictly serializable history: that is, work items must execute atomically (only one work item may execute at a time, and each runs until completion), and they execute in deadline order, where tasks that have the same deadline must execute in the order they were submitted to the event loop if such an order exists (as defined by the Swift memory model and the SwiftNIO memory model)
Not as concise, but probably more effective.
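As an illustration of the shape rule 3 describes, here's a toy model using a serial `DispatchQueue` (my own sketch; not NIO code, though it's essentially the primitive `NIOTSEventLoop` builds on): items execute atomically, one at a time to completion, in FIFO submission order.

```swift
import Dispatch

// A serial queue has exactly the rule-3 shape: enqueueing an item happens
// before its body executes, and items run one at a time, in FIFO order.
let loop = DispatchQueue(label: "toy.serial.loop")  // serial by default
var history: [Int] = []

for i in 1...5 {
    loop.async { history.append(i) }  // submission order == execution order
}

loop.sync { }  // by FIFO ordering, everything above has run when this returns
print(history)  // [1, 2, 3, 4, 5]
```

The `loop.sync { }` at the end doubles as the happens-before edge that makes reading `history` from the submitting thread safe.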
Sounds good.
Although
> where tasks that have the same deadline must execute in the order they were submitted to the event loop
we don't do that right now. If you make a `NIODeadline` and you `schedule` multiple things at that deadline, they'll run in random order. Two calls to `execute` will run in the correct order, but this isn't actually programmed; it relies on the TSC always advancing and therefore producing a new deadline.
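To make the pitfall concrete, a hypothetical sketch (my illustration, assuming swift-nio's existing `scheduleTask(deadline:)` and `execute` APIs):

```swift
import Foundation
import NIO  // requires the swift-nio package

let group = MultiThreadedEventLoopGroup(numberOfThreads: 1)
let loop = group.next()
let deadline = NIODeadline.now() + .milliseconds(10)

// Two tasks pinned to the *same* NIODeadline: with today's behaviour their
// relative order is unspecified, so "a" and "b" may print in either order.
_ = loop.scheduleTask(deadline: deadline) { print("a") }
_ = loop.scheduleTask(deadline: deadline) { print("b") }

// Two plain `execute` calls do run in order in practice, but only because
// each picks up a fresh `now()` deadline from an always-advancing clock.
loop.execute { print("c") }
loop.execute { print("d") }

Thread.sleep(forTimeInterval: 0.05)  // crude: let the scheduled tasks fire
try group.syncShutdownGracefully()
```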
> that is, work items must execute atomically (only one work item may execute at a time, and runs until completion)
I agree with this. But we should make sure that it's not confusing. For example, if you do this:

```swift
el.execute {
    // BEGIN: work item 1
    let p = el.makePromise(of: Void.self)
    p.futureResult.whenComplete { _ in
        // This runs within work item 1. Is it clear to the user that
        // this is _not_ its own work item?
    }
    p.succeed(())
    // END: work item 1
}
```
@weissi Yup, I already linked to the relevant issue we have to fix above.
As to your second point, I think we need to write the rules in such a way that they can be meaningfully applied to the program. Here, users can only know whether two closures belong to the same work item by examining the documentation: arbitrary closures may or may not be part of the same work item, and it's not possible to know this for all types ahead of time. In this case, `EventLoopFuture` may need clearer docs about when exactly future callbacks execute. I don't think it impacts the memory ordering documentation, though.
Agreed.
Now that Swift has a memory consistency model, we should clarify how `EventLoop`s and `EventLoopFuture`s tie into that. For example, is the following program correct?
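The example program referenced here isn't preserved in this excerpt; the following is a hypothetical reconstruction of the kind of program in question (names and structure are my assumption, based on the description below): a plain `Int` mutated from several threads, relying only on future-chain ordering.

```swift
import NIO  // requires the swift-nio package

let group = MultiThreadedEventLoopGroup(numberOfThreads: 3)
var counter = 0  // deliberately no Lock and no NIOAtomic<Int>

let future = group.next().submit {
    counter += 1              // runs on one event loop thread
}.hop(to: group.next()).map { _ in
    counter += 1              // runs on (potentially) another thread
}

try future.wait()
counter += 1                  // the calling thread, after wait() returns
// Is `counter == 3` guaranteed? That is the question this issue raises.
try group.syncShutdownGracefully()
```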
Clearly it reads and writes `counter` from about 3 different threads without any explicit synchronisation. However, the `EventLoopFuture`s establish a clear before/after memory ordering, and in the current implementations of `SelectableEventLoop` (because of the locks, which are full memory barriers) and `NIOTSEventLoop` (because of `DispatchQueue`'s guarantees) this is actually totally safe. But it's never stated anywhere that this is in fact safe. So should the programmer assume it's unsafe and resort to a `NIOAtomic<Int>` or a `Lock`? I don't think so.

I think we should add to the docs of `EventLoop` and `EventLoopFuture` a few guarantees regarding memory orders. Most importantly:

- `.execute`/`.schedule`/`.submit`/`.wait()` ... have full barriers and are ordered w.r.t. other such calls (and of course future callbacks, which use these underlying methods)