probe-lab / go-kademlia

Generic Go Kademlia implementation

Keeping the Scheduler's Single Worker Awake #106

Open guillaumemichel opened 1 year ago

guillaumemichel commented 1 year ago

The Scheduler's Single Worker model is meant to enforce sequential execution, allowing for stable testing and reproducible protocol simulation. In a single-threaded simulation, the Single Worker is the only one adding elements to and consuming elements from the scheduler. When there are no more actions to run at a given time, the Single Worker advances the scheduler's fake clock to the next scheduled action time and continues from there. Hence, the Single Worker can complete the simulation sequentially without ever sleeping.

However, in a real-world scenario, multiple goroutines need to add elements to the scheduler, and only the Single Worker consumes them, one by one. This means that once the action queue is empty, the Single Worker cannot simply sleep until the next scheduled action, because another goroutine may add a task to the queue and the Single Worker is expected to handle it ASAP. There are multiple ways to make sure that a Single Worker is awake and ready to take a task when necessary:

  1. Block the Single Worker. Instead of sleeping, the Single Worker blocks on a select and is unblocked by whichever of these events happens first: (1) the next scheduled action time from the Scheduler is reached, (2) another action is added to the queue, signalled by a write to the Alarm (Wakeup?) channel, or (3) the context is cancelled. (See the sketch after this list.)
  2. Disposable Single Workers. Once the Single Worker is done with the event queue, it can go home. When another goroutine (not a Single Worker) adds a task to the event queue, it verifies (through the scheduler) whether there is a Single Worker on duty. If there is no Single Worker, it spawns a new one (go createSingleWorker()) to tackle the new task (and possibly the ones that arrive while the Single Worker is on duty). If there is already a Single Worker on duty, that worker will run the newly added task. This solution requires an additional goroutine to keep track of scheduled events and to spawn a Single Worker when a scheduled event becomes due.
  3. Application polling. The application making use of the DHT implementation/go-kademlia (i.e. Kubo for the go-libp2p-kad-dht implementation) is responsible for periodically polling the queue and running any actions that have been added by other goroutines. This approach is inspired by rust-libp2p. During a poll, all actions from the event queue are run until the queue is empty; control is then returned to the caller (the application). Other goroutines may add events to the queue, or scheduled events may become overdue, and these will be handled during the next poll. This approach seems unfit for Kubo, since Kubo doesn't implement a hierarchical state machine architecture.
  4. Other suggestions?
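
For illustration, here is a minimal sketch of option 1. The scheduler shape, the Alarm channel, and the NextActionTime signature are assumptions made for this example, not the actual go-kademlia API:

```go
package main

import (
	"context"
	"time"
)

type Action func()

type Scheduler struct {
	Alarm   chan struct{} // written to whenever another goroutine enqueues an action
	actions chan Action   // actions ready to run now
}

// NextActionTime reports the time of the next planned action, if any.
// (Hypothetical signature, placeholder body.)
func (s *Scheduler) NextActionTime() (time.Time, bool) {
	return time.Time{}, false
}

// RunSingleWorker drains ready actions, then blocks until the next planned
// action is due, a new action arrives, or the context is cancelled.
func (s *Scheduler) RunSingleWorker(ctx context.Context) {
	for {
		// Run everything that is already queued.
	drain:
		for {
			select {
			case a := <-s.actions:
				a()
			default:
				break drain
			}
		}

		// A nil channel blocks forever in a select, so when nothing is
		// planned the worker only wakes on the Alarm or on cancellation.
		var due <-chan time.Time
		var timer *time.Timer
		if next, ok := s.NextActionTime(); ok {
			timer = time.NewTimer(time.Until(next))
			due = timer.C
		}

		select {
		case <-due: // (1) the next planned action is due
		case <-s.Alarm: // (2) another goroutine added an action
		case <-ctx.Done(): // (3) context cancelled
			if timer != nil {
				timer.Stop()
			}
			return
		}
		if timer != nil {
			timer.Stop()
		}
	}
}
```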

It is possible to implement multiple schedulers / coordinators, one for each of the described behaviors. An application will only use a single behavior, but different applications may have different needs. However, I suggest we focus on implementing only the behavior required by Kubo/go-libp2p-kad-dht for now.

@iand @dennis-tra WDYT?

iand commented 1 year ago

My strong preference is to rely on select and blocking for synchronization and coordination. As far as possible we should avoid replicating features that Go already provides at the language level. A component should indicate it is ready by sending on a channel, blocking until a coordinating component reads the action.

The original version of the coordinator drove the state machines via an event channel and goroutine, which is very similar to proposal 1:

https://github.com/plprobelab/go-kademlia/blob/40722c4b25b192617cbfa8b789d835ebd8078792/examples/statemachine/handler.go#L44-L64

This was removed to align more closely with the scheduler, on the assumption that something else would be driving the scheduler.

The scheduler model is complicated by the use of future planned actions. This means we have to implement a cron with cancellation rather than just rely on a simple select loop. I think that would look a lot like the simulator code: it would send actions on a channel when the appropriate time is reached. Note, however, that this leads to non-deterministic behaviour: if two channels are ready in a select statement then Go makes a pseudorandom choice between them.
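
As a rough sketch of that cron-with-cancellation shape (hypothetical names; for brevity it handles one plan at a time, whereas a real version would keep a time-ordered queue and support re-planning/cancellation of entries):

```go
package main

import (
	"context"
	"time"
)

type Action func()

type planned struct {
	at     time.Time
	action Action
}

// dispatchPlanned waits until each planned action is due, then sends it
// on out. Cancellation interrupts the wait at every blocking point.
func dispatchPlanned(ctx context.Context, plans <-chan planned, out chan<- Action) {
	for {
		select {
		case p := <-plans:
			timer := time.NewTimer(time.Until(p.at))
			select {
			case <-timer.C:
				// If the consumer's select has several ready channels,
				// Go picks pseudorandomly: the non-determinism noted above.
				select {
				case out <- p.action:
				case <-ctx.Done():
					return
				}
			case <-ctx.Done():
				timer.Stop()
				return
			}
		case <-ctx.Done():
			return
		}
	}
}
```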

Wherever we can we should try to eliminate work that relies on timed behaviour. The query Pool state machine achieves this by treating a timeout as a mechanism to free up capacity: if the pool is under capacity then timeouts won't be used (I copied this idea from rust-libp2p). This does require the state machine to be polled, but that is low cost compared with writing an efficient cron scheduler. (A later version of the coordinator included a heartbeat: https://github.com/plprobelab/go-kademlia/blob/7dde002254c492179d2e728ec30f57df27f5aab7/coord/coordinator.go#L103-L106)
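
To make the idea concrete, here is a sketch of a pool that only consults deadlines when it is at capacity; the type and method are hypothetical, not the actual query Pool:

```go
package main

import "time"

type inflight struct {
	startedAt time.Time
	// ... query state elided
}

type Pool struct {
	capacity int
	timeout  time.Duration
	queries  []inflight
}

// TryAdd admits a new query, evicting a timed-out one if the pool is full.
// Deadlines are only consulted when the pool is at capacity, so no timer
// machinery runs while there is spare room.
func (p *Pool) TryAdd(q inflight, now time.Time) bool {
	if len(p.queries) < p.capacity {
		p.queries = append(p.queries, q)
		return true
	}
	for i, f := range p.queries {
		if now.Sub(f.startedAt) > p.timeout {
			p.queries[i] = q // the timeout freed up capacity
			return true
		}
	}
	return false // full and nothing has timed out yet
}
```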

Note that in the rust-libp2p model poll advances the state machines by a minimal amount of work, not necessarily to completion. But it can do that since it is state machines all the way down. We have to interface with go-libp2p which has a different execution model, so we need to bridge by using some kind of long running execution loop.

In reality I think we need more than one "action queue", otherwise it's easy for one component to impact the performance of another. Think of making a request to send a message where the response is scheduled once it is received: another component can fill the action queue with thousands of actions in the meantime, delaying the processing of the response, perhaps even exceeding a pre-planned timeout. We should also be careful with fan-out behaviour, for example the result of a find closer nodes action producing 20 new actions that attempt to suggest the 20 new nodes to the routing table.

I have an experimental dht/coordinator that separates IO from internally generated actions. Externally injected actions are also separate in this experiment because I think it's important that we can prioritize work and provide backpressure. For example we should prioritize completion of queries that are making progress over starting new queries. The execution model needs to be able to support this.
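
A sketch of what that prioritization could look like using nested selects, a common Go idiom; the channel names are illustrative:

```go
package main

import "context"

type Action func()

// runPrioritized drains the high-priority responses queue before taking
// work from newQueries. The priority is not strict: while blocked in the
// second select, Go picks pseudorandomly if both channels become ready.
func runPrioritized(ctx context.Context, responses, newQueries <-chan Action) {
	for {
		// Try the high-priority queue first, without blocking.
		select {
		case a := <-responses:
			a()
			continue
		default:
		}
		// Otherwise block on either queue, or cancellation.
		select {
		case a := <-responses:
			a()
		case a := <-newQueries:
			a()
		case <-ctx.Done():
			return
		}
	}
}
```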

iand commented 1 year ago

A side note on the existing event.EventQueue interface. As it stands it encourages racy behaviour by providing a Size method independent of Queue/Dequeue. The ChanQueue implementation doesn't protect against this race (it would need a mutex guarding access to the channel). This pattern is used in the event.Empty function used by SimpleScheduler.NextActionTime (and also in ChanQueue.Dequeue itself).
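
A self-contained illustration of the check-then-act race; the Queue interface here is a stand-in, not the actual event.EventQueue:

```go
package main

type Queue interface {
	Size() int
	Dequeue() any // stand-in; assume it returns nil when the queue is empty
}

// runNext checks emptiness and then dequeues as two separate steps, so
// another goroutine can drain the queue in between: the classic
// check-then-act race that a mutex around both steps would prevent.
func runNext(q Queue, run func(any)) {
	if q.Size() > 0 {
		a := q.Dequeue() // the queue may already be empty again here
		if a != nil {
			run(a)
		}
	}
}
```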

If we were to adopt bare channels rather than wrap them in an interface then we could use queues in select statements and rely on Go's handling of the synchronization.
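
With a bare channel, the emptiness check and the receive happen atomically inside a single select, so the race above cannot occur. A minimal sketch:

```go
package main

type Action func()

// tryRunNext receives and runs one action if the queue is non-empty.
// The receive and the "is it empty?" decision are a single atomic
// select operation, handled by the Go runtime.
func tryRunNext(queue <-chan Action) bool {
	select {
	case a := <-queue:
		a()
		return true
	default: // queue was empty at this instant
		return false
	}
}
```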

guillaumemichel commented 1 year ago

@iand I agree with everything you wrote!

> Wherever we can we should try to eliminate work that relies on timed behaviour. The query Pool state machine achieves this by treating a timeout as a mechanism to free up capacity: if the pool is under capacity then timeouts won't be used (I copied this idea from rust-libp2p). This does require the state machine to be polled, but that is low cost compared with writing an efficient cron scheduler. (A later version of the coordinator included a heartbeat:

👍🏻

> In reality I think we need more than one "action queue", otherwise it's easy for one component to impact the performance of another. Think of making a request to send a message where the response is scheduled once it is received: another component can fill the action queue with thousands of actions in the meantime, delaying the processing of the response, perhaps even exceeding a pre-planned timeout. We should also be careful with fan-out behaviour, for example the result of a find closer nodes action producing 20 new actions that attempt to suggest the 20 new nodes to the routing table.

That would be awesome!

> I have an experimental dht/coordinator that separates IO from internally generated actions. Externally injected actions are also separate in this experiment because I think it's important that we can prioritize work and provide backpressure. For example we should prioritize completion of queries that are making progress over starting new queries. The execution model needs to be able to support this.

This sounds great!

> If we were to adopt bare channels rather than wrap them in an interface then we could use queues in select statements and rely on Go's handling of the synchronization.

Agreed!