probe-lab / zikade

A Go implementation of the libp2p Kademlia DHT specification
Other
8 stars 3 forks source link

fix: avoid deadlocks in query and broadcast behaviours #63

Closed iand closed 9 months ago

iand commented 9 months ago

The query and broadcast behaviours notify query initiators of the ongoing progress of a query or broadcast. They also notify when the query or broadcast has finished.

This set of changes fixes two types of deadlock:

  1. slow consumer - originally the notification was sent on a channel with no defined behaviour for slow consumers of the channel. This could cause the query behaviour to block preventing any other query from progressing. This was particularly evident when a query completes since the last successful response was notified followed immediately by a finished notification to the same channel. This has been fixed by intruducing a QueryMonitor type that buffers progress events that cannot be notified. The QueryMonitor also uses a separate channel for notifying the completion of a query or broadcast and has better defined semantics for when notifications will be sent and when they will stop being sent.
  2. reentrancy - originally the Notify and Perform methods of the behaviours were guarded by a single mutex since both could advance the state of the embedded state machine. However, notifying a query initiator could cause the intiator to call the Notify method to stop the query. Since the notification was made while the mutex was held this would deadlock on the call to Notify. This has been fixed by separating the locking behaviour between Notify and Perform and refactoring the logic to ensure that the state machines are advanced by Perform only. Notify now only queues the inbound event.