dusk-network / dusk-blockchain

Reference implementation of the DUSK Network node, written in Golang
MIT License

Implement priority queue on inbound messages #1187

Open · autholykos opened this issue 2 years ago

autholykos commented 2 years ago

Describe what you want implemented: A queue that sidesteps the non-determinism of the Go scheduler: https://github.com/dusk-network/dusk-blockchain/blob/f44e1055887cf2a71fd05ec346fc3461cf223d44/pkg/p2p/peer/peer.go#L340

Describe "Why" this is needed There is a strong suspicion that some consensus messages might be unlawfully delayed by the golang scheduler, potentially creating problems.

Describe alternatives you've considered: Splitting the inbound messages between a priority queue and the normal scheduling path, which stays as is.

Additional context: This might be related to the root cause of #1156

l0k18 commented 2 years ago

After considering the issue ad nauseam, I came up with this final, as-simple-as-possible way to implement it:

P = priority, N = non-priority

For this math, both counts are offset by one (1 plays the role of zero), because an actual zero destroys the proportion, so the counts behave a bit like ordinals.

(P + 1) / (N + 1) >= 2

P at 3 and N at 1 satisfies this. So do these (P, N) pairs:

(1, 0), (2, 0), (3, 0), (4, 1), (5, 2), (6, 2), (7, 3)

When this condition is satisfied, the queue consumes a priority item. Otherwise, it consumes a non-priority item.
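A minimal sketch of that rule in Go (the function name `pickPriority` and its parameters are illustrative, not taken from the actual patch):

```go
// pickPriority reports whether the next item should come from the
// priority queue, given the current lengths of both queues.
// It encodes the rule (P+1)/(N+1) >= 2; the +1 offsets keep the
// ratio defined when either queue is empty.
func pickPriority(p, n int) bool {
	return (p+1)/(n+1) >= 2
}
```

For example, pickPriority(3, 1) gives (3+1)/(1+1) = 2, so the priority queue is drained, while pickPriority(2, 1) falls below 2, so the non-priority queue gets its turn.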

l0k18 commented 2 years ago

Further refinement: add a modulus-based forced selection to ensure the less-filled queue still gets picked.
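One possible reading of that refinement, as a sketch (the pop counter, the `forceEvery` interval, and the override logic are assumptions, not taken from the actual change): every fixed number of pops, the ratio rule is bypassed and the less-filled queue is drained instead.

```go
// forceEvery is an illustrative interval; the real value and mechanism
// may differ in the actual change.
const forceEvery = 8

// pickWithOverride applies the (P+1)/(N+1) >= 2 rule, but every
// forceEvery-th pop it forces selection of the less-filled queue so
// neither side can be starved indefinitely.
func pickWithOverride(popCount, p, n int) bool {
	if popCount%forceEvery == 0 {
		return p < n // pick the shorter (less-filled) queue
	}
	return (p+1)/(n+1) >= 2
}
```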

closes #1206

l0k18 commented 2 years ago

Current status is a simple test that demonstrates and validates that the queue does not neglect non-priority items when it is flooded with priority messages.

The previous loop, which spawned an unbounded number of goroutines to process messages, has been replaced with two goroutines: one receives messages, the other picks messages off the queue and processes them in a single thread.

To keep everything asynchronous while avoiding spinlocks and polling, a multi-lane semaphore was added that makes the consumer block until a producer delivers a message. The semaphore channel is loaded at the same time as the message channel, and the reader waits on the semaphore before reading from the queue. Until there is an item on the semaphore channel there cannot be a message, so any call to Pop waits until Push has been called and has supplied a new semaphore token.
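A self-contained sketch of that Push/Pop scheme (the type, field, and function names here are illustrative; the real implementation lives in the peer package and may differ):

```go
package queue

import "sync"

// Message stands in for the real wire message type.
type Message interface{}

// Queue holds two lanes plus a semaphore channel carrying one token
// per queued message, so a consumer can block without polling.
type Queue struct {
	mu       sync.Mutex
	priority []Message
	normal   []Message
	sem      chan struct{}
}

func New(capacity int) *Queue {
	return &Queue{sem: make(chan struct{}, capacity)}
}

// Push stores the message in its lane and loads the semaphore channel
// at the same time, signalling the consumer that an item is available.
// If the semaphore buffer is full, Push blocks, giving backpressure.
func (q *Queue) Push(m Message, prio bool) {
	q.mu.Lock()
	if prio {
		q.priority = append(q.priority, m)
	} else {
		q.normal = append(q.normal, m)
	}
	q.mu.Unlock()
	q.sem <- struct{}{}
}

// Pop blocks on the semaphore until Push has delivered a message, then
// applies the (P+1)/(N+1) >= 2 rule to decide which lane to drain.
func (q *Queue) Pop() Message {
	<-q.sem // no token means no message, so wait here instead of spinning
	q.mu.Lock()
	defer q.mu.Unlock()
	var m Message
	if len(q.normal) == 0 || (len(q.priority)+1)/(len(q.normal)+1) >= 2 {
		m, q.priority = q.priority[0], q.priority[1:]
	} else {
		m, q.normal = q.normal[0], q.normal[1:]
	}
	return m
}
```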

The only open question is whether the work queue should be parallelized. Currently it is designed to run as a single thread; making it multi-threaded is just a matter of adding a loop that spawns the queue-consumer goroutine multiple times, as sketched below.
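A minimal sketch of that point, reusing the illustrative `Queue` and `Message` types from the previous sketch (`runConsumers` is a hypothetical helper, not part of the codebase):

```go
// runConsumers starts n goroutines, each blocking on Pop and handling
// one message at a time. With n == 1 this matches the current
// single-threaded design; raising n parallelizes the work queue.
func runConsumers(q *Queue, n int, handle func(Message)) {
	for i := 0; i < n; i++ {
		go func() {
			for {
				handle(q.Pop())
			}
		}()
	}
}
```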

l0k18 commented 2 years ago

The dangling element before this issue can be closed is that, in the test, the callback fails to execute, which is why a timeout failure mode was added.