nsqio / go-nsq

The official Go package for NSQ
MIT License
2.59k stars 444 forks source link

consumer: lower old RDYs first, then assign to new connection #341

Closed karalabe closed 2 years ago

karalabe commented 2 years ago

Currently when a consumer is connected to a new NSQD, the connection is added to the pool of connections and then a loop iterates over all of them, rebalancing the RDY values. The iteration order of a map is random, thus it can happen that the consumer tries to assign the RDY value to the new connection, before decreasing the existing ones. This leads to a RDY of 0 being assigned.

This issue is even more pronounced when there are only two connections: initially all the RDY are assigned the existing connection, and when a new NSQD is connected, the old one needs to be cut in half and the half just taken away gets assigned to the new connection. If map iteration happens in reverse order - starting with the new connection - it will get assigned 0 RDY and then the old connection gets cut in half. The end result is blocked communication on the new NSQD instance until a new round of rebalance is triggered.

This issue may be less relevant in long running processes, but it is very annoying in tests where we're adding and remocing NSQD instances and the test hangs from time to time due to a flaky RDY allocation.

The issue is fixed by fist iterating over all the existing connections and rebalancing them, and only at the very end calling maybeUpdateRDY on the new NSQD instance.

mreiferson commented 2 years ago

LGTM, thanks