Right now the GossipManager joins exactly one partition to the cluster. At thousands of nodes the gossip updates alone become very noisy, and running one process per partition is an inefficient use of resources.
We should allow a single process to host multiple partitions and let one GossipManager manage all of them, similar to Raft multi-group.
We can leverage Go's concurrency model so that, for example, a single process on 4 cores manages 4 partitions with 1 GossipManager handling them all.
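As a rough sketch of the idea above, one manager could own several local partitions and collect their state concurrently into a single gossip payload. Everything here is an assumption for illustration: `Partition`, `PartitionState`, `NewGossipManager`, `Snapshot`, and the `node-index` naming scheme are hypothetical and are not the existing API.

```go
package main

import (
	"strconv"
	"sync"
)

// Partition is a hypothetical local partition; only topic lengths are modelled.
type Partition struct {
	Name   string
	mu     sync.Mutex
	topics map[string]int
}

// Enqueue records one more message on the given topic.
func (p *Partition) Enqueue(topic string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.topics[topic]++
}

// Lengths returns a copy of the topic lengths so callers never share the map.
func (p *Partition) Lengths() map[string]int {
	p.mu.Lock()
	defer p.mu.Unlock()
	out := make(map[string]int, len(p.topics))
	for t, n := range p.topics {
		out[t] = n
	}
	return out
}

// PartitionState is the hypothetical per-partition entry in a gossip update.
type PartitionState struct {
	Name         string
	TopicLengths map[string]int
}

// GossipManager hosts many local partitions under one gossip identity.
type GossipManager struct {
	NodeName   string
	Partitions []*Partition
}

// NewGossipManager creates n partitions named "<node>-<index>" (assumed scheme).
func NewGossipManager(node string, n int) *GossipManager {
	g := &GossipManager{NodeName: node}
	for i := 0; i < n; i++ {
		g.Partitions = append(g.Partitions, &Partition{
			Name:   node + "-" + strconv.Itoa(i),
			topics: make(map[string]int),
		})
	}
	return g
}

// Snapshot reads every partition in its own goroutine and returns one state
// per partition, in partition order, ready to be sent as a single update.
func (g *GossipManager) Snapshot() []PartitionState {
	states := make([]PartitionState, len(g.Partitions))
	var wg sync.WaitGroup
	for i, p := range g.Partitions {
		wg.Add(1)
		go func(i int, p *Partition) {
			defer wg.Done()
			states[i] = PartitionState{Name: p.Name, TopicLengths: p.Lengths()}
		}(i, p)
	}
	wg.Wait()
	return states
}
```

Each goroutine writes to its own slice index, so the snapshot needs no extra locking and the result order is stable. One gossip message per node then carries the state of all of its partitions.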
Thinking about this more, it shouldn't be too hard to make this change. Major changes:
Initialization of partitions
GossipManager needs to know about multiple partitions
GossipManager needs to send out info for every partition it manages so that topic lengths can still be scanned (the payload becomes an array of the existing per-partition structs)
Partitions should be aware of which other partitions live on the same node, so that when they receive a dequeue request they can't fulfill they can favor a local partition
The GossipName cannot be the partition name anymore
We need to map remote partitions to node names so that when a node leaves we can get rid of all partitions it managed
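The node-to-partition mapping from the last item could look roughly like the sketch below. It assumes a gossip update carries a node name plus a list of per-partition entries; `PartitionInfo`, `NodeUpdate`, and `Registry` are hypothetical names introduced here, not existing types.

```go
package main

import (
	"sort"
	"sync"
)

// PartitionInfo is a hypothetical per-partition record gossiped by a node.
type PartitionInfo struct {
	Name         string
	TopicLengths map[string]int
}

// NodeUpdate is a hypothetical gossip payload: one node, many partitions.
type NodeUpdate struct {
	NodeName   string
	Partitions []PartitionInfo
}

// Registry tracks which remote partitions each node manages.
type Registry struct {
	mu     sync.RWMutex
	byNode map[string]map[string]PartitionInfo // node -> partition -> info
	owner  map[string]string                   // partition -> node
}

func NewRegistry() *Registry {
	return &Registry{
		byNode: make(map[string]map[string]PartitionInfo),
		owner:  make(map[string]string),
	}
}

// dropLocked forgets a node and its partitions; the caller must hold r.mu.
func (r *Registry) dropLocked(node string) []string {
	removed := make([]string, 0, len(r.byNode[node]))
	for name := range r.byNode[node] {
		removed = append(removed, name)
		// Only clear ownership if this node is still the recorded owner.
		if r.owner[name] == node {
			delete(r.owner, name)
		}
	}
	delete(r.byNode, node)
	sort.Strings(removed)
	return removed
}

// Apply replaces everything known about a node with the update's contents.
func (r *Registry) Apply(u NodeUpdate) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.dropLocked(u.NodeName)
	parts := make(map[string]PartitionInfo, len(u.Partitions))
	for _, p := range u.Partitions {
		parts[p.Name] = p
		r.owner[p.Name] = u.NodeName
	}
	r.byNode[u.NodeName] = parts
}

// RemoveNode drops every partition the node managed and returns their names.
func (r *Registry) RemoveNode(node string) []string {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.dropLocked(node)
}

// Owner reports which node currently manages the given partition.
func (r *Registry) Owner(partition string) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	node, ok := r.owner[partition]
	return node, ok
}
```

Keeping both directions of the mapping makes a node-leave event a single `RemoveNode` call, while `Owner` lets a partition tell whether a peer partition is local or remote.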
Another point is heartbeating. If the GossipManager stops heartbeating, we lose every partition it manages at once instead of just one.