monadic-xyz / gossip

Epidemic Broadcast Trees implementation using HyParView
BSD 3-Clause "New" or "Revised" License
6 stars 0 forks source link

Optimised datastructure for HyParView “views” #1

Open kim opened 6 years ago

kim commented 6 years ago

106 introduced HashSet instead of Set as the datastructure holding the “views” of the network from the POV of one node. The motivation was mainly the node identity type n needs to carry information about the (last known) network address of a peer, in order to be able to reconnect to it. Imposing an Ord constraint on n seemed at least contrived.

One issue now is that HashSets don’t track their size, making size O(n) (as opposed to O(1) for Set); see tibbe/unordered-containers#138. It‘s not a big problem, since the number of elements stored is fairly small, but less than optimal because size is used a lot.

Another not-so-nicety is that HashSet doesn’t have an indexing function (a la elemAt), requiring conversion to a list in order to select a random element (again occurs often in the algorithm).

Two approaches come to mind:

  1. Wrap HashSet such that it’s size is tracked (could be tricky with updates), and a separate Vector or some such is maintained for the purpose of drawing random elements.
  2. Use an IntMap underneath and explicitly convert from Hashable to Int (Achtung: collisions)
tsenart commented 6 years ago

Parent: #123

kim commented 6 years ago

An acceptable solution might also be stm-containers >= 1, which has a O(1) size operation, with random element selection through UnfoldlM.

Unfortunately, it will be a while until that version appears in lts, as it presumably breaks a lot of packages. In our codebase it breaks spock.

adinapoli-mndc commented 5 years ago

@kim Just realised that now we do have stm-containers >= 1 -- do you think we can crack on with this one, then?

kim commented 5 years ago

@adinapoli-mndc I am wondering if what we come up with for #6 could influence the choice of datastructure here. E.g. some sort of priority queue / heap might make sense, so if we move to stm-containers now, we would have to retrofit that on top.

adinapoli-mndc commented 5 years ago

That's a valid concern -- better tackle #6 first, and then let it guide the choice of the data structure here.