kim commented 6 years ago

106 introduced `HashSet` instead of `Set` as the datastructure holding the “views” of the network from the POV of one node. The motivation was mainly the node identity type `n` needs to carry information about the (last known) network address of a peer, in order to be able to reconnect to it. Imposing an `Ord` constraint on `n` seemed at least contrived.

One issue now is that HashSets don’t track their size, making size O(n) (as opposed to O(1) for Set); see tibbe/unordered-containers#138. It‘s not a big problem, since the number of elements stored is fairly small, but less than optimal because size is used a lot.

Another not-so-nicety is that HashSet doesn’t have an indexing function (a la elemAt), requiring conversion to a list in order to select a random element (again occurs often in the algorithm).

Two approaches come to mind:

Wrap HashSet such that it’s size is tracked (could be tricky with updates), and a separate Vector or some such is maintained for the purpose of drawing random elements.
Use an IntMap underneath and explicitly convert from Hashable to Int (Achtung: collisions)

tsenart commented 6 years ago

Parent: #123

kim commented 6 years ago

An acceptable solution might also be stm-containers >= 1, which has a O(1) size operation, with random element selection through UnfoldlM.

Unfortunately, it will be a while until that version appears in lts, as it presumably breaks a lot of packages. In our codebase it breaks spock.

adinapoli-mndc commented 5 years ago

@kim Just realised that now we do have stm-containers >= 1 -- do you think we can crack on with this one, then?

kim commented 5 years ago

@adinapoli-mndc I am wondering if what we come up with for #6 could influence the choice of datastructure here. E.g. some sort of priority queue / heap might make sense, so if we move to stm-containers now, we would have to retrofit that on top.

adinapoli-mndc commented 5 years ago

That's a valid concern -- better tackle #6 first, and then let it guide the choice of the data structure here.

monadic-xyz / gossip

Optimised datastructure for HyParView “views” #1