Order in target selection is not deterministic

CSHVienna / NetworkInequalities

Repository of the netin package, which provides the random scale-free graph generators proposed by members of the Network Inequality CSH group.

https://cshvienna.github.io/NetworkInequalities/


Order in target selection is not deterministic #18

Closed: mannbach closed this issue 1 year ago

mannbach commented 1 year ago

In the network generation process, when a new node wants to link to an existing one, we collect potential target candidates together with the probability of connecting to each of them. See, for instance, pah.get_target_probabilities. It returns a list of probabilities and a set of target nodes.

The order in which a set is iterated is not defined and can vary from run to run, even when the set's contents are identical. We therefore cannot be sure that the first probability in the list belongs to the node that appears first when iterating over the corresponding set. This can assign the wrong connection probability to nodes in effectively random ways.
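A minimal sketch of the hazard described above. The node labels and degree values are hypothetical and are not taken from netin; string labels are used only because Python randomizes string hashes per process unless PYTHONHASHSEED is fixed, which makes the effect easy to observe across runs.

```python
# Hypothetical node labels and degrees, for illustration only.
nodes = ["a", "b", "c", "d"]
degrees = {"a": 1, "b": 2, "c": 3, "d": 4}

# Probabilities are computed in the order of the `nodes` list.
total = sum(degrees.values())
probs = [degrees[n] / total for n in nodes]

# The candidates are returned as a set, which carries no order guarantee.
targets = set(nodes)

# Unsafe: pairs each probability with whatever node the set yields next.
# The resulting mapping may differ between runs of the same script.
unsafe = dict(zip(targets, probs))

# Safe: pairs each probability with the node it was computed for.
safe = dict(zip(nodes, probs))

print("unsafe:", unsafe)
print("safe:  ", safe)
```

Running the script several times in fresh interpreter processes may print different `unsafe` mappings, while `safe` stays the same.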

We should do one of the following:

- unify both in a dictionary, or
- additionally return an ordered list of candidate nodes, or
- remove the target set (typically, only the probabilities are extracted).
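As a sketch of the first option, the candidates and their probabilities could travel together in one dictionary, so the pairing never depends on set iteration order. The helper names `target_probabilities` and `sample_target`, and the degree-plus-one weighting, are assumptions made for this illustration and are not netin's API.

```python
import random


def target_probabilities(available_nodes, degree):
    """Hypothetical helper: map each candidate node to its link probability.

    Weights are degree + 1 (an assumed preferential-attachment rule).
    Python dicts keep insertion order (3.7+), so the mapping follows
    the order of `available_nodes`.
    """
    weights = [degree[n] + 1 for n in available_nodes]
    total = sum(weights)
    return {n: w / total for n, w in zip(available_nodes, weights)}


def sample_target(prob_by_node, rng):
    """Draw one target; nodes and probabilities come from the same dict."""
    nodes = list(prob_by_node)
    probs = list(prob_by_node.values())
    return rng.choices(nodes, weights=probs, k=1)[0]


prob_by_node = target_probabilities([5, 2, 9], {5: 0, 2: 1, 9: 2})
picked = sample_target(prob_by_node, random.Random(42))
print(prob_by_node, picked)
```

With a seeded generator, the same input list yields the same draw on every run, which is the reproducibility the issue asks for.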

lisette-espin commented 1 year ago

I changed the methods get_target and get_target_probabilities so that they receive a list of available_nodes (the nodes that are already in the network, often obtained from graph.get_potential_nodes_to_connect) and a dictionary special_targets, where the keys are node ids and the values can be anything (currently a weight used to bias the selection of those nodes).

The method get_target_probabilities returns probs, an array of probabilities that sum to 1, where each element corresponds to one element (a node) in available_nodes. It also returns targets. In most cases, targets is the same as available_nodes. In other cases, such as TriadicClosure, it is a subset consisting of the triadic closure candidates.
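The interface described above might look roughly like the following. This is a simplified sketch and not the code from the linked commit: the function name `get_target_probabilities_sketch`, the default weight of 1, and the use of plain lists instead of arrays are all assumptions.

```python
def get_target_probabilities_sketch(available_nodes, special_targets=None):
    """Return (probs, targets) with probs[i] belonging to targets[i].

    available_nodes: ordered list of node ids already in the network.
    special_targets: dict of node id -> weight that biases selection
                     (assumed here; nodes not listed get weight 1).
    """
    special_targets = special_targets or {}
    targets = list(available_nodes)  # a list, so the order is well defined
    weights = [float(special_targets.get(n, 1.0)) for n in targets]
    total = sum(weights)
    probs = [w / total for w in weights]
    return probs, targets


probs, targets = get_target_probabilities_sketch([3, 1, 2], {1: 3.0})
for node, p in zip(targets, probs):
    print(node, p)
```

Because both return values are ordered sequences built in the same pass, zipping them is safe.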

Please review the latest commit: https://github.com/CSHVienna/NetworkInequalities/commit/a1efff0145c41b3f9c8dc87d19ff193452cf8dd8

mannbach commented 1 year ago

Thanks! For triadic closure, I would then call pah.get_target_probabilities with available_nodes set to the limited targets-selection returned by triadic_closure.get_target_probabilities, right?

lisette-espin commented 1 year ago

Sort of. If you look at L187 in tc.get_target_probabilities: when triadic closure is uniform, the available_nodes are the triadic closure candidates in special_targets. Otherwise, it takes both available_nodes and special_targets.
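One possible reading of that branch, as a hedged sketch only. The function name, the `tc_uniform` flag, and the additive weighting in the non-uniform case are assumptions for illustration; the actual logic lives in tc.get_target_probabilities and may differ.

```python
def tc_target_probabilities_sketch(available_nodes, special_targets, tc_uniform=True):
    """Return (probs, targets) for a triadic-closure step (illustrative only).

    special_targets: dict of triadic closure candidate -> weight.
    """
    if tc_uniform and special_targets:
        # Uniform case: only the triadic closure candidates are eligible.
        # Dict keys keep insertion order, so the pairing is deterministic.
        targets = list(special_targets)
        probs = [1.0 / len(targets)] * len(targets)
        return probs, targets

    # Otherwise: consider every available node and bias the candidates.
    # The "1 + weight" rule is an assumption, not netin's formula.
    targets = list(available_nodes)
    weights = [1.0 + float(special_targets.get(n, 0.0)) for n in targets]
    total = sum(weights)
    return [w / total for w in weights], targets


uniform_probs, uniform_targets = tc_target_probabilities_sketch(
    [1, 2, 3, 4], {3: 2.0, 1: 1.0}, tc_uniform=True)
biased_probs, biased_targets = tc_target_probabilities_sketch(
    [1, 2, 3, 4], {3: 2.0, 1: 1.0}, tc_uniform=False)
print(uniform_targets, uniform_probs)
print(biased_targets, biased_probs)
```

In both branches the probabilities and targets are built in a single pass over an ordered sequence, so index i in one always refers to index i in the other.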