odissei-lifecourse / layered_walk

MIT License

clarify preconditions on input graph #18

Open f-hafner opened 3 weeks ago

f-hafner commented 3 weeks ago

The input graph needs to have certain properties. In particular, all nodes should belong to the same connected component. There may be more details; ask Dakota.

f-hafner commented 3 weeks ago

After talking to Dakota, the preconditions are:

  1. Identify the largest connected component. In NL, this is the family network, with around 15.2 million nodes.
  2. Process edge list of each network:
    • for the network that has the largest connected component, keep all edges where at least one of the nodes is in the connected component. The intuition: although the specific edge types are not directed, if A -> B is connected through family, there must also be an edge B -> A through family. [This may be hard to generalize to other settings, so it should be treated as a special case.]
    • for the remaining networks, keep only edges where both nodes are in the connected component
  3. Run walks starting from all nodes that are in the connected component.
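
A rough sketch of the three steps, in case it helps pin down the preconditions. This is not the code in this repo: the function names (`largest_connected_component`, `filter_edges`), the edge-list format (plain tuples of node ids) and the toy `family`/`work` data are all assumptions made for illustration, and it uses only the standard library.

```python
from collections import defaultdict, deque


def largest_connected_component(edges):
    """Return the node set of the largest component, treating edges as undirected."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    largest = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects one component at a time.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(largest):
            largest = component
    return largest


def filter_edges(edges, component, require_both):
    """Keep edges with both endpoints in `component`, or at least one if require_both is False."""
    keep = all if require_both else any
    return [(a, b) for a, b in edges if keep(n in component for n in (a, b))]


# Hypothetical toy data: two layers given as edge lists.
family = [(1, 2), (2, 3), (4, 5)]
work = [(1, 3), (3, 4), (5, 6)]

# Step 1: largest connected component, taken from the family layer.
component = largest_connected_component(family)

# Step 2: special case for the layer that defines the component,
# stricter rule for every other layer.
family_kept = filter_edges(family, component, require_both=False)
work_kept = filter_edges(work, component, require_both=True)

# Step 3: walks would start from every node in the component.
start_nodes = sorted(component)
```

In this sketch the component is computed from the same edge list that gets the "at least one node" rule, so that rule keeps exactly the edges inside the component. Whether that also holds for the real family network depends on how its edges are stored.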

This is also important for #10