snap-stanford / snap

Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library.

node2vec adds extra node "0" #161

Open NoahAmsel opened 5 years ago

NoahAmsel commented 5 years ago

Take the Karate graph example. The input, snap/examples/node2vec/graph/karate.edgelist, has 34 vertices numbered 1 to 34. The output, snap/examples/node2vec/emb/karate.emb, has 35 vectors numbered 0 to 34. At a minimum, the numbering in the example input should be changed so users know to include a node 0.

Wang-Yu-Qing commented 4 years ago

I ran into the same problem.

arodriguezca commented 4 years ago

I don't think you should include node 0. The Karate example is run with the directed option "-dr", so a random walk is sometimes cut short when it reaches a node with no out-neighbors (see node 11, for example). I think the implementation adds node 0 as an alias for a "super" sink node. If you pass "-ow" as an argument, you'll see that all the random walks end at 0. As expected, this doesn't happen when you treat the graph as undirected; in that case you get embeddings for nodes 1 to 34 only.
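To make the explanation above concrete, here is a minimal sketch of how a truncated walk could end up with trailing 0 entries. It is a toy model, not SNAP code: the `Graph` alias, the deterministic "first out-neighbor" step, and the assumption that walks are copied into a fixed-width, zero-initialized row are all illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical toy model (not SNAP code): a directed graph as adjacency lists.
using Graph = std::map<int, std::vector<int>>;

// Walk from `start`, always taking the first out-neighbor. This is
// deterministic for illustration only; node2vec samples neighbors at random.
// The walk stops early when it reaches a node with no out-neighbors.
std::vector<int> SimulateWalk(const Graph& g, int start, std::size_t walkLen) {
  std::vector<int> walk{start};
  while (walk.size() < walkLen) {
    auto it = g.find(walk.back());
    if (it == g.end() || it->second.empty()) break;  // dead end: walk is truncated
    walk.push_back(it->second.front());
  }
  return walk;
}

// Copy a walk into a fixed-width row that starts out filled with zeros.
// ASSUMPTION: this mirrors how walks might be stored before training;
// any slot the walk never reaches keeps the value 0.
std::vector<int> StoreWalk(const std::vector<int>& walk, std::size_t walkLen) {
  std::vector<int> row(walkLen, 0);
  for (std::size_t k = 0; k < walk.size() && k < walkLen; ++k) row[k] = walk[k];
  return row;
}
```

With edges 1 -> 2 and 2 -> 3 and a walk length of 5, the walk from node 1 stops at node 3, and the stored row is 1 2 3 0 0. Under this assumption, anything that consumes the rows as token sequences would see "0" as an extra node.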

theoren commented 4 years ago

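A possible fix: update snap-adv/biasedrandomwalk.cpp. Rename the existing SimulateWalk() to a helper (called SimulateWalkImpl() here as a placeholder, since the new name did not survive the formatting of this comment), then add a wrapper under the old name that pads truncated walks with their last node:

    void SimulateWalk(PWNet& InNet, int64 StartNId, const int& WalkLen, TRnd& Rnd, TIntV& WalkV) {
      // Call the renamed original (placeholder name).
      SimulateWalkImpl(InNet, StartNId, WalkLen, Rnd, WalkV);
      if (1) {  // stationary walk: repeat the last node until the walk has WalkLen entries
        int64 Dst = WalkV.Last();
        while (WalkV.Len() < WalkLen) { WalkV.Add(Dst); }
      }
    }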
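The padding idea in this fix can be tried in isolation. The sketch below uses `std::vector` instead of SNAP's `TIntV`, and `PadWalk` is a hypothetical name, so treat it as an illustration of the technique rather than a patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SNAP): repeat the last visited node until
// the walk has walkLen entries, so that no slot is left unfilled.
void PadWalk(std::vector<int>& walk, std::size_t walkLen) {
  if (walk.empty()) return;  // nothing to repeat
  const int last = walk.back();
  while (walk.size() < walkLen) walk.push_back(last);
}
```

A walk 1 2 3 padded to length 5 becomes 1 2 3 3 3, so the walker stays at the dead-end node instead of leaving empty slots. One trade-off to keep in mind is that repeated nodes appear in their own context window during training.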