aditya-grover / node2vec

http://snap.stanford.edu/node2vec/
MIT License

The first walk path problem in spark-version #37

Open datazhen opened 6 years ago

datazhen commented 6 years ago

The first walk step of each node never changes after it is initialized in Node2Vec.initTransitionProb.

    graph = Graph(indexedNodes, indexedEdges)
      .mapVertices[NodeAttr] { case (vertexId, clickNode) =>
        val (j, q) = GraphOps.setupAlias(clickNode.neighbors)
        val nextNodeIndex = GraphOps.drawAlias(j, q)
        clickNode.path = Array(vertexId, clickNode.neighbors(nextNodeIndex)._1)

        clickNode
      }

As a result, for a given node every walk begins with the same first step, because each iteration of the walk loop copies the stored path instead of drawing a new neighbor.


    for (iter <- 0 until config.numWalks) {
      var prevWalk: RDD[(Long, ArrayBuffer[Long])] = null
      var randomWalk = graph.vertices.map { case (nodeId, clickNode) =>
        val pathBuffer = new ArrayBuffer[Long]()
        pathBuffer.append(clickNode.path:_*)
        (nodeId, pathBuffer) 
      }.cache

This matters most with directed=true, where the edges are so sparse that a walk rarely comes back to its start node.
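To make the effect concrete, here is a minimal sketch in plain Scala with no Spark dependency. It is not the repository's code: `Node`, `drawNeighbor`, `initOnce`, `startsFromStoredPath`, and `startsRedrawn` are hypothetical names, and a uniform draw stands in for `GraphOps.setupAlias` / `GraphOps.drawAlias`. It only illustrates the pattern described above, where the first step is drawn once and then copied by every walk, next to one possible alternative that re-draws it per walk.

```scala
import scala.util.Random

object FirstStepDemo {
  // Hypothetical stand-in for NodeAttr: the neighbor list plus the stored path prefix.
  final case class Node(id: Long, neighbors: Array[Long], path: Array[Long])

  // Assumption: a uniform draw replaces the alias sampling used in the real code.
  def drawNeighbor(neighbors: Array[Long], rng: Random): Long =
    neighbors(rng.nextInt(neighbors.length))

  // Mirrors the reported pattern: the first step is drawn once and stored in the node.
  def initOnce(id: Long, neighbors: Array[Long], rng: Random): Node =
    Node(id, neighbors, Array(id, drawNeighbor(neighbors, rng)))

  // Each walk copies the stored path, so the first step is identical across walks.
  def startsFromStoredPath(node: Node, numWalks: Int): Seq[Long] =
    (0 until numWalks).map(_ => node.path(1))

  // One possible alternative: draw a fresh first step inside each walk iteration.
  def startsRedrawn(node: Node, numWalks: Int, rng: Random): Seq[Long] =
    (0 until numWalks).map(_ => drawNeighbor(node.neighbors, rng))
}
```

With a node that has several neighbors, `startsFromStoredPath` returns the same neighbor for every walk, while `startsRedrawn` will almost certainly return more than one distinct neighbor over a few dozen walks. Whether re-drawing inside the loop is the right fix for the Spark version is a judgment call for the maintainers.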

celia01 commented 5 years ago

GraphOps.drawAlias(j, q) uses a random function, so why does the first walk step never change?

wl142857 commented 5 years ago

I found this problem too. The first walk will not change after initialization.