Open XingzhiZhou opened 5 years ago
After reading the code, I noticed this too, so I have the same question. Could the author please clarify this for us? Thank you.
Thanks for your question. The implementation here is inspired by word2vec, which optimizes the inner product of each pair of words that fall within a window. If we optimized the relevance probability directly, the inner-product terms in its denominator would likely decrease, which is not what we want, since those terms also correspond to ground-truth edges in the graph.
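To make the point about the denominator concrete, here is a minimal sketch, assuming the relevance probability is a softmax over inner products with a node's neighbors. The scores and the helper names `softmax` and `grad_log_prob` are hypothetical and not taken from the repository. The gradient of the log-probability of one neighbor is positive for that neighbor's score and negative for every other score in the denominator, so maximizing it pushes the other inner products down.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def grad_log_prob(scores, target):
    """Gradient of log softmax(scores)[target] with respect to each score."""
    probs = softmax(scores)
    return [(1.0 if i == target else 0.0) - p for i, p in enumerate(probs)]

# Hypothetical inner products between a node and three of its neighbors.
scores = [0.5, 0.2, -0.1]
grad = grad_log_prob(scores, target=0)

# The target's score is pushed up; the other neighbors' scores are pushed down.
print(grad)
```

All three neighbors in this toy case are true edges, yet two of them receive a negative gradient, which is the effect the reply describes.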
In the paper, the update in the G steps follows formula 4, in which the generative probability is defined as the product of several probabilities along the path from node v_c to node v. In the program, however, only the nodes within the window are considered when updating the generator. I am wondering whether this approximation is plausible.
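For reference, the path-based definition the question refers to can be sketched as follows. This is a simplified illustration, not the paper's exact formula: it multiplies only the forward transition probabilities along the path, and the `transition_prob` table and `path_log_prob` helper are hypothetical.

```python
import math

def path_log_prob(path, transition_prob):
    """Sum of log transition probabilities over consecutive nodes of the path."""
    return sum(math.log(transition_prob[(u, v)]) for u, v in zip(path, path[1:]))

# Hypothetical transition probabilities along the path c -> a -> v.
transition_prob = {("c", "a"): 0.5, ("a", "v"): 0.4}

log_p = path_log_prob(["c", "a", "v"], transition_prob)
prob = math.exp(log_p)  # product of the per-step probabilities
print(prob)
```

Under this definition every step on the path contributes a factor, whereas a windowed update only touches pairs of nodes that lie close together on the path.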
If we used only the sampled node and ignored the path leading to it, the efficiency would be quite low.
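The efficiency argument can be illustrated with a small sketch of word2vec-style windowing over one sampled path. The function name `window_pairs`, the example path, and the window size are assumptions made for illustration and may differ from the repository's actual code.

```python
def window_pairs(path, window_size):
    """Return (center, context) pairs for nodes within window_size steps on the path."""
    pairs = []
    for i, center in enumerate(path):
        lo = max(i - window_size, 0)
        hi = min(i + window_size + 1, len(path))
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, path[j]))
    return pairs

# Hypothetical sampled path from the root node 0 to the sampled node 9.
path = [0, 3, 7, 9]

endpoint_only = [(path[0], path[-1])]        # one training pair per sampled path
windowed = window_pairs(path, window_size=2)  # many training pairs per sampled path

print(len(endpoint_only), len(windowed))
```

With the same sampling cost, the windowed variant yields several training pairs instead of one, which is one way to read the efficiency concern raised above.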