DeepGraphLearning / graphvite

GraphVite: A General and High-performance Graph Embedding System
https://graphvite.io
Apache License 2.0

Could you explain what are vertex matrix and context matrix? #41

Open HenryYihengXu opened 4 years ago

HenryYihengXu commented 4 years ago

I'm reading your paper and have a question about section 3.2. Could you explain what are the vertex matrix and context matrix? Are they simply the source and the destination of edges?

KiddoZhu commented 4 years ago

Yes. Every node has a vertex embedding and a context embedding. The vertex embedding is used when the node acts as a source, and the context embedding is used when it acts as a destination.
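As a minimal sketch of this setup (illustrative, not GraphVite's actual code; a LINE/skip-gram-style dot-product score is assumed):

```python
import numpy as np

num_nodes, dim = 1000, 128
rng = np.random.default_rng(0)
vertex = rng.normal(scale=0.1, size=(num_nodes, dim))   # row i: node i in its source role
context = rng.normal(scale=0.1, size=(num_nodes, dim))  # row i: node i in its destination role

def edge_score(u, v):
    # Score of directed edge u -> v: the source's vertex row dotted
    # with the target's context row.
    return vertex[u] @ context[v]

print(edge_score(3, 17))
```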

HenryYihengXu commented 4 years ago

Are you saying each node will have two embedding vectors? So the number of embeddings will be twice the number of nodes?

I don't quite understand Figure 2 in your GraphVite paper. In the figure, you have 4 GPUs and you divide the sample pool into a 4x4 grid. Grid (i, j) contains edges whose source is in vertex partition i and whose destination is in context partition j, right? So what is the episode size here? Does it indicate that it takes many episodes to compute a single grid?

HenryYihengXu commented 4 years ago

And with your sampling method, each grid could contain a different number of edges, right?

KiddoZhu commented 4 years ago

For the embedding, yes.

The episode size is the number of batches in a grid, e.g. 500 for the YouTube dataset. Each grid contains *episode size* × *batch size* positive samples. This is mainly because a sufficiently large grid is needed to reduce swaps of embedding parameters across GPUs.
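An illustrative sketch of the grid scheme from Figure 2 (this is not GraphVite's actual code; the partitioning function and batch size below are assumptions):

```python
NUM_GPUS = 4

def partition(node_id):
    # Hypothetical partitioning by id; GraphVite's real scheme may differ.
    return node_id % NUM_GPUS

def grid_cell(u, v):
    # Edge (u, v) lands in cell (source partition, destination partition).
    return partition(u), partition(v)

# One episode processes one cell: episode_size batches of batch_size
# positive samples, so each cell contributes episode_size * batch_size
# samples before embedding parameters are swapped across GPUs.
episode_size = 500   # the value mentioned above for the YouTube dataset
batch_size = 10_000  # illustrative; see GraphVite's config for real defaults
samples_per_cell = episode_size * batch_size
print(grid_cell(12345, 678), samples_per_cell)
```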

razrLeLe commented 4 years ago

@KiddoZhu What's the purpose of setting two embedding vectors for each node?

KiddoZhu commented 4 years ago

One vector is called vertex representation, the other is called context representation. This is mainly borrowed from the distributional hypothesis in NLP.

Empirically it works better than tied vertex and context parameters.
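To see one concrete difference (my sketch, not the paper's notation): tying the two matrices forces the edge score to be symmetric, while separate vertex and context matrices can score directed edges asymmetrically.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, dim = 100, 16
emb = rng.normal(size=(num_nodes, dim))      # tied: one matrix for both roles
vertex = rng.normal(size=(num_nodes, dim))   # untied: source role
context = rng.normal(size=(num_nodes, dim))  # untied: destination role

u, v = 3, 7
tied = emb[u] @ emb[v]           # always equals emb[v] @ emb[u]
untied = vertex[u] @ context[v]  # generally differs from vertex[v] @ context[u]
```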

HJW3536 commented 3 years ago

> One vector is called vertex representation, the other is called context representation. This is mainly borrowed from the distributional hypothesis in NLP.
>
> Empirically it works better than tied vertex and context parameters.

Hi, when I use them, how do I choose between the vertex embeddings and the context embeddings?

chi2liu commented 2 years ago

> One vector is called vertex representation, the other is called context representation. This is mainly borrowed from the distributional hypothesis in NLP. Empirically it works better than tied vertex and context parameters.
>
> Hi, when I use them, how do I choose between the vertex embeddings and the context embeddings?

Vertex embeddings are better.
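As a hedged sketch of what that looks like downstream (the file name is hypothetical; the context matrix is typically kept only as auxiliary training parameters):

```python
import numpy as np

# Hypothetical array saved after training; GraphVite's own save format may differ.
vertex = np.load("vertex_embeddings.npy")  # shape (num_nodes, dim)

def cosine_similarity(u, v):
    # Compare two nodes using only their vertex rows.
    a, b = vertex[u], vertex[v]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```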