facebookresearch / PyTorch-BigGraph

Generate embeddings from large-scale graph-structured data.
https://torchbiggraph.readthedocs.io/

Are node embeddings between graphs comparable/mapped to the same space? #153


mikelgjergjiuri commented 4 years ago

We are attempting to produce node embeddings for several large graphs using PyTorch-BigGraph, and we want to cluster the embedded nodes in a downstream task.

If we train on one graph, and then train on another graph separately, am I correct in assuming that the resulting node embeddings are not comparable to one another?

In other words, two highly similar nodes from separately trained graphs do not necessarily lie near each other in embedding space.

lw commented 4 years ago

Your understanding is correct. For one, the model is invariant under rotation of the embeddings, so even training on the same graph twice until convergence may produce two models that are effectively identical (same scores, same angles between all pairs of embeddings) but rotated with respect to each other, so the two embeddings of a given node may be arbitrarily different.
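
A minimal way to see this rotation invariance concretely (a NumPy sketch for illustration, not PBG code): apply a random orthogonal matrix to every embedding and check that all pairwise dot-product scores are unchanged, even though every individual embedding has moved.

```python
import numpy as np

rng = np.random.default_rng(0)
emb = rng.standard_normal((1000, 64))       # embeddings from "run 1"

# Random orthogonal matrix via QR decomposition of a Gaussian matrix.
Q, _ = np.linalg.qr(rng.standard_normal((64, 64)))
emb_rot = emb @ Q                           # the "same" model, rotated

scores = emb @ emb.T                        # pairwise dot-product scores
scores_rot = emb_rot @ emb_rot.T
print(np.allclose(scores, scores_rot))      # True: all scores identical

# Yet the coordinates of each individual embedding changed arbitrarily:
print(np.linalg.norm(emb - emb_rot, axis=1).mean())  # large
```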

mikelgjergjiuri commented 4 years ago

Thank you for your reply.

So this would also apply even if I am evaluating on a new graph, correct? (Load a saved model, set the learning rate to 0, and run a few epochs on the new graph.) The node embeddings only maintain proper distances to one another but do not maintain the same orientation as in the original graph.
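
(For reference, a hedged sketch of what that procedure might look like as a config, following the `get_torchbiggraph_config` format used in the PBG examples; the paths and graph structure here are hypothetical, and keys such as `init_path` should be checked against the docs for your PBG version.)

```python
def get_torchbiggraph_config():
    return dict(
        # Hypothetical paths for the new graph's data and output.
        entity_path="data/new_graph",
        edge_paths=["data/new_graph/edges"],
        checkpoint_path="model/new_graph",
        # Start from the checkpoint trained on the original graph.
        init_path="model/old_graph",
        # Hypothetical single-entity-type, single-relation graph.
        entities={"node": {"num_partitions": 1}},
        relations=[{"name": "link", "lhs": "node", "rhs": "node",
                    "operator": "none"}],
        dimension=128,
        comparator="dot",
        num_epochs=5,
        lr=0.0,  # lr=0 as described above: embeddings are loaded but not updated
    )
```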

lw commented 4 years ago

Not sure I follow. If you run a completely independent training run on a new, different graph then, yes, the above also applies and you won't be able to compare individual embeddings with those of the other graph.

However, you said "load a saved model", and that could change things. If your new graph shares many entities with the old ones, you could take the checkpoint produced by training on the old graph, extract the embeddings of the entities they have in common, produce an "initial" checkpoint for the new graph (initializing new entities to a random vector) and use it to bootstrap training. If a few conditions apply (the graph has the same structure as the old one, there are only a few new entities, ...) and if you're a bit lucky, the old entities that you ported over will "anchor" the new model and it will end up converging to something comparable to the old one.