DeepGraphLearning / graphvite

GraphVite: A General and High-performance Graph Embedding System
https://graphvite.io
Apache License 2.0

possible to seed with pretrained embeddings? #24

Closed goaaron closed 5 years ago

goaaron commented 5 years ago

Is there a simple way to seed a pretrained entity encoding before learning relation embeddings?

KiddoZhu commented 5 years ago

There is no straightforward interface for that. The easiest way is to manually initialize the relation embeddings, and then follow the fine-tune procedure.

Could you describe the use case in which you need such an interface? If it is general enough, we will consider adding it in a future update of our interface.
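For readers following along, manual initialization boils down to writing the pretrained vectors into the matching rows of an embedding matrix before training starts. Below is a minimal NumPy sketch of that step only. The function name `seed_embeddings` and its parameters are hypothetical and are not part of GraphVite's API; how the resulting matrix is handed to the solver is not shown here and depends on the library version.

```python
import numpy as np

def seed_embeddings(names, pretrained, dim, scale=0.1, seed=0):
    """Build an embedding matrix whose row i belongs to names[i].

    Rows with a pretrained vector are copied from `pretrained`
    (a dict mapping name -> vector); all other rows keep a small
    random initialization. Hypothetical helper, not a GraphVite call.
    """
    rng = np.random.default_rng(seed)
    matrix = rng.uniform(-scale, scale, size=(len(names), dim)).astype(np.float32)
    for row, name in enumerate(names):
        vec = pretrained.get(name)
        if vec is not None:
            matrix[row] = vec  # overwrite the random row with the pretrained one
    return matrix

# Usage: "b" has no pretrained vector, so its row stays random.
names = ["a", "b", "c"]
pretrained = {"a": np.ones(4), "c": np.zeros(4)}
initial = seed_embeddings(names, pretrained, dim=4)
```

The row order must follow whatever name-to-id mapping the training code uses, otherwise the vectors end up attached to the wrong entities.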

goaaron commented 5 years ago

Thanks for that! My input entities are effectively innumerable, so I want to be able to reason over substitutable embeddings rather than hard-coding an entity name/id in the triplets. I don't know how else I would go about that.

KiddoZhu commented 5 years ago

Ok. So you have all entity embeddings and you can't afford training them together. How large is that? You are going to split the large graph into small subsets, and you would like to train on each subset, with fixed entity embeddings for each subset and shared relation embeddings across subsets, right?
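A rough sketch of that setup, for illustration only: each subset gathers just the entity rows it touches, remaps global ids to local rows, and updates one relation matrix that is shared across all subsets. It uses plain NumPy with a simplified squared TransE-style residual and no negative sampling; `localize` and `train_relations_on_subsets` are hypothetical names, not GraphVite functions.

```python
import numpy as np

def localize(triplets, all_entity):
    """Gather the entity rows a subset uses and remap its ids to local rows."""
    triplets = np.asarray(triplets)
    # Unique global entity ids over the head and tail columns.
    used, inverse = np.unique(triplets[:, [0, 2]], return_inverse=True)
    local = triplets.copy()
    local[:, [0, 2]] = inverse.reshape(-1, 2)
    # Fancy indexing copies the rows, so all_entity itself is never modified.
    return all_entity[used], local

def train_relations_on_subsets(all_entity, relation, subsets, lr=0.1, epochs=20):
    """Update the shared relation matrix in place; entity vectors stay fixed."""
    for _ in range(epochs):
        for triplets in subsets:
            entity, local = localize(triplets, all_entity)
            for h, r, t in local:
                diff = entity[h] + relation[r] - entity[t]  # TransE residual
                relation[r] -= lr * 2.0 * diff              # gradient of ||diff||^2
    return relation

rng = np.random.default_rng(0)
all_entity = rng.normal(size=(100, 4))      # stand-in for pretrained embeddings
relation = np.zeros((2, 4))                 # shared across every subset
subsets = [[(3, 0, 50), (50, 1, 7)], [(90, 0, 12)]]
train_relations_on_subsets(all_entity, relation, subsets)
```

Because only the referenced rows are gathered per subset, the full entity matrix could in principle live on disk (for example as a memory-mapped array) while each subset fits in memory.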

goaaron commented 5 years ago

Yes, I can't retrain those embeddings. I've embedded ~10 million entities, but for now I am only concerned with a smaller subset on the order of 10k. Eventually I would want to share those relations across the full graph.

KiddoZhu commented 5 years ago

Can I assume that you only need to fix the entity embeddings, i.e. set lr to 0 for them? If so, we can add a separate learning rate multiplier for each kind of embedding.
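To make the idea concrete, here is a minimal NumPy sketch of per-embedding learning rate multipliers, assuming a simplified squared TransE-style loss on positive triplets only. The names `transe_sgd_step`, `entity_lr_mult` and `relation_lr_mult` are illustrative assumptions, not an existing GraphVite interface. A multiplier of 0 freezes that kind of embedding, while a small non-zero value would allow gentle fine-tuning.

```python
import numpy as np

def transe_sgd_step(entity, relation, triplets, lr=0.1,
                    entity_lr_mult=0.0, relation_lr_mult=1.0):
    """One SGD pass minimizing ||h + r - t||^2 over positive triplets.

    Each kind of embedding gets its own multiplier on the base lr.
    Simplified on purpose: no negative sampling, no margin.
    """
    for h, r, t in triplets:
        diff = entity[h] + relation[r] - entity[t]
        grad = 2.0 * diff  # d/dh and d/dr of ||diff||^2; d/dt is -grad
        entity[h] -= lr * entity_lr_mult * grad
        entity[t] += lr * entity_lr_mult * grad
        relation[r] -= lr * relation_lr_mult * grad

rng = np.random.default_rng(0)
entity = rng.normal(size=(5, 4))   # stand-in for pretrained entity embeddings
relation = np.zeros((2, 4))        # relation embeddings to be learned
triplets = [(0, 0, 1), (2, 1, 3), (1, 0, 4)]

for _ in range(50):
    # entity_lr_mult=0.0 keeps the entity embeddings fixed.
    transe_sgd_step(entity, relation, triplets, entity_lr_mult=0.0)
```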

goaaron commented 5 years ago

I'm actually not too sure. I think I might want to be able to tune these embeddings with structural rules that I add after the fact, but it is unclear whether it makes more sense to simply retrain everything from scratch down the road, once that is feasible. Either way, I think being able to set a specific lr will be useful, and learning relation embeddings on top of relatively static entity embeddings will be helpful.

KiddoZhu commented 5 years ago

Ok. We'll consider separate initialization and lr for different embeddings.