Closed goaaron closed 5 years ago
There is no straightforward interface for that. The easiest way is to initialize the relation embeddings manually and then follow the fine-tuning procedure.
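A minimal sketch of that manual approach, assuming a PyTorch setup (this is not the library's own interface; the `TransE` class, sizes, and scoring function here are illustrative): seed the entity table from a pretrained matrix, freeze it, and train only the relation embeddings.

```python
import torch
import torch.nn as nn

class TransE(nn.Module):
    """Toy TransE-style scorer with pretrained, frozen entity embeddings."""
    def __init__(self, pretrained_entities, num_relations):
        super().__init__()
        dim = pretrained_entities.shape[1]
        # Seed entity embeddings from the pretrained matrix; freeze=True
        # sets requires_grad=False so fine-tuning only touches relations.
        self.ent = nn.Embedding.from_pretrained(pretrained_entities, freeze=True)
        # Relation embeddings start random and are trainable.
        self.rel = nn.Embedding(num_relations, dim)

    def forward(self, h, r, t):
        # TransE score: -||h + r - t||_2 (higher = more plausible triple)
        return -(self.ent(h) + self.rel(r) - self.ent(t)).norm(p=2, dim=-1)

# Stand-in for pretrained vectors; in practice you would load them from disk.
pretrained = torch.randn(10_000, 100)
model = TransE(pretrained, num_relations=20)
score = model(torch.tensor([0]), torch.tensor([3]), torch.tensor([5]))
```

With this setup, an optimizer built from `model.parameters()` only updates `rel.weight`, since the frozen entity table contributes no trainable parameters.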
Could you describe the case in which you need such an interface? If it is a general need, we will consider it in an update to our interface.
Thanks for that! My input entities are effectively innumerable, so I want to be able to reason over substitutable embeddings rather than hard-coding an entity name/ID in the triplets. I don't know how else I would go about that.
OK, so you have all the entity embeddings already and can't afford to train them jointly. How large is the graph? You are going to split the large graph into small subsets and train on each subset, with fixed entity embeddings per subset and relation embeddings shared across subsets, right?
Yes, I can't retrain those embeddings. I've embedded ~10 million entities, but for now I'm only concerned with a smaller subset on the order of 10k. Eventually I would want to share those relations across the full graph.
Can I assume that you only need to fix the entity embeddings, i.e., set their learning rate to 0? If so, we can add a separate learning-rate multiplier for each kind of embedding.
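In plain PyTorch this can already be approximated with optimizer parameter groups (a sketch under the same assumptions as above; the embedding sizes and learning rates are placeholders): give the entity table a learning rate of 0 and the relation table a normal one.

```python
import torch
import torch.nn as nn

ent = nn.Embedding(10_000, 100)  # entity embeddings (placeholder sizes)
rel = nn.Embedding(20, 100)      # relation embeddings

# Separate parameter groups: lr=0 effectively freezes the entities,
# while the relation embeddings still learn at lr=0.01.
optimizer = torch.optim.SGD([
    {"params": ent.parameters(), "lr": 0.0},
    {"params": rel.parameters(), "lr": 0.01},
])
```

Compared with `requires_grad=False`, this still computes entity gradients (wasted work), but it makes the freeze a pure optimizer setting, which is closer to the "learning rate multiplier per embedding type" idea discussed here.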
I'm actually not too sure. I might want to tune these embeddings with structural rules I add after the fact, but it's unclear whether it would just make sense to retrain fully from scratch down the road when that's feasible. Either way, being able to set a specific learning rate will be useful, and training relation embeddings on top of relatively static entity embeddings will be helpful.
OK. We'll consider separate initialization and learning rates for the different embedding types.
Is there a simple way to seed pretrained entity embeddings before learning the relation embeddings?