kunwuz opened this issue 4 years ago (status: Open)
My understanding is that the model and embeddings are saved in the `model` folder by default, and updated every epoch. Theoretically you can replace them and coax the algorithm to pick up from there to continue the training. I believe the code is in the downstream task part of the documentation.
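As a minimal sketch of what "replacing it" could look like, assuming the FB15k example's checkpoint layout (the file name `embeddings_all_0.v50.h5` and the `embeddings` dataset name are assumptions here; check the files in your own checkpoint directory):

```python
import h5py
import numpy as np

# Hypothetical entity-embedding checkpoint from the FB15k example; adjust the
# path, entity type ("all"), partition ("0"), and version ("v50") to your run.
ckpt_path = "model/fb15k/embeddings_all_0.v50.h5"

with h5py.File(ckpt_path, "r+") as hf:
    embeddings = hf["embeddings"][...]  # (num_entities, dim) array
    print(embeddings.shape)
    # Overwrite with your own initial values (same shape and dtype)
    # before resuming training from this checkpoint.
    hf["embeddings"][...] = np.random.randn(*embeddings.shape).astype(embeddings.dtype)
```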
Thanks for your timely reply! If my understanding of the documentation is correct, are the relation embeddings stored in the operator's state dict?
```python
import h5py
import torch
from torchbiggraph.model import ComplexDiagonalDynamicOperator, DotComparator

# Load the operator's state dict from the checkpoint
with h5py.File("model/fb15k/model.v50.h5", "r") as hf:
    operator_state_dict = {
        "real": torch.from_numpy(hf["model/relations/0/operator/rhs/real"][...]),
        "imag": torch.from_numpy(hf["model/relations/0/operator/rhs/imag"][...]),
    }

operator = ComplexDiagonalDynamicOperator(400, dynamic_rel_count)
operator.load_state_dict(operator_state_dict)
comparator = DotComparator()
```
If so, is there any way to set an initial embedding for those relations? Like the way PBG does it with 'featurized entities'?
Hi @kunwuz. @dany-nonstop is correct about where the relation embeddings "live" in checkpoints. We didn't design PBG to make it easy to set initial relation embeddings, and I'm not exactly sure what the use case for this is. If you just want the simplest/hackiest change to accomplish this, I would suggest something like adding a couple of lines here:
https://github.com/facebookresearch/PyTorch-BigGraph/blob/master/torchbiggraph/train_cpu.py#L440
e.g.
```python
with torch.no_grad():  # in-place copy_ into leaf parameters must not be tracked by autograd
    for relation_idx, operator in enumerate(model.rhs_operators):  # maybe lhs_operators too, depending on what you're running
        relation_params = list(operator.parameters())  # one or two tensors, depending on which operator it is
        relation_params[0].copy_(initial_relation_params[relation_idx])
```
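Note that `initial_relation_params` is not an existing PBG variable; it is whatever list of pretrained tensors you supply yourself. A minimal sketch of how it might be built, assuming one pretrained vector per relation stored in a NumPy file (the filename is hypothetical):

```python
import numpy as np
import torch

# Hypothetical (num_relations, dim) array of pretrained relation parameters,
# produced however you obtained your pretrained embeddings.
pretrained = np.load("pretrained_relation_params.npy")
initial_relation_params = [torch.from_numpy(row).float() for row in pretrained]
```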
Details about the names of all the subfields inside of `model` can be found here:
https://github.com/facebookresearch/PyTorch-BigGraph/blob/master/torchbiggraph/model.py#L775
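To see those names for your own checkpoint, you can simply walk the HDF5 file (a small sketch reusing the checkpoint path from the snippet above):

```python
import h5py

with h5py.File("model/fb15k/model.v50.h5", "r") as hf:
    # Print every group/dataset path in the checkpoint, e.g. the
    # model/relations/<idx>/operator/... entries used above.
    hf.visit(print)
```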
P.S. This is not what "featurized entities" do. A featurized entity is represented as the average of the embeddings of a list of "features": e.g. you could have an embedding for each word, and then represent a "wikipedia page" entity as the average of the embeddings of all the words on that page.
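As a toy illustration of that averaging (made-up words and vectors, not PBG code):

```python
import torch

# A "featurized" entity is represented as the average of its feature embeddings.
word_embeddings = {
    "graph": torch.tensor([0.1, 0.3]),
    "neural": torch.tensor([0.5, -0.2]),
    "network": torch.tensor([0.0, 0.4]),
}

page_words = ["graph", "neural", "network"]  # features of a "wikipedia page" entity
page_embedding = torch.stack([word_embeddings[w] for w in page_words]).mean(dim=0)
print(page_embedding)  # tensor([0.2000, 0.1667])
```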
It's important for some knowledge graph research to obtain the relation embeddings. For example, in some cases we need to start from pre-trained relation embeddings instead of learning them from scratch. However, I haven't found this in the documentation or code yet. Could you please give me some instructions on how to explicitly obtain the relation embeddings?