TimDettmers / ConvE

Convolutional 2D Knowledge Graph Embeddings resources
MIT License

Exporting the learned embeddings #16

Closed: bdhingra closed this issue 6 years ago

bdhingra commented 6 years ago

Hi,

Thanks for open-sourcing the code! I am interested in exporting the embeddings learned by ConvE to use with another task. Is there a straightforward way to export the mapping from entity / relation IDs to the learned embeddings?

Thanks, Bhuwan Dhingra

wiseodd commented 6 years ago

Not the author but I've been working with this code for quite a while.

The entity/relation -> index mapping is in the vocabulary files ~/.data/{dataset}/vocab_{e1, rel}. You can then load the model checkpoint from the saved_models dir in PyTorch and take the emb_e and emb_r weights. Row i of each weight matrix is the embedding of the token with index i, so the index from the vocabulary gives you the embedding.
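To make the export step concrete, here is a minimal sketch that writes one line per token with its embedding values. It uses toy data so it runs without a checkpoint; the function name export_embeddings, the TSV layout, and the checkpoint key mentioned in the comment are my assumptions, not something taken from the ConvE code.

```python
import csv
import io


def export_embeddings(token2idx, weight_rows, out_file):
    """Write one TSV row per token: the token, then its embedding values."""
    writer = csv.writer(out_file, delimiter="\t", lineterminator="\n")
    # Sort by index so the output order matches the rows of the weight matrix.
    for token, idx in sorted(token2idx.items(), key=lambda kv: kv[1]):
        writer.writerow([token] + [repr(float(x)) for x in weight_rows[idx]])


# Toy stand-ins. With a real checkpoint, weight_rows could come from something like
#   torch.load(checkpoint_path, map_location="cpu")["emb_e.weight"].tolist()
# (the key name is an assumption; print the checkpoint's keys to confirm it).
toy_token2idx = {"/m/paris": 0, "/m/france": 1}
toy_weights = [[0.1, -0.2, 0.3], [0.0, 0.5, -1.0]]

buffer = io.StringIO()
export_embeddings(toy_token2idx, toy_weights, buffer)
exported = buffer.getvalue()
print(exported)
```

Passing an open file instead of the StringIO buffer would write the same rows to disk. Sorting by index is just a convenience so that line k of the output corresponds to row k of the matrix.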

TimDettmers commented 6 years ago

Thank you @wiseodd, this is exactly the way you would do it.

More specifically, if you want to use the Pipeline class, you can create a Pipeline for the respective dataset and read in the vocabulary:

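    from spodernet.preprocessing.pipeline import Pipeline
    from spodernet.utils.global_config import Config

    input_keys = ['e1', 'rel', 'rel_eval', 'e2', 'e2_multi1', 'e2_multi2']
    p = Pipeline(Config.dataset, keys=input_keys)
    p.load_vocabs()  # read in the saved vocabularies for this dataset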

Now you can access the vocabulary object via vocab = p['vocab'][VARIABLE_NAME], where VARIABLE_NAME is one of the keys above (e.g. 'e1' or 'rel'), and read the mappings vocab.token2idx and vocab.idx2token.

You can also directly read the vocab object from the respective file by using pickle:

    import pickle
    token2idx, idx2token, label2idx, idx2label = pickle.load(open(path_to_vocab_file, 'rb'))
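As a rough sketch of the pickle route, the snippet below wraps the load in a small helper and checks that the two token mappings are inverses of each other. It builds a toy vocab file first so it runs on its own; the helper name load_vocab and the toy contents are hypothetical, and the only thing assumed about the real file is the 4-tuple layout shown above.

```python
import os
import pickle
import tempfile


def load_vocab(path):
    """Load the four mappings, assuming the file holds a pickled 4-tuple."""
    with open(path, "rb") as f:  # the context manager closes the file handle
        token2idx, idx2token, label2idx, idx2label = pickle.load(f)
    return token2idx, idx2token, label2idx, idx2label


# Build a toy vocab file with the same 4-tuple layout so the sketch runs anywhere.
toy = ({"a": 0, "b": 1}, {0: "a", 1: "b"}, {}, {})
with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, "vocab_e1")
    with open(toy_path, "wb") as f:
        pickle.dump(toy, f)
    token2idx, idx2token, _, _ = load_vocab(toy_path)

# The two token mappings should be inverses of each other.
round_trip_ok = all(idx2token[i] == t for t, i in token2idx.items())
print(token2idx["b"], idx2token[0], round_trip_ok)
```

For a real run you would call load_vocab on the actual vocab file path instead of the temporary one.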

Please let me know if you have any more questions.