Accenture / AmpliGraph

Python library for Representation Learning on Knowledge Graphs https://docs.ampligraph.org
Apache License 2.0

Loading a model from trained params function in EmbeddingModel for large KGs #169

Closed · mhmgad closed this issue 4 years ago

mhmgad commented 4 years ago

Description

When using the function _load_model_from_trained_params() in the EmbeddingModel module for large KGs in order to continue training, the function initializes the ent_emb matrix to batch_size rows, whereas according to the initialization function and the docs it should be double the batch_size. This leads to a problem in the _get_model_loss(self, dataset_iterator) function:

    def _get_model_loss(self, dataset_iterator):
        ........
        if self.dealing_with_large_graphs:
            # Create a dependency to load the embeddings of the batch entities dynamically
            init_ent_emb_batch = self.ent_emb.assign(ent_emb_batch, use_locking=True)
            dependencies.append(init_ent_emb_batch)

The resulting error:

    InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [42959,100] rhs shape= [85918,100]
         [[node Assign (defined at /home/gadelrab/.local/lib/python3.7/site-packages/ampligraph/latent_features/models/EmbeddingModel.py:514) ]]
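For illustration, here is a minimal standalone sketch (not AmpliGraph code; written against TF 2 eager mode for brevity, whereas the stack above uses TF 1 sessions, where the same mismatch surfaces as an InvalidArgumentError) that reproduces the shape mismatch with toy sizes:

    import numpy as np
    import tensorflow as tf

    batch_size, k = 4, 3  # in the traceback above: batch_size=42959, k=100

    # Variable sized as in the actual implementation: batch_size rows.
    ent_emb = tf.Variable(np.zeros((batch_size, k)), dtype=tf.float32)

    # A training batch carries embeddings for subjects AND objects,
    # hence 2 * batch_size rows.
    ent_emb_batch = np.zeros((batch_size * 2, k), dtype=np.float32)

    try:
        ent_emb.assign(ent_emb_batch)  # shapes (4, 3) vs (8, 3) do not match
    except ValueError as err:
        print(err)  # "Cannot assign value to variable ...: Shape mismatch. ..."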

Actual Implementation

    def _load_model_from_trained_params(self):
        .......
        if not self.dealing_with_large_graphs:
            self.ent_emb = tf.Variable(self.trained_model_params[0], dtype=tf.float32)
        else:
            self.ent_emb_cpu = self.trained_model_params[0]
            self.ent_emb = tf.Variable(np.zeros((self.batch_size, self.internal_k)), dtype=tf.float32)

        self.rel_emb = tf.Variable(self.trained_model_params[1], dtype=tf.float32)

Expected Implementation

    def _load_model_from_trained_params(self):
        .......
        if not self.dealing_with_large_graphs:
            self.ent_emb = tf.Variable(self.trained_model_params[0], dtype=tf.float32)
        else:
            self.ent_emb_cpu = self.trained_model_params[0]
            # Double the batch size: a training batch loads embeddings for
            # both subjects and objects.
            self.ent_emb = tf.Variable(np.zeros((self.batch_size * 2, self.internal_k)), dtype=tf.float32)

        self.rel_emb = tf.Variable(self.trained_model_params[1], dtype=tf.float32)
sumitpai commented 4 years ago

Hi @mhmgad. Currently we do not support the "continue training" option in either normal or large graph mode.

As you mentioned, self.ent_emb can be declared with batch_size * 2 rows in evaluation mode. However, it is also okay to keep it at batch_size: unlike in training mode, we only corrupt one side at a time during evaluation, so we can have at most batch_size unique entities in eval mode (we prefetch e_s, e_o and e_p for the test triple at the start, and only e_s_corr OR e_o_corr changes during corruption scoring).
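To make the counting concrete, here is a back-of-the-envelope sketch (plain NumPy with made-up triples and a toy batch_size, not AmpliGraph code):

    import numpy as np

    batch_size = 4
    rng = np.random.default_rng(0)

    # Training: each triple in a batch contributes a subject and an object,
    # so a batch can reference up to 2 * batch_size distinct entities.
    train_batch = rng.integers(0, 1000, size=(batch_size, 3))  # columns: s, p, o
    assert np.unique(train_batch[:, [0, 2]]).size <= 2 * batch_size

    # Evaluation: e_s, e_p, e_o of the test triple are prefetched once; a batch
    # of corruptions then varies only one side (here the subject), so at most
    # batch_size new unique entities are needed in the embedding buffer.
    e_s_corr = rng.integers(0, 1000, size=batch_size)
    assert np.unique(e_s_corr).size <= batch_size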

mhmgad commented 4 years ago

Hi @sumitpai. Yes, I also thought that you may not need to double the size of the embedding matrix during testing, but doubling it seems more convenient for code consistency, especially since that is how it is explained in the documentation.

I ran into this problem while extending the EmbeddingModel module to support continuous learning, which actually turned out to be easy (thanks for your clean code). I therefore thought it might be useful to highlight it.

sumitpai commented 4 years ago

Yeah, I guess we will update our code for consistency. We will fix this in our next release.