dice-group / Convolutional-Complex-Knowledge-Graph-Embeddings


IndexError at testing #5

Closed abdullahfathi closed 3 years ago

abdullahfathi commented 3 years ago

Hello, I am using your model and it works fine on one dataset, but when I try it on another dataset I get this error:

2021-01-05 20:49:09,619 - ConEx_0_target - INFO - Cuda available:False
2021-01-05 20:49:09,984 - ConEx_0_target - INFO - Info pertaining to dataset:{'dataset': 'KGs/target/', 'dataset_augmentation': True, 'train_plus_valid': True, 'tail_pred_constraint': False}
2021-01-05 20:49:09,984 - ConEx_0_target - INFO - Number of triples in training data:10235176
2021-01-05 20:49:09,984 - ConEx_0_target - INFO - Number of triples in validation data:0
2021-01-05 20:49:09,984 - ConEx_0_target - INFO - Number of triples in testing data:2558794
2021-01-05 20:49:09,984 - ConEx_0_target - INFO - Number of entities:947084
2021-01-05 20:49:09,985 - ConEx_0_target - INFO - Number of relations:4020
2021-01-05 20:49:09,985 - ConEx_0_target - INFO - HyperParameter Settings:{'batch_size': 1024, 'decay_rate': None, 'embedding_dim': 400, 'feature_map_dropout': 0.4, 'hidden_dropout': 0.3, 'input_dropout': 0.3, 'kernel_size': 3, 'label_smoothing': 0.1, 'learning_rate': 0.001, 'num_of_epochs': 2000, 'num_of_output_channels': 16, 'num_workers': 4, 'scoring_technique': 'KvsAll', 'train_plus_valid': True, 'norm_flag': False, 'num_entities': 947084, 'num_relations': 4020, 'dataset': 'KGs/target/', 'dataset_augmentation': True, 'tail_pred_constraint': False}
2021-01-05 20:49:21,536 - ConEx_0_target - INFO - ConEx starts training
2021-01-05 20:49:21,536 - ConEx_0_target - INFO - 'Number of free parameters: 781368992
2021-01-05 20:49:21,536 - ConEx_0_target - INFO - k_vs_all_training_schema starts
Traceback (most recent call last):
  File "search.py", line 38, in <module>
    experiment.train_and_eval()
  File "/home/a/afaahmed/profiles/unix/cs/nellie/100datasetexp/Convolutional-Complex-Knowledge-Graph-Embeddings/util/experiment.py", line 175, in train_and_eval
    model = self.k_vs_all_training_schema(model)
  File "/home/a/afaahmed/profiles/unix/cs/nellie/100datasetexp/Convolutional-Complex-Knowledge-Graph-Embeddings/util/experiment.py", line 282, in k_vs_all_training_schema
    loss = model.forward_head_and_loss(e1_idx, r_idx, targets)
  File "/home/a/afaahmed/profiles/unix/cs/nellie/100datasetexp/Convolutional-Complex-Knowledge-Graph-Embeddings/models/complex_models.py", line 161, in forward_head_and_loss
    return self.loss(self.forward_head_batch(e1_idx=e1_idx, rel_idx=rel_idx), targets)
  File "/home/a/afaahmed/profiles/unix/cs/nellie/100datasetexp/Convolutional-Complex-Knowledge-Graph-Embeddings/models/complex_models.py", line 135, in forward_head_batch
    emb_rel_real = self.bn_rel_real(self.emb_rel_real(rel_idx))
  File "/upb/users/a/afaahmed/profiles/unix/cs/miniconda3/envs/quat/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/upb/users/a/afaahmed/profiles/unix/cs/miniconda3/envs/quat/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 114, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/upb/users/a/afaahmed/profiles/unix/cs/miniconda3/envs/quat/lib/python3.6/site-packages/torch/nn/functional.py", line 1724, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self

Any idea? Thanks

Demirrr commented 3 years ago

Hello @abdullahfathi ,

The IndexError appears to occur in self.emb_rel_real(rel_idx). This implies that self.emb_rel_real(.) received an index that is larger than the number of relations the embedding layer was initialized with. I would suggest catching this error and verifying whether this assumption holds. I would expect you to catch an index that is greater than the logged INFO - Number of relations:4020.

Finally, I would suggest ensuring that both datasets have the same format.
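The suggested check could look roughly like the following sketch (toy sizes, not the repository's code): feed an embedding layer an index that is at least num_embeddings, catch the resulting IndexError, and report the offending values.

```python
import torch
import torch.nn as nn

num_relations = 5          # stands in for the 4020 relations logged above
emb_rel_real = nn.Embedding(num_relations, 4)

rel_idx = torch.tensor([0, 3, 7])  # 7 is out of range for this layer
try:
    emb_rel_real(rel_idx)
except IndexError:
    # Identify which indices violate the embedding's valid range [0, num_relations)
    bad = rel_idx[rel_idx >= num_relations]
    print(f"indices out of range: {bad.tolist()}")
```

Wrapping the failing forward call in such a try/except pinpoints the offending relation index instead of just the generic "index out of range in self" message.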

abdullahfathi commented 3 years ago

I printed the length of rel_idx by adding print(" rel_idx= ", len(rel_idx)) before that line, and I got this:

2021-01-07 20:11:54,272 - ConEx_0_target - INFO - Loss at 5.th epoch:1.1014062364120036
2021-01-07 20:11:54,427 - ConEx_0_target - INFO - Standard Link Prediction evaluation on Testing Data
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
rel_idx =  1024
Traceback (most recent call last):
  File "search.py", line 38, in <module>
    experiment.train_and_eval()
  File "/home/a/afaahmed/profiles/unix/cs/nellie/100datasetexp/Convolutional-Complex-Knowledge-Graph-Embeddings/util/experiment.py", line 195, in train_and_eval
    self.evaluate_one_to_n(model, self.dataset.test_data, 'Standard Link Prediction evaluation on Testing Data')
  File "/home/a/afaahmed/profiles/unix/cs/nellie/100datasetexp/Convolutional-Complex-Knowledge-Graph-Embeddings/util/experiment.py", line 97, in evaluate_one_to_n
    predictions = model.forward_head_batch(e1_idx=e1_idx, rel_idx=r_idx)
  File "/home/a/afaahmed/profiles/unix/cs/nellie/100datasetexp/Convolutional-Complex-Knowledge-Graph-Embeddings/models/complex_models.py", line 136, in forward_head_batch
    emb_rel_real = self.bn_rel_real(self.emb_rel_real(rel_idx))
  File "/upb/users/a/afaahmed/profiles/unix/cs/miniconda3/envs/quat/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/upb/users/a/afaahmed/profiles/unix/cs/miniconda3/envs/quat/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 114, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/upb/users/a/afaahmed/profiles/unix/cs/miniconda3/envs/quat/lib/python3.6/site-packages/torch/nn/functional.py", line 1724, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self

Please note that the logged relation count for this run is INFO - Number of relations:2054. I hope I am doing this correctly. Thanks
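One plain-Python way to confirm whether the test split is the culprit (hypothetical placeholder triples, not the actual dataset files): if the relation vocabulary is built from the training triples only, any relation that occurs solely in the test triples has no index and no embedding row.

```python
# Toy stand-ins for the parsed (head, relation, tail) triples of each split
train_triples = [("a", "bornIn", "b"), ("c", "livesIn", "d")]
test_triples = [("a", "bornIn", "d"), ("e", "worksAt", "f")]

train_relations = {r for _, r, _ in train_triples}
# Relations seen at test time but never during training -> no learned embedding
unseen = {r for _, r, _ in test_triples} - train_relations
print(unseen)  # → {'worksAt'}
```

If this set is non-empty for the real dataset, looking up those relations at evaluation time would produce exactly the out-of-range index seen in the traceback.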

Demirrr commented 3 years ago

rel_idx is a tensor holding the indexes of the relations in the current mini-batch. Consequently, the length of rel_idx must indeed be 1024, since the batch size was set to 1024 (see HyperParameter Settings:{'batch_size': 1024, ...} in the log above).

As stated in my previous message, you need to find the index of the relation that causes the IndexError.

I would suggest the following steps:

  1. Ensure that your datasets are in the required format. Since the IndexError occurs during testing, I would suggest focusing on the test dataset.
  2. Find the index of the relation causing the IndexError by simply catching the exception.
  3. Look up the respective relation in the index mapping once you have found the offending index in step 2.
  4. Finally, consider the following computation when initializing the model:
self.entity_idxs = {self.dataset.entities[i]: i for i in range(len(self.dataset.entities))}
self.relation_idxs = {self.dataset.relations[i]: i for i in range(len(self.dataset.relations))}
self.kwargs.update({'num_entities': len(self.entity_idxs),'num_relations': len(self.relation_idxs)})
...
ConEx(self.kwargs)
  5. Later, in ConEx(self.kwargs), we use these values as shown below:
self.num_entities = params['num_entities']
self.num_relations = params['num_relations']
...
self.emb_rel_real = nn.Embedding(self.param['num_relations'], self.embedding_dim)  # real
...

I would assume that the IndexError stems from the fact that rel_idx contains a value that is greater than or equal to self.param['num_relations'].
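Steps 2 and 3 above can be sketched together (toy relation list, not the repository's Dataset class): mirror the relation_idxs construction quoted above, invert it, and trace each batch index back to a concrete relation string or flag it as out of range.

```python
# Hypothetical relation vocabulary standing in for self.dataset.relations
relations = ["bornIn", "livesIn", "worksAt"]
relation_idxs = {relations[i]: i for i in range(len(relations))}
# Inverted mapping: index -> relation string, for tracing offending indices
idx_to_relation = {i: r for r, i in relation_idxs.items()}

num_relations = len(relation_idxs)
rel_idx_batch = [0, 2, 5]  # 5 has no entry, so nn.Embedding would fail on it
for i in rel_idx_batch:
    label = idx_to_relation.get(i, "<no such relation>")
    print(i, label, "OK" if i < num_relations else "OUT OF RANGE")
```

Any index reported as OUT OF RANGE corresponds to a relation that never entered the mapping, which is consistent with the test split containing relations absent from the training data.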