otiliastr / coper

Contextual Parameter Generation for Knowledge Graph Link Prediction

The result of ConvE on FB15k-237 is different from the original paper. #3

Closed quqxui closed 3 years ago

quqxui commented 3 years ago

Hi, I found that the ConvE result on FB15k-237 is 60.83 (Hits@10) in Table 1 of your paper. However, the result is 0.501 in the original paper, 'Convolutional 2D Knowledge Graph Embeddings'.

Is there anything I missed? If so, please point it out.

Best.

gstoica27 commented 3 years ago

Hi,

All of our reported scores for ConvE came from our own training and evaluation of the model in our codebase. We believe the primary reason for the observed performance improvement is our negative sampling approach, described in Appendix A, which the original ConvE did not use.

Hope this helps!

quqxui commented 3 years ago

I read the two papers mentioned in Appendix A, but I didn't find any novel negative sampling method there. Do you mean 'filter' and 'raw'?

Also, I read your code: the 1-N score function in ConvE needs to generate a reverse relation for every relation, but I don't find that in your code. So I am wondering: do you only sample the tail node and ignore the head node?

If I missed anything, please point it out, and tell me where I can find it in the code and papers. Thank you very much!!

otiliastr commented 3 years ago

Hi,

In the supplementary material, Appendix A, we describe the difference between the way we select negative samples and the way the original ConvE does it. The idea is that during training of the original ConvE, for each question (e_s, r, ?), the model makes predictions for all possible e_t. If a triple (e_s, r, e_t) is missing from the training graph, the model is supervised to predict it as a "0" (i.e., a negative sample). But some of these triples are not true negatives, they are just missing (and may even appear as positive samples in the test dataset), yet this training strategy encourages the model to predict "0" for them. Instead, we subsample at random a subset of triples that we treat as negatives. There is more discussion in Appendix A here.
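
To make the contrast concrete, here is a minimal NumPy sketch of the two supervision schemes. This is illustrative only, not our actual implementation: the function names, the mask-based loss convention, and the sampling details are assumptions.

```python
import numpy as np

def one_n_targets(num_entities, positive_ids):
    """Original ConvE-style 1-N supervision: every entity not observed
    with (e_s, r) in the training graph is treated as a negative."""
    targets = np.zeros(num_entities, dtype=np.float32)
    targets[positive_ids] = 1.0
    # The loss is computed against ALL entities, so every unobserved
    # (possibly just missing) triple is pushed toward "0".
    mask = np.ones(num_entities, dtype=bool)
    return targets, mask

def subsampled_targets(num_entities, positive_ids, num_negatives, rng):
    """Appendix-A-style alternative: supervise the positives plus only
    a random subset of unobserved entities, treated as negatives."""
    targets = np.zeros(num_entities, dtype=np.float32)
    targets[positive_ids] = 1.0
    candidates = np.setdiff1d(np.arange(num_entities), positive_ids)
    negatives = rng.choice(candidates, size=num_negatives, replace=False)
    mask = np.zeros(num_entities, dtype=bool)
    mask[positive_ids] = True
    mask[negatives] = True
    # The loss only looks at masked entries, so unobserved-but-unsampled
    # triples contribute no gradient.
    return targets, mask

# Example: 5 entities, entity 2 observed as positive, 2 random negatives.
rng = np.random.default_rng(0)
targets, mask = subsampled_targets(5, np.array([2]), 2, rng)
```

With the second scheme, a triple that is merely missing from the training graph is usually not sampled, so the model is not systematically trained to score it as "0".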

The implementation for selecting training batches is in the function "train_dataset" in "data.py", here. The negative sampling part is done here.

Regarding the inverse relations, we have a flag that lets us decide whether to use the inverses or not. When we create the tfrecords files, we mark each sample with a boolean flag specifying whether the sample "is_inverse" (see here). Then, when we generate training batches in the "train_dataset" function, we can keep or drop the inverses based on the flag "include_inv_relations", which is True by default.
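
For illustration, here is a rough sketch of how such a flag might be applied when building the dataset. Only "is_inverse" and "include_inv_relations" come from our code; the feature schema, the file path argument, and the exact signature of "train_dataset" below are assumptions.

```python
import tensorflow as tf

# Assumed feature schema; field names other than "is_inverse" are
# illustrative, not the actual tfrecords layout.
feature_spec = {
    "source": tf.io.FixedLenFeature([], tf.int64),
    "relation": tf.io.FixedLenFeature([], tf.int64),
    "target": tf.io.FixedLenFeature([], tf.int64),
    "is_inverse": tf.io.FixedLenFeature([], tf.int64),  # 0 or 1
}

def train_dataset(tfrecord_path, include_inv_relations=True):
    dataset = tf.data.TFRecordDataset(tfrecord_path)
    dataset = dataset.map(
        lambda raw: tf.io.parse_single_example(raw, feature_spec))
    if not include_inv_relations:
        # Drop the samples that were marked as inverses when the
        # tfrecords were written.
        dataset = dataset.filter(
            lambda ex: tf.equal(ex["is_inverse"], 0))
    return dataset
```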