awslabs / dgl-ke

High performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings.
https://dglke.dgl.ai/doc/
Apache License 2.0

Ranking measure Issue in general_model.py forward_test() #110

Closed Zarca closed 4 years ago

Zarca commented 4 years ago

Hi, I found an issue in `/python/dglke/models/general_models.py`, referenced below: https://github.com/awslabs/dgl-ke/blob/ff4da0b2a44c79e2ba14d55584a6e27591d0f276/python/dglke/models/general_models.py#L455

The line `rankings = F.sum(neg_scores >= pos_scores, dim=1) + 1` seems to push the positive sample out of the top-1 hits, because the positive's score is compared against itself and the `>=` counts that tie. As a result, hits@1 is almost always near 0%. I suppose the `>=` should be changed to `>`, like this: `rankings = F.sum(neg_scores > pos_scores, dim=1) + 1`

Is that right?
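The tie effect described above can be sketched with plain NumPy (a minimal illustration of the counting rule, not the dgl-ke tensor API; the scores are made up):

```python
import numpy as np

pos_score = 0.9
# Candidate scores; the positive triplet's own score (0.9) is among them,
# as happens when the candidate set is not filtered.
cand_scores = np.array([0.9, 0.5, 0.3])

# ">=" counts the self-tie, so the rank can never be 1.
rank_ge = int(np.sum(cand_scores >= pos_score)) + 1  # -> 2
# ">" ignores the self-tie, so the positive can take rank 1.
rank_gt = int(np.sum(cand_scores > pos_score)) + 1   # -> 1

print(rank_ge, rank_gt)
```

With `>=` the best possible rank is 2 whenever the positive's own score is in the candidate set, which pins hits@1 at zero.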

classicsong commented 4 years ago

Turn off the --no_eval_filter flag. Then the scores of the positive triplets will be updated (adding the pos_scores) before ranking: https://github.com/awslabs/dgl-ke/blob/ff4da0b2a44c79e2ba14d55584a6e27591d0f276/python/dglke/models/general_models.py#L451
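The idea behind filtered evaluation can be sketched like this (a hedged illustration of the general technique, not the dgl-ke implementation; the mask and scores are hypothetical):

```python
import numpy as np

pos_score = 0.9
cand_scores = np.array([0.9, 0.95, 0.3])    # 0.9 is the positive itself
# Candidates that are known true triplets (here: the positive itself).
is_known_positive = np.array([True, False, False])

# Filtered evaluation: drop known positives before counting, so the
# positive triplet no longer competes with itself.
filtered = cand_scores[~is_known_positive]
rank = int(np.sum(filtered >= pos_score)) + 1

print(rank)  # only the genuinely better 0.95 outranks the positive
```

After filtering, only candidates that truly score higher push the rank down, so hits@1 behaves as expected.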

Zarca commented 4 years ago

@classicsong Hi, thanks for your attention.

For my project's ranking criterion, the final ranking should not exclude the positive samples (including the positive itself) that really exist; that is, neg_scores should keep the truly existing (h, r, t) triplets.

So I have to keep the --no_eval_filter flag on. I also found that in this case hits@1 is always 0%, caused by the `>=` operation I mentioned above. For instance, assume (h_1, r_1, t_1) is a positive triplet. During ranking, the score of (h_1, r_1, t_1) equals its own score in the candidate set, so the positive sample lands at rank 2 even though the rank-1 entry is itself. When we judge it by: https://github.com/awslabs/dgl-ke/blob/ff4da0b2a44c79e2ba14d55584a6e27591d0f276/python/dglke/models/general_models.py#L462

hits@1 will always exclude the positive, even though it is indeed in first place.

classicsong commented 4 years ago

If so, my suggestion is to fork this repo, make a minor modification to the model, and run the evaluation. Or you can write your own evaluation code. Training generates the embeddings regardless of which evaluation strategy is used. :)

Zarca commented 4 years ago

@classicsong Thanks for the suggestions :). So does it mean that in dgl-ke, if the --no_eval_filter flag is turned on (it is off by default), hits@1 will always be zero? In my experiments on FB15K and FB15K-237, hits@1 stays at zero even after tens of training epochs with --no_eval_filter turned on.

classicsong commented 4 years ago

> @classicsong Thanks for the suggestions :). So does it mean that in dgl-ke, if the --no_eval_filter flag is turned on (it is off by default), hits@1 will always be zero? In my experiments on FB15K and FB15K-237, hits@1 stays at zero even after tens of training epochs with --no_eval_filter turned on.

If you do not set --neg_sample_size_eval N and you use --no_eval_filter, hits@1 will always be zero, as positive pairs definitely exist in the negative-sample set. As FB15K is small, we can evaluate it without setting the --no_eval_filter flag. If it is Freebase, we suggest using --no_eval_filter with --neg_sample_size_eval 10000.
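The distinction can be sketched as follows (assumed semantics, not dgl-ke internals; all scores are made up): ranking unfiltered against *all* entities always includes the positive itself, so the `>=` tie forces rank ≥ 2, while a sampled candidate set usually does not contain the positive.

```python
import numpy as np

def rank_unfiltered(pos_score, cand_scores):
    # Rank rule under discussion: count candidates scoring >= the positive.
    return int(np.sum(cand_scores >= pos_score)) + 1

# Full candidate set: the positive's own score (0.9) is present,
# so the rank can never be 1 and hits@1 is pinned at zero.
full = np.array([0.9, 0.5, 0.3, 0.1])
print(rank_unfiltered(0.9, full))      # rank 2

# A sampled candidate set that happens to miss the positive:
# hits@1 becomes reachable again.
sampled = np.array([0.5, 0.3])
print(rank_unfiltered(0.9, sampled))   # rank 1
```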

Zarca commented 4 years ago

Thanks for your kind reply. Best regards, @classicsong