Open Gori-LV opened 4 years ago
Dr/Prof. Gori-LV, I ran into errors when I tested with the datasets fb15k237 and wn18rr a few weeks ago. I wonder what causes these errors?
Hi, just Gori please, I am a PhD student still in my first year ;) I am not sure I understand your question, but if you run the code on fb15k-237, there should be no problem since the raw data structure is the same.
Gori, I'm also a Ph.D. student; it is nice to meet you. Thank you for your reply. I will run on this dataset later and write to you if any new problems come up.
Hi Gori, Thank you for your interest in our work. We will look into this and get back to you. Thank you very much again!
Sure, thanks for your reply ;D
Has the problem been solved?
Hi,
Many thanks for sharing this work. I have a question regarding the fb15k experiment, i.e., the binary-predicate case.
It drew my attention during evaluation that, when I print the sorted score list returned by `getTestBClassEmbeddingQuality()`, in many cases there are multiple minimum scores within a single candidate list, all of them 0.0, and `testTrueMember` (also scored 0.0) sits at `candidateLst[0]`. Hence `numpy.argsort(scoreLst)` followed by `rank=numpy.where(rankLst==0)[0][0]+1` will always give the test member rank 1.
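To make the tie-breaking concrete, here is a minimal sketch; the score values are made up, and only the `argsort`/`where` lines mirror the evaluation code:

```python
import numpy as np

# Made-up scores: several candidates tie at the minimum 0.0, and the
# true test member sits at candidateLst[0], as observed above.
scoreLst = np.array([0.0, 0.8, 0.0, 0.3, 0.0])

# A stable argsort keeps equal keys in their original order, so index 0
# comes out first among the 0.0 ties.  (The repository calls argsort with
# the default kind, which gives no stability guarantee, but the reported
# effect is the same.)
rankLst = np.argsort(scoreLst, kind="stable")
rank = np.where(rankLst == 0)[0][0] + 1
print(rank)  # -> 1: the true member wins only because it is listed first
```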
I notice that in the evaluation module, for a single testing member (h, t), two candidate lists of entity 2-tuples are generated by pairing the testing tail (resp. the testing head) with each of the remaining entities. Thus in the function `getBClassSpaceMembershipScore()`, when the score of each candidate is computed by `s=torch.sum(tmpH*tmpH, dim=1)+torch.sum(tmpT*tmpT, dim=1)`, either `tmpH*tmpH` or `tmpT*tmpT` is trivial, because within a candidate list all tuples share the same tail or the same head. The ranking therefore reduces to comparing each entity's distance to the concept space; there is no way to tell whether the highest-ranked pair is indeed a true member, because the testing h or t plays no role in the ranking, yet the testing member is always ranked 1 because it scores 0.0 and sits at `candidateLst[0]`.
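To see why one term is constant, here is a minimal sketch of that score line; the tensor names follow the quoted code, but the values are made up (in the model, `tmpH`/`tmpT` would be whatever remains of the head/tail embeddings outside the concept subspace):

```python
import torch

torch.manual_seed(0)
numCandidates, dim = 5, 8

# Heads vary across candidates; the tail is the fixed test tail, so its
# residual is identical in every 2-tuple of the candidate list.
tmpH = torch.randn(numCandidates, dim)
tmpT = torch.randn(dim).expand(numCandidates, dim)

# The quoted score from getBClassSpaceMembershipScore():
s = torch.sum(tmpH * tmpH, dim=1) + torch.sum(tmpT * tmpT, dim=1)

# The tail term adds one constant to every candidate's score, so the
# ranking is decided by the head term alone.
print(torch.sum(tmpT * tmpT, dim=1))  # same value repeated numCandidates times
```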
My concern is that during training, the loss function only requires that, given a relation (a binary concept), all head entities and tail entities lie in one subspace, without distinguishing heads from tails. Together with the evaluation issue above, any tuple in the candidate list whose paired entity lies in the concept subspace attains the minimum score, which produces the multiple minimum scores mentioned in the title. I tried changing the code to place `testTrueMember` as the last item of the candidate list, and this results in a decrease of Hits@K (sketched below).

I wonder if you observed this issue in your experiments, or whether I made a mistake in this inference; please correct me. Thanks a lot!
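For reference, the change I tried amounts to something like the following hypothetical reconstruction (not the repository's code); it shows why Hits@K drops once the positional advantage is removed:

```python
import numpy as np

def rank_of_true_member(scores, true_idx):
    # Rank under a stable sort, i.e. ties broken by list position.
    order = np.argsort(np.asarray(scores), kind="stable")
    return int(np.where(order == true_idx)[0][0]) + 1

# Made-up scores with ties at the minimum; true member at index 0.
scoreLst = [0.0, 0.9, 0.0, 0.4, 0.0]
print(rank_of_true_member(scoreLst, 0))            # 1: wins every tie

# Moving the true member to the last slot makes it lose every tie instead.
moved = scoreLst[1:] + scoreLst[:1]
print(rank_of_true_member(moved, len(moved) - 1))  # 3: last among the ties
```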