[Open] bdhingra opened this issue 3 years ago
Hi, I may have the same question here. In the paper (Sec. 4.4) you said you removed some candidates for N-M relations; however, I did not find anything related to this in the code (I did not see anything special being done for N-M relations). Did I miss anything? Thanks!
@bdhingra @cloudygoose were you able to figure this out? I also have the same question.
@ethanjperez Sadly, no. I still believe the multi-target problem is not addressed in the released code. But it's not hard to implement; see the sketch below.
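Here is a minimal sketch of what I understand Section 4.4 to describe, assuming you already have the full set of valid objects per (subject, relation) pair from somewhere like Wikidata. The function and argument names are mine, not from the released code:

```python
def rank_of_gold(candidates, gold, other_valid_objects):
    """Rank of the tested object after dropping the other valid answers.

    candidates: model predictions sorted by score, best first.
    gold: the single object being tested by this query.
    other_valid_objects: every other valid object for the same
        (subject, relation) pair, e.g. collected from Wikidata.
    """
    # Remove the other correct answers from the ranked list, keeping
    # the tested object itself, then look up its rank.
    filtered = [c for c in candidates
                if c == gold or c not in other_valid_objects]
    return filtered.index(gold) + 1 if gold in filtered else None


def precision_at_1(queries):
    """queries: list of (candidates, gold, other_valid_objects) triples."""
    return sum(rank_of_gold(*q) == 1 for q in queries) / len(queries)
```

With this filtering, a model is only penalized for ranking a wrong answer above the tested object, not for ranking another correct answer above it.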
I have the same question. I also wonder what "training data" means in that paragraph: the LAMA probe is meant to detect whether the LMs store facts observed during training, so I do not understand why we would need to remove them. As for multiple valid objects, I believe the filtering is easy to implement, but it needs additional annotation in the datasets, e.g. something like the record below.
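A hypothetical annotated record (the `valid_obj_labels` field does not exist in the released TREx files; it would have to be added, e.g. from Wikidata):

```python
# Hypothetical annotation: "valid_obj_labels" is not a field in the
# released TREx data; it would have to be collected from Wikidata.
annotated_fact = {
    "sub_label": "Switzerland",
    "predicate_id": "P37",  # official language, an N-M relation
    "obj_label": "German",  # the single object tested by this query
    "valid_obj_labels": ["German", "French", "Italian", "Romansh"],
}
```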
The "Language Models as Knowledge Bases" paper, in Section 4.4, mentions that the evaluation deals with multiple valid objects for the same subject and relation pair. Specifically, valid objects other than the one which is being tested are removed from the ranked list of answers before computing the metrics. However, the TREx data released in this repository only includes one object per tested fact, even for queries where multiple valid answers do exist (based on my browsing of Wikidata). So, does the LAMA evaluation account of multiple valid objects? If yes, how does it do that given that the multiple objects are not in the data.