shmsw25 / AmbigQA

An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous Open-domain Questions"
https://arxiv.org/abs/2004.10645

Evaluation Question #18

Closed NoviScl closed 3 years ago

NoviScl commented 3 years ago

For evaluation on NQ, what exactly is id2answers? I noticed that you set self.data[i]["answer"] += id2answers[d["id"]] for training but self.data[i]["answer"] = id2answers[d["id"]] for evaluation. May I know what the distinction is?

Thanks.

shmsw25 commented 3 years ago

Good question! It's a complicated story, but here's a summary: there are two versions of the annotated answers on NQ, which differ slightly in how the HTML data was postprocessed. Without id2answers, the code uses the version used in DPR. With id2answers, it uses the other version, from Google. I wrote the code to use id2answers so that it uses the Google version. More details can be found in the last paragraph of the result section of the README.

I found that the numbers for the two versions differ only marginally, usually by less than 1%.
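
For readers following along, here is a minimal sketch of what the two assignments from the question do mechanically. It is not the repository's loader: the ids, answer strings, and the helper names merge_answers and replace_answers are made up for illustration. It only assumes that id2answers maps a question id to a list of answer strings.

```python
# Toy data; ids and answers are hypothetical, not taken from NQ.
data = [
    {"id": "q1", "answer": ["New York City"]},
    {"id": "q2", "answer": ["1969"]},
]
# Assumed shape: question id -> list of answers from the other annotation version.
id2answers = {
    "q1": ["New York", "NYC"],
    "q2": ["1969"],
}


def merge_answers(data, id2answers):
    """Mirror `answer += id2answers[id]`: keep the existing answers and append the others."""
    return [
        {"id": d["id"], "answer": d["answer"] + id2answers[d["id"]]}
        for d in data
    ]


def replace_answers(data, id2answers):
    """Mirror `answer = id2answers[id]`: discard the existing answers and use only the others."""
    return [
        {"id": d["id"], "answer": list(id2answers[d["id"]])}
        for d in data
    ]


train_like = merge_answers(data, id2answers)
eval_like = replace_answers(data, id2answers)

for t, e in zip(train_like, eval_like):
    print(t["id"], "merged:", t["answer"], "| replaced:", e["answer"])
```

With +=, each example ends up with answers from both versions (possibly with duplicates, as for q2 here); with =, only the id2answers version is kept.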