Good question! It's a complicated story, but here's a summary: there are two versions of the annotated answers on NQ that differ slightly in the postprocessing of the HTML data. Without `id2answers`, the code uses the version used in DPR; with `id2answers`, it uses the other version from Google. I wrote the code to use `id2answers` so that it uses the Google version. More details can be found in the last paragraph of the README (Result section).
I found that the numbers over the two versions are marginally different, usually by less than 1%.
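For concreteness, here is a minimal sketch of how such an override might be wired up, assuming `id2answers` maps each question id to the Google-postprocessed answer list. The file format and helper names (`load_id2answers`, `apply_id2answers`) are illustrative only, not the repo's exact code; the training/evaluation branch mirrors the two lines quoted in the question below.

```python
import json

def load_id2answers(path):
    """Build a dict from question id to the Google-postprocessed answer list.
    The JSON-lines format with "id" and "answers" fields is an assumption."""
    id2answers = {}
    with open(path) as f:
        for line in f:
            d = json.loads(line)
            id2answers[d["id"]] = d["answers"]
    return id2answers

def apply_id2answers(data, id2answers, is_training):
    """Merge the Google-version answers into each example.
    Training appends them to the existing answer list; evaluation replaces
    the list entirely, matching the two assignments quoted in the question."""
    for i, d in enumerate(data):
        if d["id"] not in id2answers:
            continue
        if is_training:
            data[i]["answer"] += id2answers[d["id"]]
        else:
            data[i]["answer"] = id2answers[d["id"]]
    return data
```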
For evaluation on NQ, what exactly is `id2answers`? I noticed that you set
`self.data[i]["answer"] += id2answers[d["id"]]`
for training but `self.data[i]["answer"] = id2answers[d["id"]]`
for evaluation; may I know what the distinction is? Thanks.