Closed bernhard2202 closed 5 years ago
Hi, that’s a good question. HotpotQA has two settings: distractor and full-wiki. The results in our paper are on the full-wiki setting, while the results here are on the distractor setting. I will update both sets of results here soon. Hope this helps, Alon
Hi,
I am sorry if this is a stupid question. I have read the paper carefully, and the reported performance on the HotpotQA dataset is usually around 20 percent, or less when the model is trained on other datasets. However, ./models/README.md in this repository states that BERT achieves 53 percent exact match, and indeed, downloading the HotpotQA data linked in this repository, converting it to SQuAD format, and training BERT yields similar performance. Does the development set linked in this repository differ from the one used in the paper?
Thanks for your answer in advance.
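
For anyone reproducing the numbers, the conversion step mentioned in the question (HotpotQA to SQuAD format) could look roughly like the sketch below. It is only an illustration, not the script used by this repository. It assumes the distractor-setting layout as I understand it from the public HotpotQA release, where each example has `_id`, `question`, `answer`, and a `context` list of `[title, [sentence, ...]]` pairs. The function name `hotpot_to_squad` is hypothetical, and skipping yes/no and non-span answers is my own simplification.

```python
def hotpot_to_squad(examples):
    """Convert HotpotQA-style examples to a SQuAD v1.1-style dict.

    Assumed input layout (not confirmed by this repository): each example
    has "_id", "question", "answer", and "context" as a list of
    [title, [sentence, ...]] pairs.
    Returns (squad_dict, number_of_skipped_examples).
    """
    data = []
    skipped = 0
    for ex in examples:
        # Concatenate all paragraphs (gold and distractor) into one context.
        context = " ".join("".join(sentences) for _title, sentences in ex["context"])
        answer = ex["answer"]
        # Yes/no answers are not extractive spans; skipped in this sketch.
        if answer in ("yes", "no"):
            skipped += 1
            continue
        # Use the first occurrence of the answer string as the span start.
        start = context.find(answer)
        if start < 0:
            skipped += 1
            continue
        data.append({
            "title": ex["_id"],
            "paragraphs": [{
                "context": context,
                "qas": [{
                    "id": ex["_id"],
                    "question": ex["question"],
                    "answers": [{"text": answer, "answer_start": start}],
                }],
            }],
        })
    return {"version": "1.1", "data": data}, skipped
```

Note that this only covers the distractor setting, where the paragraphs ship with each question. The full-wiki setting requires retrieving paragraphs first, which is one reason the two settings give such different scores.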