A question about evaluate.

hotpotqa / hotpot

Apache License 2.0


A question about evaluate. #25

Closed tomtang110 closed 5 years ago

tomtang110 commented 5 years ago

Hi, I ran my results through eval() in hotpot_evaluate_v1.py, but the result does not seem to match the scores on your leaderboard. Could you tell me the correct function to use for evaluation? [Screenshot 2019-07-07 14 35 18]

qipeng commented 5 years ago

Hi, are you sure you're using our script on a complete output file (an example of predictions on the dev set can be found on the website)? Our script should print out a JSON object containing various metrics, and your output looks very different from it.
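One way to check whether an output file is complete before running the script is sketched below. This is only an illustration, not part of the official tooling: it assumes the prediction file is a JSON object with "answer" and "sp" dictionaries keyed by question id, and that each gold example carries an "_id" field. The helper names find_missing_ids and load_json are hypothetical.

```python
import json
import sys


def load_json(path):
    """Load a JSON file from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def find_missing_ids(prediction, gold):
    """Return gold question ids with no predicted answer / no predicted supporting facts.

    Assumed layout (not confirmed in this thread):
      prediction = {"answer": {qid: str}, "sp": {qid: [[title, sent_idx], ...]}}
      gold       = [{"_id": qid, ...}, ...]
    """
    gold_ids = [example["_id"] for example in gold]
    answers = prediction.get("answer", {})
    supporting = prediction.get("sp", {})
    missing_answer = [qid for qid in gold_ids if qid not in answers]
    missing_sp = [qid for qid in gold_ids if qid not in supporting]
    return missing_answer, missing_sp


if __name__ == "__main__":
    # Usage: python check_predictions.py <prediction.json> <gold.json>
    prediction = load_json(sys.argv[1])
    gold = load_json(sys.argv[2])
    missing_answer, missing_sp = find_missing_ids(prediction, gold)
    print(f"gold examples: {len(gold)}")
    print(f"missing answers: {len(missing_answer)}")
    print(f"missing supporting facts: {len(missing_sp)}")
```

If either count is non-zero, the prediction file does not cover the whole evaluation set, which could explain scores that differ from the leaderboard.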