Closed by j6mes 6 years ago
@andreasvlachos just to confirm: I am thinking we could test that all pages from at least one complete set of evidence are found in order to score a recall point (option 1). The other option (option 2) is to check that all pages from all sets of evidence are found for each claim in order to score a recall point.
I think option 1 would be more similar to how we plan to score the shared task - and would be a simpler task than option 2. What do you think?
Yes, option 1 is more sensible! (In reply to James Thorne's comment of Tue, 12 Dec 2017 at 23:39.)
Do not give any partial credit if only a partial set of documents is returned.
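To make the agreed rule concrete, here is a minimal sketch of the two scoring options discussed in this thread. It is not the code from fever-baselines: the function names, the `(evidence_sets, retrieved_pages)` data layout, and the choice to skip empty evidence sets are all assumptions made for illustration.

```python
from typing import Iterable, List, Sequence, Set, Tuple

# Hypothetical layout: each claim has a list of evidence sets, and each
# evidence set is the set of page titles needed to verify the claim.
Example = Tuple[List[Set[str]], Sequence[str]]


def claim_recall_hit(evidence_sets: List[Set[str]],
                     retrieved_pages: Iterable[str]) -> bool:
    """Option 1: score a point if ALL pages of AT LEAST ONE evidence set
    were retrieved. A partially covered set earns no credit."""
    retrieved = set(retrieved_pages)
    # Assumption: empty evidence sets are ignored, since they would
    # otherwise count as trivially covered.
    return any(pages <= retrieved for pages in evidence_sets if pages)


def claim_recall_hit_all_sets(evidence_sets: List[Set[str]],
                              retrieved_pages: Iterable[str]) -> bool:
    """Option 2 (not chosen): score a point only if ALL pages of ALL
    evidence sets were retrieved."""
    retrieved = set(retrieved_pages)
    return bool(evidence_sets) and all(
        pages <= retrieved for pages in evidence_sets)


def document_recall(examples: Sequence[Example]) -> float:
    """Fraction of claims that score a recall point under option 1."""
    if not examples:
        return 0.0
    hits = sum(claim_recall_hit(sets, pages) for sets, pages in examples)
    return hits / len(examples)


examples: List[Example] = [
    ([{"A", "B"}, {"C"}], ["C", "D"]),        # hit: set {"C"} fully found
    ([{"A", "B"}], ["A"]),                    # miss: only part of the set
    ([{"A"}, {"B", "C"}], ["A", "B", "C"]),   # hit under both options
]
print(document_recall(examples))
```

In this sketch the first claim scores under option 1 but would not under option 2, which is why option 1 is the easier target for a retrieval component.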