google-research-datasets / natural-questions

Natural Questions (NQ) contains real user questions issued to Google search, and answers found from Wikipedia by annotators. NQ is designed for the training and evaluation of automatic question answering systems.
Apache License 2.0

clarification of 'score' #6

Open filbertphang opened 5 years ago

filbertphang commented 5 years ago

Hello, in 'nq_eval.py' it is mentioned that "Each prediction should be provided with a long answer score, and a short answers score".

May I ask what these scores refer to? Are they supposed to represent the confidence of the model's predictions, or is there a fixed method for obtaining them?

For example, can we define the 'score' to simply be the sum of the start and end logits of the prediction?
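
To make the proposal concrete, here is a minimal sketch of the scoring idea described above: take the span whose start logit plus end logit is largest and use that sum as its score. This is only an illustration of the question, not a method confirmed by the NQ maintainers. The function name `best_span_score`, the `max_span_len` parameter, and the plain-list inputs are all hypothetical.

```python
from typing import Sequence, Tuple


def best_span_score(
    start_logits: Sequence[float],
    end_logits: Sequence[float],
    max_span_len: int = 30,  # hypothetical cap on span length, in tokens
) -> Tuple[int, int, float]:
    """Return (start, end, score) for the highest-scoring span.

    The score is assumed to be start_logits[start] + end_logits[end],
    which is the definition proposed in this question, not necessarily
    what nq_eval.py expects.
    """
    if len(start_logits) != len(end_logits) or not start_logits:
        raise ValueError("logit sequences must be non-empty and equal in length")

    best = None
    for start, start_logit in enumerate(start_logits):
        # Only consider spans that end at or after the start token
        # and contain at most max_span_len tokens.
        last = min(start + max_span_len, len(end_logits))
        for end in range(start, last):
            score = start_logit + end_logits[end]
            if best is None or score > best[2]:
                best = (start, end, score)

    if best is None:
        raise ValueError("max_span_len must be at least 1")
    return best


if __name__ == "__main__":
    start, end, score = best_span_score([0.1, 2.0, 0.3], [0.5, 0.2, 1.5])
    print(start, end, score)
```

A score defined this way is an unnormalized ranking value rather than a probability, so it is only meaningful relative to the scores of other predictions.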

Lastly, are scores also required for null predictions?

Thank you very much!

sgondala commented 4 years ago

Did you figure out how to implement this? I'm stuck on the same issue.