I'm wondering if you could help me understand more about the input for training the reader?
Are the positive and negative contexts for each question generated using the retriever before training or during training?
What does a training batch look like? Is sample_batch a batch of (question, passage, answer) triples, with inputs being the transformer model inputs of shape (num_questions, num_passages, seq_length)?
Positive and negative contexts for each question are generated before the training process. We use our own DPR dense index to mine positive and negative contexts (that is the point of the open-domain setup). We also use the gold positive passages provided by the datasets as positives, plus some extra heuristics for choosing retrieved passages for reader training, as sketched below.
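To make the preprocessing concrete, here is a rough sketch of what one mined training sample could look like after retrieval. The field names (positive_ctxs, hard_negative_ctxs, has_answer) are illustrative assumptions, not the exact DPR serialization format:

```python
# Hypothetical shape of one preprocessed reader training sample.
# Positives come from the dataset's gold passages plus retrieved
# passages that contain the answer; hard negatives are high-scoring
# retrieved passages that do NOT contain the answer.
sample = {
    "question": "who wrote the declaration of independence",
    "answers": ["Thomas Jefferson"],
    "positive_ctxs": [
        {"title": "Declaration of Independence",
         "text": "... drafted by Thomas Jefferson ...",
         "score": 81.2, "has_answer": True},
    ],
    "hard_negative_ctxs": [
        {"title": "United States Constitution",
         "text": "... the framers of the Constitution ...",
         "score": 74.5, "has_answer": False},
    ],
}
```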
The sample_batch contains a set of questions, each accompanied by a pool of positive and hard negative passages. The inputs are what actually goes into the model: create_reader_input() selects exactly one positive and a number of negative contexts from the provided pools, concatenates each with its question, and tokenizes the results into tensors.
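As a minimal sketch of that step, assuming samples shaped like the one above (this is not the actual create_reader_input() implementation, and the function name, passage count, and sequence length below are illustrative):

```python
import torch
from transformers import BertTokenizerFast

def create_reader_input_sketch(samples, tokenizer, num_passages=24, max_len=350):
    """Sketch only: pick one positive plus (num_passages - 1) hard
    negatives per question, pair each passage with the question, and
    tokenize into a (num_questions, num_passages, seq_length) tensor."""
    per_question_ids = []
    for s in samples:
        positive = s["positive_ctxs"][0]                     # exactly one positive
        negatives = s["hard_negative_ctxs"][: num_passages - 1]
        contexts = [positive] + negatives
        questions = [s["question"]] * len(contexts)
        passages = [ctx["title"] + " " + ctx["text"] for ctx in contexts]
        enc = tokenizer(
            questions,
            passages,
            max_length=max_len,
            padding="max_length",
            truncation=True,
            return_tensors="pt",
        )
        per_question_ids.append(enc["input_ids"])            # (num_passages, seq_length)
    return torch.stack(per_question_ids, dim=0)              # (num_questions, num_passages, seq_length)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
inputs = create_reader_input_sketch([sample], tokenizer)
print(inputs.shape)  # e.g. torch.Size([1, 2, 350]) here, since the sample pool has 1 positive + 1 negative
```

In this layout the positive passage always sits at index 0 of each question's passage block, so downstream code can tell which passage the answer-span targets refer to.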