facebookresearch / DPR

Dense Passage Retriever - is a set of tools and models for open domain Q&A task.

Interactive mode #21

Closed: andreamad8 closed this issue 4 years ago

andreamad8 commented 4 years ago

Hi and thanks for the great repo.

I am wondering if there is an easy way to have an interactive script like the one in DrQA.

For instance:

python scripts/retriever/interactive.py 

>>> process('question answering', k=5)

+------+-------------------------------+-----------+
| Rank |             Doc Id            | Doc Score |
+------+-------------------------------+-----------+
|  1   |       Question answering      |   327.89  |
|  2   |       Watson (computer)       |   217.26  |
|  3   |          Eric Nyberg          |   214.36  |
|  4   |   Social information seeking  |   212.63  |
|  5   | Language Computer Corporation |   184.64  |
+------+-------------------------------+-----------+

where the score is the dot product instead of the tf-idf score. Is there an easy way to get this? And is there a function that, given the doc_id, directly returns the corresponding document embedding?

I feel this can be of great use for the research community.

Thanks again for your great work

Andrea

vlad-karpukhin commented 4 years ago

Hi Andrea,

There is no interactive console in DPR at the moment. It is probably a 1-2 day job to build something like that. One would need to combine the question encoder, an efficient index for the Wikipedia embeddings, and the reader model into one interactive end-to-end pipeline. All the subcomponents for this are ready; one just needs to integrate them into a single component.
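As a rough illustration of the retrieval half of such a pipeline, here is a minimal pure-Python sketch. It is not DPR code: `encode_question`, `get_embedding`, `process`, `interactive_loop`, and the toy embeddings are all hypothetical stand-ins. In a real setup the question encoder and an index over the Wikipedia passage embeddings would replace them, and the reader model is left out entirely.

```python
# Toy stand-ins (assumptions): real passage ids and embeddings would come
# from the context encoder and be stored in an efficient vector index.
DOC_IDS = ["Question answering", "Watson (computer)", "Eric Nyberg"]
DOC_EMBEDDINGS = [
    [1.0, 0.0, 0.5],
    [0.8, 0.1, 0.0],
    [0.2, 0.9, 0.1],
]
ID_TO_ROW = {doc_id: row for row, doc_id in enumerate(DOC_IDS)}


def encode_question(question):
    """Hypothetical placeholder for a question encoder.

    It ignores the text and returns a fixed vector so the sketch stays
    self-contained; a real encoder would embed the question.
    """
    return [1.0, 0.0, 1.0]


def get_embedding(doc_id):
    """Return the stored embedding for the given doc id."""
    return DOC_EMBEDDINGS[ID_TO_ROW[doc_id]]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def process(question, k=5):
    """Rank documents by dot product with the question embedding.

    Returns a list of (rank, doc_id, score) tuples, best first.
    """
    q = encode_question(question)
    scored = [(dot(emb, q), doc_id) for doc_id, emb in zip(DOC_IDS, DOC_EMBEDDINGS)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(rank + 1, doc_id, score) for rank, (score, doc_id) in enumerate(scored[:k])]


def interactive_loop(k=5):
    """Read questions from stdin until an empty line and print the top-k hits."""
    while True:
        question = input(">>> ").strip()
        if not question:
            break
        for rank, doc_id, score in process(question, k=k):
            print(f"{rank:>4} | {doc_id:<30} | {score:.2f}")
```

Calling `interactive_loop()` from a Python shell gives a DrQA-style prompt, and `get_embedding("Eric Nyberg")` returns the stored vector for that document. A brute-force scan like this is only workable for tiny collections, which is why an efficient index is needed for the full Wikipedia embeddings.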

Unfortunately, this is a low-priority task for me at the moment.

shmsw25 commented 4 years ago

Hi @andreamad8,

It is not exactly the same as the interactive code that DrQA provides, but we provided a script that makes an end-to-end prediction with a single command, as part of the EfficientQA competition. Please see here for instructions. You can use it as an interactive system by supplying the input question as an input file. The code is comparable to the official DPR code, and was written by a subset of the DPR authors.

andreamad8 commented 4 years ago

Hi @vlad-karpukhin, thank you for the information and for the great repo. 👍🏻 I will have a deeper look at each module when I have some more time. I am currently just exploring some possible research directions, so I think I will go with the script mentioned by @shmsw25.

Hi @shmsw25, thanks a lot, this was exactly what I was looking for. I will try out the scripts 😊