JulesGM / ParlAI_SearchEngine

A search engine for ParlAI's BlenderBot project (and probably other ones as well)
Creative Commons Attribution 4.0 International

Adding custom files with information to the search engine #5

Closed alelasantillan closed 3 years ago

alelasantillan commented 3 years ago

I've tried the search engine using your colab.ipynb, and it worked perfectly after several attempts. Colab seems to work better at some hours and to fail when demand for resources is high.

I am still wondering how I can add a custom directory of text files to your ParlAI_SearchEngine, so that relevant documents can also be found in that directory. That way one could add custom information to the web search. Any ideas?

JulesGM commented 3 years ago

This will not work directly: the GoogleSearch search module queries the Google web page and parses the results out of the page it gets back, so I don't see how you could add your own documents to it. Maybe you can with either Google's or Bing's cloud services.

You could publish your documents on the web, and have Google search the webpage you published them on specifically with the site: search operator.

You could then combine the two by subclassing the handler class and merging the results of the search with the site: operator and the search without it.
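
A rough sketch of that merging step, in case it helps. Everything here is hypothetical and not taken from the repository: `site_query`, `merge_results`, and `combined_search` are made-up names, `search_fn` stands in for whatever search method your subclass already has, and the result shape (a dict with a `url` key) is an assumption.

```python
from typing import Callable, Dict, List

# Assumed result shape: each hit is a dict with at least a "url" key.
Result = Dict[str, str]
SearchFn = Callable[[str, int], List[Result]]


def site_query(q: str, domain: str) -> str:
    """Restrict a query to one domain using the `site:` search operator."""
    return f"site:{domain} {q}"


def merge_results(site_hits: List[Result], web_hits: List[Result], n: int) -> List[Result]:
    """Interleave the two lists (site-restricted hit first at each rank),
    skip URLs that were already added, and keep at most n results."""
    merged: List[Result] = []
    seen = set()
    for i in range(max(len(site_hits), len(web_hits))):
        for hits in (site_hits, web_hits):
            if i < len(hits) and hits[i]["url"] not in seen:
                seen.add(hits[i]["url"])
                merged.append(hits[i])
    return merged[:n]


def combined_search(search_fn: SearchFn, q: str, n: int, domain: str) -> List[Result]:
    """Run the query twice, with and without `site:`, and merge the hits."""
    return merge_results(search_fn(site_query(q, domain), n), search_fn(q, n), n)
```

In a subclass you would call something like `combined_search` from the overridden search method, passing the parent's search as `search_fn`. Interleaving is only one possible policy; you could just as well put all of the site-restricted hits first.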

JulesGM commented 3 years ago

In the paper they have hybrid dense retrieval + search engine solutions if I remember correctly. Maybe you can use that one to use the dense retriever on your documents.

alelasantillan commented 3 years ago

> In the paper they have hybrid dense retrieval + search engine solutions if I remember correctly. Maybe you can use that one to use the dense retriever on your documents.

Are you referring to the paper "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks", or to "Internet-Augmented Dialogue Generation"?

JulesGM commented 3 years ago

Internet-Augmented Dialogue Generation