Closed by Bala93 4 months ago
Hi, thanks for the great question and for raising the point that retrieving relevant documents from the corresponding dataset would further improve performance! Currently, for xRAGv1, retrieval is based solely on the Wikipedia dump.
Thanks for the clarification.
Congratulations on the great work.
I have the following question: is the retrieved reference document/representation based only on chunks from Wikipedia, or does it also include documents from each of the datasets? For example, HotpotQA could benefit more from retrieving relevant documents from its own dataset than from the Wikipedia dump. Am I missing something?
Thanks for your time.