LouisDo2108 opened this issue 3 months ago
For the first question, the answer depends on the experimental setup of your work. I compared our experimental setup with IncDSI's: they look similar but are in fact different. IncDSI assumes that query-docid pairs for the new document sets $D_{t>1}$ are available, whereas we do not. We argue that in a realistic scenario, when a search engine indexes new documents, there is no user feedback yet and therefore no labeled data. In fact, query-docid pairs for $D_{t>1}$ are also available in the KILT dataset, but we deliberately did not use them. So the answer depends on your setup: if it matches IncDSI's, you can use the labeled data directly; if it matches ours, you need to construct pseudo query-docid pairs instead.
For the second question, pseudo-queries can still be constructed by applying ISS and then sampling an $n$-gram span; please refer to `tasks/qa/generate.py`. Furthermore, since we focus on downstream multi-task scenarios, we put extra weight on efficiency when constructing pseudo-queries. If efficiency is less of a concern, constructing pseudo-queries with docTTTTTquery (see DSI++ or IncDSI) is also an option; we tested it as well and obtained good results.
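For readers who want the general idea without opening the repo, here is a minimal, hypothetical sketch of the n-gram-span approach to building pseudo query-docid pairs: random spans of the document text serve as pseudo-queries paired with the document's identifier. The function name, parameters, and sampling details are my own illustration and are not taken from `tasks/qa/generate.py`, which contains the actual logic.

```python
import random

def make_pseudo_queries(doc_text: str, doc_id: str, num_queries: int = 3,
                        min_len: int = 5, max_len: int = 12, seed: int = 0):
    """Construct pseudo query-docid pairs by sampling random n-gram spans
    from the document text. Illustrative sketch only; see the repo's
    tasks/qa/generate.py for the real implementation."""
    rng = random.Random(seed)
    tokens = doc_text.split()
    pairs = []
    for _ in range(num_queries):
        # Clamp the span length so short documents still work.
        n = rng.randint(min(min_len, len(tokens)), min(max_len, len(tokens)))
        start = rng.randint(0, len(tokens) - n)
        span = " ".join(tokens[start:start + n])
        pairs.append((span, doc_id))
    return pairs
```

Each returned pair can then be used as labeled training data in place of real user queries, which is the point of the pseudo-query construction discussed above.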
Dear @Sherlock-coder ,
Thank you for sharing your implementation and the awesome paper. I have some questions regarding the single-task scenario (considering only open-domain QA), where the setup is similar to IncDSI.
Best regards, Louis.