OpenMatch / Augmentation-Adapted-Retriever

[ACL 2023] This is the code repo for our ACL'23 paper "Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In".
MIT License

issue about results #2

Closed Loose-Gu closed 3 months ago

Loose-Gu commented 1 year ago

I tried the AAR-Contriever checkpoint on MSMARCO to augment the MMLU task, but the results aren't significantly better than those of the initial contriever-msmarco model. Could you provide additional details on the prompts used to incorporate retrieved documents into the LLM? I couldn't find them in the paper.

In addition, I find that the retriever can hardly find useful documents for queries in the STEM category. Could you give some examples of how AAR augments such queries?

yuzc19 commented 1 year ago

Hi, thanks for the question. Which LM do you use for the retrieval augmentation? For T5-style LMs, we simply concatenate each retrieved document with the prompt template in Appendix A.3 and feed it directly into the encoder. We then use the FiD mechanism to incorporate the information from all documents. We conjecture that the performance gain may not be significant for Contriever because it has already been intensively pre-trained on other data augmentations. We provide a similar explanation in Section 5.2.
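To illustrate the flow described above, here is a minimal sketch in plain Python. It is not the repo's code: the template string is a made-up stand-in (the real one is in Appendix A.3 of the paper), and `build_fid_inputs` and `fuse_encoder_states` are hypothetical helper names. It only shows the shape of the idea: one encoder input per retrieved document, then FiD-style fusion by concatenating the per-document encoder states along the sequence axis so the decoder can attend over all of them.

```python
import string
from typing import List

# ASSUMPTION: placeholder template for illustration only.
# The actual prompt template is given in Appendix A.3 of the paper.
TEMPLATE = "Context: {document}\nQuestion: {question}\nChoices: {choices}\nAnswer:"


def build_fid_inputs(question: str, choices: List[str], documents: List[str]) -> List[str]:
    """Return one encoder input string per retrieved document."""
    labels = string.ascii_uppercase
    choice_text = " ".join(f"({labels[i]}) {c}" for i, c in enumerate(choices))
    return [
        TEMPLATE.format(document=doc, question=question, choices=choice_text)
        for doc in documents
    ]


def fuse_encoder_states(per_doc_states: List[List[List[float]]]) -> List[List[float]]:
    """FiD-style fusion: concatenate per-document encoder states along the
    sequence axis. Each element of per_doc_states is [seq_len][hidden_dim]."""
    fused: List[List[float]] = []
    for states in per_doc_states:
        fused.extend(states)
    return fused


if __name__ == "__main__":
    inputs = build_fid_inputs(
        "If the Moon is setting at noon the phase of the Moon must be",
        ["third quarter.", "waning crescent.", "waxing crescent.", "full."],
        ["<retrieved document 1>", "<retrieved document 2>"],
    )
    for text in inputs:
        print(text, end="\n\n")
```

In a real setup, each string would be tokenized and encoded independently by the T5 encoder, and the fused states would be passed to the decoder through cross-attention.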

For the second question, if the LM is very large, I think some retrievers (e.g., Contriever + InstructGPT) may not bring the LM any performance gain, since the LM may have already acquired the related knowledge during pre-training. The research question in this case should be how to better probe the LM's inherent knowledge rather than how to provide additional knowledge. However, I think retrieved documents can benefit the STEM category for relatively small language models (in our experiments, T5-Base, T5-Large, and T5-XL). You can find the evidence in our main results.

FYI, here is a case where AAR-Contriever provides a useful document for STEM:

Question: If the Moon is setting at noon the phase of the Moon must be

Choice_A: third quarter. Choice_B: waning crescent. Choice_C: waxing crescent. Choice_D: full.

Document: The phase is called the third quarter. The Moon is again 90 degrees...

Answer: A