McGill-NLP / llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
https://mcgill-nlp.github.io/llm2vec/
MIT License

Does the instruction before the question have any significance? #86

Closed · aldrinjenson closed this issue 4 months ago

aldrinjenson commented 4 months ago

Hi, I noticed that in the README example, instructions like "Given a web search query, retrieve relevant passages that answer the query:" are added as a prefix to each question. I couldn't find anything in the paper explaining why this is done.

Could you explain the purpose of doing it this way? Would skipping the instruction, or even changing its wording, affect the quality of the final embeddings? How significant would the effect be?

vaibhavad commented 4 months ago

Hi @aldrinjenson,

Thanks for your interest in our work.

Instructions are necessary to differentiate between tasks when using a single model, e.g., retrieval, classification, sentence similarity, etc. We describe the details of our setup in Section 3.2 of the paper. The full list of instructions we used is given in Table 8 and Table 10.
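For reference, here is a sketch of how the instruction is wired in, adapted from the repository README (the model names and the `encode` call follow that example and may differ across versions): queries are passed as `[instruction, text]` pairs, while documents are encoded without an instruction.

```python
import torch
from llm2vec import LLM2Vec

# Load the base model with its supervised LoRA weights (names taken
# from the README example; swap in whichever checkpoint you use).
l2v = LLM2Vec.from_pretrained(
    "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp",
    peft_model_name_or_path="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised",
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
)

# The task instruction is prepended to every query as an
# [instruction, text] pair; it tells the model this is retrieval.
instruction = "Given a web search query, retrieve relevant passages that answer the query:"
queries = [
    [instruction, "how much protein should a female eat"],
    [instruction, "summit define"],
]
q_reps = l2v.encode(queries)

# Documents are encoded without any instruction.
documents = [
    "As a general guideline, the CDC's average requirement of protein "
    "for women ages 19 to 70 is 46 grams per day.",
    "Definition of summit: the highest point of a mountain.",
]
d_reps = l2v.encode(documents)

# Cosine similarity between query and document embeddings.
q_norm = torch.nn.functional.normalize(q_reps, p=2, dim=1)
d_norm = torch.nn.functional.normalize(d_reps, p=2, dim=1)
cos_sim = q_norm @ d_norm.T
```

Because the instruction is attached only on the query side, the same model can be reused for a different task simply by swapping the instruction string.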

More broadly, instruction-following encoders are a well-explored area; the Instructor paper has more details on the motivation for using instructions when encoding.

Let me know if you have any more questions.

aldrinjenson commented 4 months ago

Ahh. This makes sense. Thanks a lot for the clarification @vaibhavad!