McGill-NLP / llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
https://mcgill-nlp.github.io/llm2vec/
MIT License
1.17k stars 88 forks

Instruction for corpus and query on MTEB Evaluation #104

bzantium closed this issue 2 months ago

bzantium commented 3 months ago

Do you use the same instructions given in the paper for both the corpus and the queries on the reranking and retrieval benchmarks?

vaibhavad commented 3 months ago

Hi @bzantium,

Thank you for your interest in our work. The corpus is encoded without any instruction, following previous work (e5-mistral-instruct, echo-embeddings).

This is also demonstrated in the retrieval example.
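For readers who want the asymmetry spelled out, here is a minimal sketch of how the inputs might be prepared: each query is paired with the task instruction, while corpus documents are passed through unchanged. The helper names (`prepare_queries`, `prepare_corpus`), the instruction string, and the sample texts are all hypothetical and for illustration only. The `[instruction, text]` pair layout is an assumption about the encoder's input format, so check it against the retrieval example before relying on it. No model is loaded here; the sketch covers input preparation only.

```python
from typing import List

# Hypothetical instruction text, used only for illustration.
INSTRUCTION = "Given a web search query, retrieve relevant passages that answer the query:"


def prepare_queries(queries: List[str], instruction: str) -> List[List[str]]:
    """Pair every query with the task instruction (assumed [instruction, text] layout)."""
    return [[instruction, query] for query in queries]


def prepare_corpus(documents: List[str]) -> List[str]:
    """Return corpus documents as-is: no instruction is prepended."""
    return list(documents)


# Made-up sample data.
queries = ["how much protein should a female eat", "summit define"]
documents = [
    "Protein needs vary with age, weight, and activity level.",
    "A summit is the highest point of a mountain.",
]

query_inputs = prepare_queries(queries, INSTRUCTION)
corpus_inputs = prepare_corpus(documents)

print(query_inputs[0])   # instruction and query text as a two-element list
print(corpus_inputs[0])  # plain document text, no instruction
```

Both lists would then be handed to the encoder separately, and the resulting query and document embeddings compared (for example with cosine similarity).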

Let me know if you have any further questions.