MTEB Evaluation - BOS/EOS markers

ContextualAI / gritlm

Generative Representational Instruction Tuning

https://arxiv.org/abs/2402.09906

MIT License

MTEB Evaluation - BOS/EOS markers #33

Closed: raghavlite closed this issue 4 months ago

raghavlite commented 4 months ago

Hi, while evaluating on MTEB, are you not using user_bos, embed_bos, and their EOS versions?

Also, are the default BASE_BOS, USER_BOS, and Embed_BOS in run.py meant to be used with all LLMs? I'm finetuning a Mistral model, which uses [INST] and [/INST] as instruction markers. How should I use these during training?

raghavlite commented 4 months ago

Doesn't this line used during eval use a different format from the one used during training here?

Muennighoff commented 4 months ago

The eval instruction is set here and should correspond to the one used in training: https://github.com/ContextualAI/gritlm/blob/9883da1e77812e6ba2c107dc7b65d8c5ddc7396b/evaluation/eval_mteb.py#L1051
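To illustrate what "should correspond to the training one" means in practice, here is a minimal sketch of wrapping an embedding instruction in the same markers at eval time as in training. The marker strings and the function name `format_embed_instruction` are assumptions for illustration, not copied from the linked line; substitute whatever markers your own training run used (for example `[INST]` / `[/INST]` if that is what the model was finetuned with). The BOS token itself is assumed to be added by the tokenizer.

```python
# Assumed marker strings (hypothetical defaults); replace them with the
# exact values that were used when the model was trained.
USER_BOS = "<|user|>\n"
EMBED_BOS = "\n<|embed|>\n"


def format_embed_instruction(instruction: str) -> str:
    """Wrap an embedding instruction in the assumed training-time markers."""
    if instruction:
        return USER_BOS + instruction + EMBED_BOS
    # No instruction: emit only the embed marker, without its leading newline.
    return EMBED_BOS.lstrip("\n")


if __name__ == "__main__":
    # repr() makes the newlines inside the markers visible.
    print(repr(format_embed_instruction("Retrieve relevant passages")))
    print(repr(format_embed_instruction("")))
```

If the strings produced at eval time differ from those seen in training, even by a newline, the model embeds text in a format it was not tuned on, so it is worth comparing both paths on a few sample instructions.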

raghavlite commented 4 months ago

thanks