add mteb script for better eval
Closed · WuNein closed this 6 months ago
Thanks for your interest in and contribution to our work. Our primary focus is on sentence embedding tasks such as STS, so the method may not perform optimally for embedding passages in MTEB. The prompt we designed is for sentence embedding. You might find it helpful to refer to recent papers that apply similar methods to passage embedding [1].
If you are interested in the performance of our method on MTEB, you can refer to the table in [2] (the table includes summarization results, but I think the STS performance of our method would not be much lower).
Thank you for your PR, but I can't merge it, because our method is intended only for sentence embedding.
[1] Zhuang S., Ma X., Koopman B., et al. PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval. arXiv preprint arXiv:2404.18424, 2024.
[2] Springer J. M., Kotha S., Fried D., et al. Repetition Improves Language Model Embeddings. arXiv preprint arXiv:2402.15449, 2024.
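For anyone who still wants to run an STS-only MTEB evaluation with a prompt-based sentence encoder, here is a minimal sketch of such a script. It assumes the `mteb` and `sentence-transformers` packages; the backbone model and the prompt template are placeholders for illustration, not the exact prompt or model from this repo, and the MTEB API may differ slightly across versions.

```python
"""Minimal sketch: STS-only MTEB evaluation of a prompt-based sentence encoder.

Assumptions (not from this repo): the `mteb` and `sentence-transformers`
packages are installed; the backbone and prompt template below are placeholders.
"""
import mteb
from sentence_transformers import SentenceTransformer


class PromptedEncoder:
    """Prepends a sentence-embedding prompt to each input before encoding.

    MTEB only requires an `encode(list_of_str, **kwargs) -> ndarray` method.
    """

    def __init__(self, model_name: str, template: str):
        self.model = SentenceTransformer(model_name)
        self.template = template

    def encode(self, sentences, **kwargs):
        prompted = [self.template.format(text=s) for s in sentences]
        # Forward only batch_size; MTEB may pass extra kwargs the backbone
        # does not accept.
        return self.model.encode(prompted, batch_size=kwargs.get("batch_size", 32))


if __name__ == "__main__":
    model = PromptedEncoder(
        "sentence-transformers/all-MiniLM-L6-v2",  # placeholder backbone
        'This sentence: "{text}" means',           # placeholder prompt template
    )
    # Restrict to STS tasks, matching the scope described above.
    tasks = mteb.get_tasks(tasks=["STSBenchmark", "STS12", "STS13"])
    evaluation = mteb.MTEB(tasks=tasks)
    evaluation.run(model, output_folder="results/prompted-encoder")
```

Scores are written as JSON under the given output folder, so STS numbers can be compared directly against the table in [2].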