SeanLee97 / AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
https://arxiv.org/abs/2309.12871

Incorporating Matryoshka Representation Learning #49

Closed · talavivi03 closed this issue 4 months ago

talavivi03 commented 4 months ago

I wanted to start by expressing my appreciation for your incredible model; its outstanding performance has significantly benefited my work, and for that, I am truly grateful.

I'm reaching out to ask whether you might consider incorporating Matryoshka Representation Learning into the model's training process. I believe this technique could make the embeddings more flexible and potentially boost performance even further.

Thank you for your time and for creating such a valuable tool.

SeanLee97 commented 4 months ago

@talavivi03 Thanks for following our work. We have integrated our new method, 2DMSE (2D Matryoshka Sentence Embeddings), into AnglE.

Welcome to use it :)

Paper: https://arxiv.org/abs/2402.14776
Code: https://github.com/SeanLee97/AnglE/blob/main/README_2DMSE.md
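
To give a flavor of how a Matryoshka-style model is consumed at inference time, here is a minimal sketch using the `angle_emb` encode API. The checkpoint name (`WhereIsAI/UAE-Large-V1`) and the truncation size `k` are illustrative assumptions, not the 2DMSE release; the actual 2DMSE training and layer-truncation options are described in README_2DMSE.md.

```python
# Minimal sketch: encode with angle_emb, then truncate embeddings Matryoshka-style.
# Assumptions: angle_emb is installed; the checkpoint and k=256 are illustrative.
import numpy as np
from angle_emb import AnglE

# Load an AnglE sentence-embedding model.
angle = AnglE.from_pretrained('WhereIsAI/UAE-Large-V1', pooling_strategy='cls')

# Encode a batch of sentences to full-size embeddings.
embeddings = angle.encode(
    ['AnglE produces powerful sentence embeddings.',
     'Matryoshka embeddings can be truncated to smaller sizes.'],
    to_numpy=True,
)

# Matryoshka-style usage: keep only the first k dimensions and re-normalize.
k = 256  # assumed target dimension; any prefix size can be chosen
small = embeddings[:, :k]
small = small / np.linalg.norm(small, axis=1, keepdims=True)

print(embeddings.shape, small.shape)  # e.g. (2, 1024) -> (2, 256)
```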

talavivi03 commented 4 months ago

Looks amazing! Are you planning to release the checkpoint as well?

SeanLee97 commented 4 months ago

We will release a 2DMSE model. Stay tuned :)

talavivi03 commented 4 months ago

That's awesome news! Looking forward to it!

SeanLee97 commented 4 months ago

Hi @talavivi03, mixedbread.ai has released their 2D Matryoshka embedding model; you can use it.

Model: https://huggingface.co/mixedbread-ai/mxbai-embed-2d-large-v1
Blog post: https://www.mixedbread.ai/blog/mxbai-embed-2d-large-v1
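
A minimal sketch of using that model, assuming it loads with `sentence-transformers`; the target dimension below is an illustrative choice, and layer truncation (the second Matryoshka axis described in the blog post) is omitted here.

```python
# Minimal sketch: encode with sentence-transformers, then truncate the embedding
# dimension and compare sentences with cosine similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("mixedbread-ai/mxbai-embed-2d-large-v1")

# Full-size embeddings for a pair of sentences.
emb = model.encode(
    ["Matryoshka embeddings nest smaller vectors inside larger ones.",
     "Truncating the prefix trades accuracy for speed and storage."],
)

# Dimension truncation: keep the first k dimensions, re-normalize, then compare.
k = 384  # assumed target size
small = emb[:, :k]
small = small / np.linalg.norm(small, axis=1, keepdims=True)
cosine = float(small[0] @ small[1])
print(small.shape, cosine)
```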