izhx / uni-rep

Code for embedding and retrieval research.
MIT License
embeddings infomation-retrieval sentence-embeddings text-embedding

UniRep: code for embedding and retrieval research

This repository will contain code for reproducing different embedding and retrieval models, such as Dense retriever (on MS MARCO), Splade (sparse retriever), UnifieR (hybrid retriever), and Udever (universal embedder for multiple natural and programming languages).

Udever

Language Models are Universal Embedders [arXiv]

Udever embedders are fine-tuned from BLOOM models via BitFit on MS MARCO Passage Ranking, SNLI, and MultiNLI data. They are universal embedding models that work across tasks, natural languages, and programming languages. (From a technical point of view, Udever is sgpt-bloom with some minor improvements.)
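For readers unfamiliar with BitFit, the idea is to freeze every weight and train only the bias terms. The snippet below is a minimal sketch of that selection step, not the training code from this repository. `Param`, `apply_bitfit`, and the parameter names are hypothetical stand-ins chosen for illustration, and matching on the suffix "bias" is an assumption about how bias parameters are named.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Param:
    """Hypothetical stand-in for a framework parameter (e.g. torch.nn.Parameter)."""
    numel: int
    requires_grad: bool = True


def apply_bitfit(named_params: Iterable[Tuple[str, Param]]) -> List[str]:
    """Freeze everything except bias terms; return the names left trainable."""
    trainable = []
    for name, param in named_params:
        # BitFit: only parameters whose name ends in "bias" keep gradients.
        param.requires_grad = name.endswith("bias")
        if param.requires_grad:
            trainable.append(name)
    return trainable


# Toy parameter table; the names are illustrative, not BLOOM's real layout.
params = {
    "layer.0.attn.weight": Param(4096),
    "layer.0.attn.bias": Param(64),
    "layer.0.mlp.weight": Param(8192),
    "layer.0.mlp.bias": Param(128),
}
trainable_names = apply_bitfit(params.items())
trainable_count = sum(p.numel for p in params.values() if p.requires_grad)
total_count = sum(p.numel for p in params.values())
print(trainable_names, trainable_count, total_count)
```

The function relies only on a `requires_grad` attribute, so with PyTorch it should also accept the pairs yielded by `model.named_parameters()`. Only the parameters left trainable would then be handed to the optimizer.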

The code used to train these checkpoints can be downloaded at this Google Drive link.

Checkpoints

On HuggingFace:

On ModelScope: