Muennighoff / sgpt

SGPT: GPT Sentence Embeddings for Semantic Search
https://arxiv.org/abs/2202.08904
MIT License

Chinese support? #25

Open Lukangkang123 opened 1 year ago

Lukangkang123 commented 1 year ago

Does the model support Chinese input?

Muennighoff commented 1 year ago

Sure. For asymmetric search (e.g. retrieval), I'd recommend https://huggingface.co/bigscience/sgpt-bloom-7b1-msmarco, which saw a lot of Chinese text during pretraining.

Lukangkang123 commented 1 year ago

Thank you very much! Do you mean this code? https://github.com/Muennighoff/sgpt#asymmetric-semantic-search-be

Muennighoff commented 1 year ago

Yeah, you can use that code and swap the model name for bigscience/sgpt-bloom-7b1-msmarco.
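For readers following along, here is a rough sketch of what that swap might look like. It is not the repository's code: `embed` and `weighted_mean_pool` are hypothetical helper names, loading the checkpoint through `AutoModel` is an assumption, and the special bracket handling that the linked README applies to queries and documents is left out, so treat the README as the reference. The pooling shown is a position-weighted mean, where later tokens receive linearly larger weights.

```python
import math

# Model name taken from this thread; everything else below is illustrative.
MODEL_NAME = "bigscience/sgpt-bloom-7b1-msmarco"


def weighted_mean_pool(token_vectors):
    """Position-weighted mean: token i (1-based) gets weight i / sum of positions."""
    dim = len(token_vectors[0])
    pooled = [0.0] * dim
    total_weight = 0.0
    for position, vector in enumerate(token_vectors, start=1):
        total_weight += position
        for d in range(dim):
            pooled[d] += position * vector[d]
    return [value / total_weight for value in pooled]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embed(texts, model_name=MODEL_NAME):
    """Hypothetical helper: embed each text separately (no padding needed).

    Assumes torch and transformers are installed and that the checkpoint
    loads via AutoModel. The model has roughly 7B parameters, so this
    needs a lot of memory.
    """
    import torch  # imported lazily so the helpers above work without it
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    embeddings = []
    for text in texts:
        batch = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**batch).last_hidden_state[0]  # (tokens, dim)
        embeddings.append(weighted_mean_pool(hidden.tolist()))
    return embeddings


# Example usage (downloads the model, so it is left commented out):
# query_vec = embed(["如何学习机器学习？"])[0]
# doc_vecs = embed(["机器学习入门教程", "今天的天气预报"])
# scores = [cosine(query_vec, d) for d in doc_vecs]
```

Encoding one text at a time sidesteps padding, which would otherwise need an attention mask in the pooling step; batching would be faster but needs that extra care.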

Lukangkang123 commented 1 year ago

Thank you!