UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

SGPT support #1590

Closed Muennighoff closed 9 months ago

Muennighoff commented 2 years ago

Hey, would you merge a PR adding support for SGPT? I think I would just need to add a pooling method to Pooling.py. Currently, using SGPT via the HF Inference API fails with __init__() got an unexpected keyword argument 'pooling_mode_weightedmean_tokens', because of the missing pooling method. I could also add it as a separate library to the HF Inf API, but I think it'd be much simpler to add the pooling method to this library.
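For readers unfamiliar with the pooling method in question: a minimal sketch of position-weighted mean pooling, assuming it weights each token by its 1-based position and gives padding tokens zero weight. It uses plain Python lists for a single sequence; the function name `weighted_mean_pool` is hypothetical and this is not the library's implementation.

```python
def weighted_mean_pool(token_embeddings, attention_mask):
    """Position-weighted mean over one sequence of token vectors.

    token_embeddings: list of equal-length float lists, one per token.
    attention_mask:   list of 0/1 ints, 0 marking padding tokens.
    """
    dim = len(token_embeddings[0])
    pooled = [0.0] * dim
    weight_sum = 0.0
    for position, (vector, mask) in enumerate(
        zip(token_embeddings, attention_mask), start=1
    ):
        # Assumption: later tokens get linearly larger weights; padding gets zero.
        weight = position * mask
        weight_sum += weight
        for i, value in enumerate(vector):
            pooled[i] += weight * value
    if weight_sum == 0:
        raise ValueError("attention_mask masks out every token")
    return [value / weight_sum for value in pooled]


# Three tokens, the last one is padding and must not affect the result.
tokens = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(weighted_mean_pool(tokens, mask))
```

With weights 1 and 2 on the two real tokens, the second token contributes twice as much as the first, which is the point of this pooling for causal (GPT-style) models where later tokens have seen more context.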

nreimers commented 2 years ago

Yes, I would be happy to merge such a PR

sinaak commented 1 year ago

@Muennighoff @nreimers I still have the same issue with PyTorch 1.12.2, CUDA 10.2, and sentence-transformers 2.2.2. I'm trying to use:

from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Muennighoff/SGPT-125M-weightedmean-nli-bitfit")

Muennighoff commented 1 year ago

There has been no new version released since the merge, so you need to install from source via pip install --upgrade git+https://github.com/UKPLab/sentence-transformers.git. See also here for more information :)
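After installing from source, it can help to confirm which version the current interpreter actually sees. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical, and the only version number taken from this thread is 2.2.2, which was reported above as still raising the error.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if no such distribution is found."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# If this still prints 2.2.2, the source install did not take effect
# for the interpreter that is running this snippet.
print(installed_version("sentence-transformers"))
```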

ReadyPlayerEmma commented 1 year ago

@nreimers Awesome work on this project. Would it be possible to make a release now that this is merged into the main branch? It would be much appreciated not to have to install from Git every time. Perhaps some ongoing effort has delayed a new release?

redcodebluecode commented 1 year ago

@nreimers I installed the newest version with pip using the commands below and still get the same error that @Muennighoff mentioned earlier. Running model = SentenceTransformer("Muennighoff/SGPT-125M-weightedmean-nli-bitfit") fails with TypeError: __init__() got an unexpected keyword argument 'pooling_mode_weightedmean_tokens'. The pip commands I tried are:

!pip install git+https://github.com/UKPLab/sentence-transformers.git
!pip install git+https://github.com/Muennighoff/sentence-transformers.git@sgpt_poolings_specb
!pip install --upgrade git+https://github.com/UKPLab/sentence-transformers.git
!pip install -U sentence-transformers

Muennighoff commented 1 year ago

Hmm, can you try what's explained here: https://github.com/Muennighoff/sgpt/issues/14#issuecomment-1405205453

maheshpec commented 1 year ago

The code in Pooling.py has been present in the master branch since https://github.com/UKPLab/sentence-transformers/pull/1613 was merged last September. Is any help needed to release a version with the changes?
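One way to tell whether an installed copy already contains that change is to inspect the constructor signature directly. The sketch below is an illustration only: `accepts_keyword`, `OldPooling`, and `NewPooling` are hypothetical stand-ins, and their parameter lists are assumptions rather than copies of the library's class.

```python
import inspect


def accepts_keyword(cls, name):
    """True if cls.__init__ takes `name` explicitly or through **kwargs."""
    params = inspect.signature(cls.__init__).parameters
    if name in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-in for a Pooling class that predates the new mode.
class OldPooling:
    def __init__(self, word_embedding_dimension, pooling_mode_mean_tokens=True):
        pass


# Hypothetical stand-in for a Pooling class that includes the new mode.
class NewPooling:
    def __init__(self, word_embedding_dimension, pooling_mode_mean_tokens=True,
                 pooling_mode_weightedmean_tokens=False):
        pass


print(accepts_keyword(OldPooling, "pooling_mode_weightedmean_tokens"))  # False
print(accepts_keyword(NewPooling, "pooling_mode_weightedmean_tokens"))  # True
```

With the real library, the same helper could be pointed at the Pooling class imported from sentence_transformers.models. A False result would explain the TypeError reported in this thread, since the model's saved pooling configuration is passed to the constructor as keyword arguments.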

np-n commented 9 months ago

@Muennighoff, I got stuck on the same issue and fixed it by running your solution:

!pip install git+https://github.com/UKPLab/sentence-transformers.git
!pip install git+https://github.com/Muennighoff/sentence-transformers.git@sgpt_poolings_specb
!pip install --upgrade git+https://github.com/UKPLab/sentence-transformers.git
!pip install -U sentence-transformers

Thank you!

tomaarsen commented 9 months ago

Hello!

I intend to publish a new release soon, after which this should be fully resolved without the workaround. Until then, feel free to use the workaround! I'll close this, as SGPT support has been added.

zubairahmed-ai commented 7 months ago

@tomaarsen I'm still getting the error even after the latest update. Is there a fix?

tomaarsen commented 7 months ago

Hello @zubairahmed-ai,

(Related: https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1/discussions/7)

I'm unable to reproduce this with the latest version. Are you confident that you're using the latest version? What happens if you run this code?

from sentence_transformers import SentenceTransformer, __version__ as sentence_transformers_version

print(sentence_transformers_version)
model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
embeddings = model.encode(["The weather is nice", "It is sunny today"])
print(embeddings.shape)

One possibility is that your updated sentence-transformers is installed in a different Python installation than the one you use to run your LanceDB code.
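A quick way to check for that mismatch is to print which interpreter is running and where the package would be imported from. This is a minimal standard-library sketch; the helper name `describe_install` is hypothetical.

```python
import importlib.util
import sys


def describe_install(module_name):
    """Report the running interpreter and where module_name would be imported from."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        # None means this interpreter cannot find the module at all.
        "module_path": spec.origin if spec is not None else None,
    }


print(describe_install("sentence_transformers"))
```

Run it with the same interpreter that runs the failing application. If the reported module path sits under a different environment than the one that was upgraded, installing with python -m pip install -U sentence-transformers (using that same python) targets the right installation.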

zubairahmed-ai commented 7 months ago

@tomaarsen

This was indeed the case. Sorry for not updating, and thanks for looking into it.

tomaarsen commented 7 months ago

No worries! I'm glad you got it resolved :)