UKPLab / sentence-transformers

Multilingual Sentence & Image Embeddings with BERT
https://www.SBERT.net
Apache License 2.0

Error when loading INSTRUCTOR model #2474

Open Fitmavincent opened 5 months ago

Fitmavincent commented 5 months ago

When I try to load an INSTRUCTOR model, I get:

INSTRUCTOR._load_sbert_model() got an unexpected keyword argument 'token'

Error detail:

Traceback (most recent call last):
  File "D:\...\main.py", line 28, in <module>
    model = INSTRUCTOR('hkunlp/instructor-large')
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\...\venv\Lib\site-packages\sentence_transformers\SentenceTransformer.py", line 194, in __init__
    modules = self._load_sbert_model(
              ^^^^^^^^^^^^^^^^^^^^^^^
TypeError: INSTRUCTOR._load_sbert_model() got an unexpected keyword argument 'token'

I found an issue that was raised before, and it seems to have been resolved by downgrading sentence-transformers to 2.2.2: INSTRUCTOR._load_sbert_model() got an unexpected keyword argument 'token'

However, this does not seem to work for me as I'm on Python 3.12.2. When I downgraded to 2.2.2, I got the error "No module named distutils". Code that imports distutils no longer works from Python 3.12 onward.

Python environment detail:
Python 3.12.1
pip 24.0
torch 2.2.0
transformers 4.37.2

OS version: Win 10
NVCC version: Cuda compilation tools, release 12.1, V12.1.66, Build cuda_12.1.r12.1/compiler.32415258_0

tomaarsen commented 5 months ago

Hello!

INSTRUCTOR is indeed only compatible with Sentence Transformers 2.2.2, as it overrides some behind-the-scenes functionality that was updated in Sentence Transformers 2.3.0. Perhaps in the future I can try and support INSTRUCTOR models in Sentence Transformers directly.
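To illustrate the kind of breakage described above, here is a minimal sketch of how a subclass override can stop working when its base class starts passing a new keyword argument. The classes `BaseModelV22`, `BaseModelV23`, and `Override` are hypothetical stand-ins written for this example. They are not the actual Sentence Transformers or INSTRUCTOR sources, and only the method name `_load_sbert_model` and the `token` keyword are taken from the traceback.

```python
# Stand-in classes only: NOT the real sentence-transformers or INSTRUCTOR code.


class BaseModelV22:
    """Hypothetical base class shaped like the older release."""

    def __init__(self, name):
        # The older base calls the loader with a single argument.
        self.modules = self._load_sbert_model(name)

    def _load_sbert_model(self, name):
        return [f"module-for-{name}"]


class BaseModelV23:
    """Hypothetical base class shaped like the newer release."""

    def __init__(self, name, token=None):
        # The newer base forwards an extra keyword to the loader.
        self.modules = self._load_sbert_model(name, token=token)

    def _load_sbert_model(self, name, token=None):
        return [f"module-for-{name}"]


def make_subclass(base):
    """Build a subclass whose override was written against the older signature."""

    class Override(base):
        def _load_sbert_model(self, name):  # no `token` parameter
            return [f"custom-module-for-{name}"]

    return Override


def try_load(base):
    """Return 'ok' on success, or the TypeError message on failure."""
    try:
        make_subclass(base)("demo")
        return "ok"
    except TypeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_load(BaseModelV22))
    print(try_load(BaseModelV23))
```

With the older-style base the override loads normally. With the newer-style base, the call fails with a `TypeError` about the unexpected `token` keyword, which matches the shape of the error in the traceback.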

Fitmavincent commented 5 months ago

Hmm... it sounds like there's no workaround for this one. Thanks for your prompt reply on this issue.

tomaarsen commented 5 months ago

My recommendation would be to revert to Sentence Transformers 2.2.2 & try and use an older Python version. I don't believe there's a better solution available at this time.
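A small pre-flight check can flag the two constraints mentioned in this thread before any model is loaded. This is only a sketch. The helper names `parse`, `check_environment`, and `installed_version` are made up for this example. The 2.3.0 threshold comes from the maintainer's comment, and the Python 3.12 threshold comes from the `distutils` error reported above. Passing the check does not guarantee that a given older Python version works.

```python
import re
import sys
from importlib import metadata


def parse(version):
    """Turn a version string such as '2.2.2' into a tuple like (2, 2, 2)."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


def check_environment(python_version, st_version):
    """Return a list of problems for the INSTRUCTOR setup discussed in this thread.

    python_version: tuple such as (3, 12, 1)
    st_version: sentence-transformers version string such as '2.2.2'
    """
    problems = []
    # Assumption taken from the thread: the override broke starting with 2.3.0.
    if parse(st_version) >= (2, 3, 0):
        problems.append(
            "sentence-transformers >= 2.3.0: INSTRUCTOR's _load_sbert_model "
            "override does not accept the 'token' keyword"
        )
    # distutils is no longer part of the standard library from Python 3.12.
    if tuple(python_version[:2]) >= (3, 12):
        problems.append("Python >= 3.12: 'import distutils' fails")
    return problems


def installed_version(package="sentence-transformers"):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("sentence-transformers is not installed")
    else:
        for problem in check_environment(sys.version_info[:3], found):
            print("WARNING:", problem)
```

Called as `check_environment((3, 12, 1), "2.3.0")`, the function reports both problems. With a Python version below 3.12 and `"2.2.2"` it returns an empty list.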