xlang-ai / instructor-embedding

[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Apache License 2.0

Model not loading #106

Open laylabitar opened 10 months ago

laylabitar commented 10 months ago

Hello,

In a previous project, I used INSTRUCTOR embeddings in a function that calculates semantic similarity.

Coming back to this project, I find that I can no longer load the model. It fails with an error I hadn't come across before:


TypeError                                 Traceback (most recent call last)
in <cell line: 2>()
      1 from InstructorEmbedding import INSTRUCTOR
----> 2 model = INSTRUCTOR('hkunlp/instructor-base')

/usr/local/lib/python3.10/dist-packages/sentence_transformers/SentenceTransformer.py in __init__(self, model_name_or_path, modules, device, cache_folder, trust_remote_code, revision, token, use_auth_token)
    192
    193         if is_sentence_transformer_model(model_name_or_path, token, cache_folder=cache_folder, revision=revision):
--> 194             modules = self._load_sbert_model(
    195                 model_name_or_path,
    196                 token=token,

TypeError: INSTRUCTOR._load_sbert_model() got an unexpected keyword argument 'token'

I would appreciate the assistance on this issue!

laylabitar commented 10 months ago

Ah, solved it. It was a simple dependency issue with sentence-transformers (a new version was released 9 hours ago), so just make sure to pin the older one with pip install sentence_transformers==2.2.2
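
For anyone who wants to check their environment before hitting the TypeError, here is a minimal sketch. It assumes 2.2.2 is the last sentence-transformers version that works with the PyPI release of InstructorEmbedding, which is only what this thread reports. The helper names (`parse_release`, `is_known_good`, `check_installed`) are made up for illustration and are not part of either library.

```python
from importlib.metadata import PackageNotFoundError, version

# Last version reported as working in this thread.
# Assumption: anything newer triggers the 'token' TypeError.
LAST_KNOWN_GOOD = (2, 2, 2)


def parse_release(version_string):
    """Turn '2.2.2' into (2, 2, 2); trailing suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    # Pad short versions such as '2.3' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_known_good(version_string):
    """True if the version is at or below the last known-good release."""
    return parse_release(version_string) <= LAST_KNOWN_GOOD


def check_installed(package="sentence-transformers"):
    """Return a short human-readable status for the installed package."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    if is_known_good(installed):
        return f"{package} {installed} should work with INSTRUCTOR"
    return (
        f"{package} {installed} is newer than 2.2.2; "
        "pin it with: pip install sentence_transformers==2.2.2"
    )


if __name__ == "__main__":
    print(check_installed())
```

Running the script prints a one-line status. The comparison only looks at the first three numeric components, so it is a rough guard rather than a full version parser.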

racinmat commented 7 months ago

Same here. Supporting newer sentence-transformers would be great, because e.g. nomic needs at least sentence-transformers 2.3 for trust_remote_code.

rachithaiyappa commented 5 months ago

+1. Also, the instruction embeddings produced with sentence-transformers==3.0.1 are different from those produced with sentence-transformers==2.2.2. That makes sense, but I'm leaving it here for completeness.
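
To put a number on how far the embeddings drift between versions, one option is to save the vectors from each environment (for example as plain lists) and compare them offline. A minimal sketch in pure Python follows. The function names are hypothetical, and the two vectors in the usage block are placeholders, not real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def max_abs_difference(a, b):
    """Largest element-wise absolute difference between two vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    # Placeholder vectors standing in for the same sentence embedded
    # under two different sentence-transformers versions.
    old_embedding = [0.12, -0.40, 0.33]
    new_embedding = [0.10, -0.42, 0.35]
    print(cosine_similarity(old_embedding, new_embedding))
    print(max_abs_difference(old_embedding, new_embedding))
```

A cosine similarity very close to 1 would suggest the change is mostly numerical noise, while a clearly lower value would mean that embeddings stored under the old version should not be mixed with new ones.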

BBC-Esq commented 3 months ago

The owners of this repository don't seem to maintain it anymore, but you can try my fork instead, which is compatible with the newest version of sentence-transformers:

https://github.com/BBC-Esq/instructor-embedding

racinmat commented 3 months ago

Or you can use my fork from #115, which contains a few more additions: https://github.com/racinmat/instructor-embedding/tree/main

BBC-Esq commented 3 months ago

A fork of my fork! lol I like it. What else did you add?

racinmat commented 3 months ago

I reverted some renamings, so the class names are compatible with the version of instructor released on PyPI. I also added support for offline loading, so if the weights are present locally, it does not have to reach out to the internet.
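
As a rough illustration of the "use local weights if present" idea (not the fork's actual implementation), a caller could decide what to pass to the model constructor along these lines. The helper `resolve_model_source`, the directory naming scheme, and the check for `config.json` are all assumptions made for this sketch.

```python
from pathlib import Path


def resolve_model_source(model_name, local_root):
    """Return a local directory if it looks like a saved model, else the hub name.

    Assumed layout: <local_root>/<model name with '/' replaced by '_'>/config.json
    """
    candidate = Path(local_root) / model_name.replace("/", "_")
    if (candidate / "config.json").is_file():
        # Weights appear to be present locally, so no download is needed.
        return str(candidate)
    # Fall back to the hub identifier, which requires network access.
    return model_name


if __name__ == "__main__":
    print(resolve_model_source("hkunlp/instructor-base", "models"))
```

The returned string could then be passed as the first argument to INSTRUCTOR, assuming the class accepts a local directory the way the `model_name_or_path` parameter in the traceback above suggests.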