run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Bug]: InstructorEmbeddings results in error - "unexpected keyword argument 'token'" #11037

Closed RLinnae closed 1 month ago

RLinnae commented 4 months ago

Bug Description

I have tried multiple examples, including https://docs.llamaindex.ai/en/stable/examples/embeddings/huggingface.html#huggingfaceembedding and https://docs.llamaindex.ai/en/stable/examples/embeddings/custom_embeddings.html

Both result in the error:

TypeError: INSTRUCTOR._load_sbert_model() got an unexpected keyword argument 'token'

Version

10.7

Steps to Reproduce

from llama_index.embeddings.instructor import InstructorEmbedding

embed_model = InstructorEmbedding(model_name="hkunlp/instructor-base")

Relevant Logs/Tracebacks

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-27-3ead615fe4c9> in <cell line: 1>()
----> 1 embed_model = InstructorEmbeddings(embed_batch_size=2, token=5)
      2 
      3 Settings.embed_model = embed_model
      4 Settings.chunk_size = 512
      5 

1 frames
/usr/local/lib/python3.10/dist-packages/sentence_transformers/SentenceTransformer.py in __init__(self, model_name_or_path, modules, device, cache_folder, trust_remote_code, revision, token, use_auth_token)
    192 
    193             if is_sentence_transformer_model(model_name_or_path, token, cache_folder=cache_folder, revision=revision):
--> 194                 modules = self._load_sbert_model(
    195                     model_name_or_path,
    196                     token=token,

TypeError: INSTRUCTOR._load_sbert_model() got an unexpected keyword argument 'token'
dosubot[bot] commented 4 months ago

šŸ¤–

Hey @RLinnae! Great to see you back in our digital neck of the woods. šŸŒˆ

Based on the information provided, it seems like there's a misunderstanding about the usage of the InstructorEmbedding class. The InstructorEmbedding class does not accept a token argument during initialization. The parameters it accepts are model_name, query_instruction, text_instruction, embed_batch_size, cache_folder, device, and callback_manager.

The error message you're seeing is likely because a token argument is being passed to the InstructorEmbedding class, which it does not recognize.

Here's how you should be initializing the InstructorEmbedding class:

from llama_index.embeddings.instructor import InstructorEmbedding

embed_model = InstructorEmbedding(model_name="hkunlp/instructor-base")
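As a further illustration, here is a minimal sketch that also passes some of the other accepted parameters named above (query_instruction, text_instruction, embed_batch_size). The instruction strings are placeholder assumptions for illustration, not values taken from this issue:

from llama_index.embeddings.instructor import InstructorEmbedding

embed_model = InstructorEmbedding(
    model_name="hkunlp/instructor-base",
    # Hypothetical instruction strings, shown only to illustrate the parameters
    query_instruction="Represent the question for retrieving supporting documents:",
    text_instruction="Represent the document for retrieval:",
    embed_batch_size=2,
)

query_embedding = embed_model.get_query_embedding("What is an instructor embedding model?")
print(len(query_embedding))  # dimensionality of the returned embedding vector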

If you're trying to use a token for some other purpose (like authentication), could you please provide more details about what you're trying to achieve? That way, I can provide more targeted assistance.

For more information about the InstructorEmbedding class and its usage, you can refer to the source code.

Sources

#### About Dosu

This response is meant to be useful and save you time. It is **not meant to be a precise solution**, but rather a starting point for your own research. Help me be more useful! Please leave a šŸ‘ if this is helpful and šŸ‘Ž if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
RLinnae commented 4 months ago

Note: the issue is with the sentence-transformers package 2.3+. A quick fix was to use pip install sentence-transformers==2.2.2
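For reference, a minimal sketch of that workaround: pin sentence-transformers below 2.3, then construct the embedding model as in the docs. The pinned version comes from the note above; the version check and the sample text are just assumptions for convenience.

# Workaround: pin sentence-transformers before building the embedding model
# pip install sentence-transformers==2.2.2

import sentence_transformers
print(sentence_transformers.__version__)  # should print 2.2.2 after pinning

from llama_index.embeddings.instructor import InstructorEmbedding

embed_model = InstructorEmbedding(model_name="hkunlp/instructor-base")
text_embedding = embed_model.get_text_embedding("hello world")
print(len(text_embedding))  # dimensionality of the returned embedding vector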

dosubot[bot] commented 1 month ago

Hi, @RLinnae,

I'm helping the LlamaIndex team manage their backlog and am marking this issue as stale. From what I understand, you encountered a TypeError with InstructorEmbedding due to an unexpected 'token' argument. Dosubot explained that the InstructorEmbedding class does not accept a token argument during initialization and showed the correct way to initialize the class. You later noted that the issue was with the sentence-transformers package 2.3+ and that a quick fix was to use pip install sentence-transformers==2.2.2.

Could you please confirm if this issue is still relevant to the latest version of the LlamaIndex repository? If it is, please let the LlamaIndex team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding and cooperation. If you have any further questions or concerns, feel free to reach out.