chroma-core / chroma

the AI-native open-source embedding database
https://www.trychroma.com/
Apache License 2.0

[Bug]: #2355

Open ZephryLiang opened 2 weeks ago

ZephryLiang commented 2 weeks ago

What happened?

My code:

import os
from pathlib import Path

# Route outbound requests through a local proxy.
os.environ['HTTP_PROXY'] = 'http://127.0.0.1:7890'
os.environ['HTTPS_PROXY'] = 'http://127.0.0.1:7890'
# Point sentence-transformers at a model folder under the home directory.
os.environ["SENTENCE_TRANSFORMERS_HOME"] = Path.home().joinpath('embed_model', 'hkunlp/instructor-xl').as_uri()

import chromadb.utils.embedding_functions as embedding_functions

ef = embedding_functions.InstructorEmbeddingFunction(
    model_name="hkunlp/instructor-xl", device="cuda")

The error says:

File "/home/desir/PycharmProjects/pdf_parse/utlis/embed_func.py", line 9, in <module>
    ef = embedding_functions.InstructorEmbeddingFunction(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/desir/soft/anaconda3/envs/pdf_parse/lib/python3.11/site-packages/chromadb/utils/embedding_functions.py", line 363, in __init__
    self._model = INSTRUCTOR(model_name, device=device)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/desir/soft/anaconda3/envs/pdf_parse/lib/python3.11/site-packages/sentence_transformers/SentenceTransformer.py", line 197, in __init__
    modules = self._load_sbert_model(
              ^^^^^^^^^^^^^^^^^^^^^^^
TypeError: INSTRUCTOR._load_sbert_model() got an unexpected keyword argument 'token'

But token is a keyword argument used in that function, so why does this fail? Please help!

Versions

chroma-hnswlib 0.7.3
chromadb 0.5.0
InstructorEmbedding 1.0.1
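
The list above does not include the sentence-transformers version, which the traceback suggests is relevant. A minimal sketch for printing the installed versions of the packages involved, using only the standard library; the helper name installed_versions is made up for illustration, and the distribution names are assumed to match the names used with pip:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names are assumptions based on the packages named in this issue.
    packages = ["chromadb", "chroma-hnswlib", "InstructorEmbedding", "sentence-transformers"]
    for name, ver in installed_versions(packages).items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Running this in the same environment that raises the error shows which versions are actually being imported.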

Relevant log output

No response

tazarov commented 1 week ago

@LiangZeFenglzf, thanks for reporting this issue. It appears that changes in sentence-transformers (I tested with v2.7.0+) are breaking the InstructorEmbedding library.
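
The traceback shows the parent class constructor calling self._load_sbert_model(...) and the INSTRUCTOR override rejecting the token keyword. A minimal sketch of that kind of mismatch, using made-up stand-in classes (Base, OldSubclass, FixedSubclass) rather than the real library code: a subclass written against an older parent signature fails as soon as the parent starts passing a new keyword.

```python
class Base:
    """Hypothetical stand-in for a newer parent class that passes an extra keyword."""

    def __init__(self, name, token=None):
        # The parent calls the overridable hook with the new keyword argument.
        self.modules = self._load_model(name, token=token)

    def _load_model(self, name, token=None):
        return [f"base:{name}"]


class OldSubclass(Base):
    """Hypothetical subclass written against the older hook signature (no token)."""

    def _load_model(self, name):
        return [f"sub:{name}"]


class FixedSubclass(Base):
    """Same override, updated to accept the keyword the parent now passes."""

    def _load_model(self, name, token=None, **kwargs):
        return [f"sub:{name}"]


def try_build(cls):
    """Build an instance and return its modules, or the TypeError message."""
    try:
        return cls("demo").modules
    except TypeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_build(OldSubclass))    # error message mentioning 'token'
    print(try_build(FixedSubclass))  # list produced by the updated override
```

The keyword exists in the parent's version of the method, but Python dispatches to the subclass override, so the override's signature is the one that matters.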

ZephryLiang commented 1 week ago

You are so kind! How did you find this out?

tazarov commented 1 week ago

@LiangZeFenglzf, you can see the fix here: https://github.com/xlang-ai/instructor-embedding/pull/112/files. The project has adapted its code to the change in the SentenceTransformers library, but it has not released a new version of the Python package.

ZephryLiang commented 1 week ago

How can I install the fixed version? I tried three steps:

pip uninstall package
git clone package
pip install -e ./package

After that, the package was installed in my project's main folder, not in the site-packages folder. The error says:

Traceback (most recent call last):
  File "/home/desir/soft/anaconda3/envs/pdf_parse/lib/python3.11/site-packages/chromadb/utils/embedding_functions.py", line 358, in __init__
    from InstructorEmbedding import INSTRUCTOR
ImportError: cannot import name 'INSTRUCTOR' from 'InstructorEmbedding' (/home/desir/soft/anaconda3/envs/pdf_parse/lib/python3.11/site-packages/instructor-embedding/InstructorEmbedding/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/desir/PycharmProjects/pdf_parse/rag/embed_func/hl_func.py", line 7, in <module>
    ef = embedding_functions.InstructorEmbeddingFunction(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/desir/soft/anaconda3/envs/pdf_parse/lib/python3.11/site-packages/chromadb/utils/embedding_functions.py", line 360, in __init__
    raise ValueError(
ValueError: The InstructorEmbedding python package is not installed. Please install it with `pip install InstructorEmbedding`
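
In the traceback above, the ValueError saying the package "is not installed" is raised while handling the ImportError, so the package was found but the name INSTRUCTOR could not be imported from it. A small diagnostic sketch, using only the standard library, that reports where a module is resolved from and whether it exposes a given attribute; the helper name describe_import is made up for illustration:

```python
import importlib
import importlib.util


def describe_import(module_name, attr_name):
    """Report where a top-level module resolves from and whether it exposes attr_name."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        # Nothing with this name is on sys.path.
        return {"found": False, "origin": None, "has_attr": False, "error": None}
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # the import itself can fail, e.g. on a broken dependency
        return {"found": True, "origin": spec.origin, "has_attr": False, "error": repr(exc)}
    return {
        "found": True,
        "origin": spec.origin,
        "has_attr": hasattr(module, attr_name),
        "error": None,
    }


if __name__ == "__main__":
    # Module and attribute names are taken from the traceback in this issue.
    print(describe_import("InstructorEmbedding", "INSTRUCTOR"))
```

If origin points at an unexpected folder, or has_attr is False, the interpreter is picking up a different copy of the package than the one intended.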

tazarov commented 1 week ago

@LiangZeFenglzf, I realized the same when I first tested it yesterday. Let me have a closer look at the repo.

ZephryLiang commented 1 week ago

No problem, take your time.