h2oai / h2ogpt

Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://codellama.h2o.ai/
http://h2o.ai
Apache License 2.0

Getting "an unexpected keyword argument 'cache_folder'" during import #1713

Open CyberBearSec opened 1 week ago

CyberBearSec commented 1 week ago

I have the following code, into which I pass a JSON document. It keeps throwing the same error. I checked the JSON and it is valid. What am I doing wrong?

Error: `TypeError: SentenceTransformer.__init__() got an unexpected keyword argument 'cache_folder'`

I am using Chroma through LangChain.
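Since the error says the constructor does not accept `cache_folder`, one way to narrow this down is to check which `SentenceTransformer` class is actually being imported and whether its signature takes that keyword. The following is only a diagnostic sketch, not a fix; the helper names `accepts_keyword` and `report` are made up for illustration and are not part of LangChain, Chroma, or sentence-transformers.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind is not inspect.Parameter.POSITIONAL_ONLY:
        return True
    # A **kwargs parameter accepts any keyword
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def report(cls, name="cache_folder"):
    """Describe where `cls` was imported from and whether it accepts `name`."""
    module = inspect.getmodule(cls)
    location = getattr(module, "__file__", "<unknown>")
    accepted = accepts_keyword(cls, name)
    return f"{cls.__module__}.{cls.__qualname__} from {location}: accepts {name!r} = {accepted}"


# Usage (requires sentence-transformers to be installed):
#   from sentence_transformers import SentenceTransformer
#   print(report(SentenceTransformer))
```

If the report shows an unexpected file location, or shows that `cache_folder` is not accepted, that would point to the installed package rather than to the JSON input.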

```
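# Imports were not shown in the original post; these paths are an assumption
# and vary by LangChain version.
import os
import uuid

from langchain_community.embeddings import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_text_splitters import RecursiveJsonSplitter


# Only the function body was posted; this name and signature are placeholders.
def store_json(user_directory, database_name, collection_name, json_object, doc_ids=None):
    db_directory = os.path.join(user_directory, database_name + ".db")
    embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
    # Cosine will keep the similarity scores between zero and one
    chroma_db = Chroma(
        persist_directory=db_directory,
        collection_name=collection_name,
        embedding_function=embedding_function,
        collection_metadata={"hnsw:space": "cosine"},
        relevance_score_fn=lambda distance: 1.0 - distance / 2,
    )
    json_splitter = RecursiveJsonSplitter(max_chunk_size=2000)
    docs = json_splitter.create_documents(json_splitter.split_json(json_object))
    if doc_ids is None:
        doc_ids = [str(uuid.uuid4()) for _ in docs]
    else:
        # We look to see if the document exists:
        result = chroma_db.get(doc_ids)
        # get() returns a dict, so check its "ids" list rather than len(result)
        if result["ids"]:
            # This is an update:
            chroma_db.update_documents(doc_ids, docs)
            return doc_ids
    # add_documents writes to this store; from_documents is a classmethod that builds a new one
    chroma_db.add_documents(docs, ids=doc_ids)
    return doc_ids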