langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

PGVector - AttributeError: 'numpy.ndarray' object has no attribute 'embed_documents' #19504

Closed raulalhenacare closed 2 months ago

raulalhenacare commented 6 months ago

Example Code

from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from sentence_transformers import SentenceTransformer
from langchain_community.vectorstores.pgvector import PGVector

CONNECTION_STRING = 'postgresql+psycopg2://user:pass@localhost:5432/test_vector'
COLLECTION_NAME = 'art_of_war'

model = SentenceTransformer("all-MiniLM-L6-v2")

loader = TextLoader('./art_of_war.txt', encoding='utf-8')
documents = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=80)
texts = text_splitter.split_documents(documents)

normalized_texts = []
for text in texts:
    normalized_texts.append(text.page_content)

embeddings = model.encode(normalized_texts)

db = PGVector.from_documents(embedding=embeddings, documents=texts, collection_name=COLLECTION_NAME, connection_string=CONNECTION_STRING)

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "/Users/raul/Documents/ai/test_vectors/app.py", line 25, in <module>
    db = PGVector.from_documents(embedding=embeddings, documents=texts, collection_name=COLLECTION_NAME, connection_string=CONNECTION_STRING)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/raul/.pyenv/versions/3.11.8/lib/python3.11/site-packages/langchain_community/vectorstores/pgvector.py", line 1105, in from_documents
    return cls.from_texts(
           ^^^^^^^^^^^^^^^
  File "/Users/raul/.pyenv/versions/3.11.8/lib/python3.11/site-packages/langchain_community/vectorstores/pgvector.py", line 975, in from_texts
    embeddings = embedding.embed_documents(list(texts))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'numpy.ndarray' object has no attribute 'embed_documents'
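
The last frame of the traceback shows from_texts calling embedding.embed_documents(list(texts)), so whatever is passed as embedding needs to provide that method. A minimal sketch of a pre-flight check is below. The helper looks_like_embeddings and the DummyEmbeddings class are hypothetical names used only for illustration and are not part of LangChain; the nested list stands in for the array returned by model.encode, and the check for embed_query assumes the store will also be queried later.

```python
from typing import Any


def looks_like_embeddings(obj: Any) -> bool:
    """Return True if obj exposes the methods a vector store calls on its embedding argument."""
    # embed_documents is the call seen in the traceback; embed_query is
    # assumed here because similarity searches embed the query text.
    required = ("embed_documents", "embed_query")
    return all(callable(getattr(obj, name, None)) for name in required)


# Stand-in for precomputed vectors such as the output of model.encode(...)
precomputed_vectors = [[0.1, 0.2], [0.3, 0.4]]


class DummyEmbeddings:
    """Toy object with the expected methods, for illustration only."""

    def embed_documents(self, texts):
        return [[float(len(t))] for t in texts]

    def embed_query(self, text):
        return [float(len(text))]


print(looks_like_embeddings(precomputed_vectors))  # False: raw vectors have no embed_documents
print(looks_like_embeddings(DummyEmbeddings()))    # True
```

Running a check like this before PGVector.from_documents would surface the problem without needing a database connection.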

Description

I'm trying to use PGVector to connect to Postgres and store the embeddings, but the call to PGVector.from_documents fails with the AttributeError shown above.

System Info

langchain==0.1.13
langchain-cli==0.0.21
langchain-community==0.0.29
langchain-core==0.1.33
langchain-openai==0.1.1
langchain-text-splitters==0.0.1
macOS Sonoma 14.4 (23E214), M2
Python 3.11.8

zhoukewei9700 commented 6 months ago

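The embedding argument of PGVector.from_documents must be an Embeddings object, that is, one that provides embed_documents, not an array of precomputed vectors such as the output of model.encode. If you want to use a sentence-transformers model, the code might look like this: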

from langchain_community.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
db = PGVector.from_documents(embedding=embeddings, documents=texts, collection_name=COLLECTION_NAME, connection_string=CONNECTION_STRING)
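
If you would rather keep your own encoder than switch to HuggingFaceEmbeddings, a thin adapter is another option. The sketch below is an assumption-laden illustration and not LangChain's API: EncoderEmbeddings and toy_encode are hypothetical names, and toy_encode is only a stand-in so the example runs without downloading a model. It also assumes the vector store only needs the embed_documents and embed_query methods; subclassing langchain_core.embeddings.Embeddings is probably the safer route in real code.

```python
from typing import Callable, List, Sequence


class EncoderEmbeddings:
    """Hypothetical adapter exposing embed_documents/embed_query over an encode function."""

    def __init__(self, encode: Callable[[List[str]], Sequence[Sequence[float]]]):
        self._encode = encode

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Convert each row to a plain list of Python floats so the result
        # does not depend on the encoder's array type.
        return [[float(x) for x in vec] for vec in self._encode(list(texts))]

    def embed_query(self, text: str) -> List[float]:
        return self.embed_documents([text])[0]


def toy_encode(texts: List[str]) -> List[List[int]]:
    # Stand-in for model.encode: maps each text to (length, vowel count).
    return [[len(t), sum(c in "aeiou" for c in t.lower())] for t in texts]


embedder = EncoderEmbeddings(toy_encode)
vectors = embedder.embed_documents(["All warfare", "is based on deception"])
print(len(vectors), len(vectors[0]))
```

With the model from the original post, EncoderEmbeddings(model.encode) would be the analogous construction, and the resulting object, rather than the array of vectors, is what would be passed as embedding.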