langchain-ai / langchain-postgres

LangChain abstractions backed by Postgres Backend
MIT License

Error during retrieval #31

Closed (Sachin-Bhat closed 2 months ago)

Sachin-Bhat commented 2 months ago

I am getting an issue similar to #30, but during retrieval rather than indexing. I indexed with the old langchain-community vectorstore implementation, and I am retrieving with the langchain-postgres PGVector implementation. The database schema may have changed between the two. Is there a way to work around this?

Traceback (most recent call last):
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 1971, in _exec_single_context
    self.dialect.do_execute(
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\engine\default.py", line 919, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UndefinedColumn: column langchain_pg_embedding.id does not exist
LINE 1: SELECT langchain_pg_embedding.id AS langchain_pg_embedding_i...
               ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\src\rag_fusion\embeddings.py", line 28, in <module>
    asyncio.run(test())
  File "C:\Users\Sachin_Bhat\scoop\persist\rye\py\cpython@3.11.8\Lib\asyncio\runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\scoop\persist\rye\py\cpython@3.11.8\Lib\asyncio\runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\scoop\persist\rye\py\cpython@3.11.8\Lib\asyncio\base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\src\rag_fusion\embeddings.py", line 25, in test
    response = await retriever.aget_relevant_documents("lorem ipsum")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\langchain_core\retrievers.py", line 384, in aget_relevant_documents
    raise e
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\langchain_core\retrievers.py", line 377, in aget_relevant_documents
    result = await self._aget_relevant_documents(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\langchain_core\vectorstores.py", line 716, in _aget_relevant_documents
    docs = await self.vectorstore.asimilarity_search(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\langchain_core\vectorstores.py", line 403, in asimilarity_search
    return await run_in_executor(None, self.similarity_search, query, k=k, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\langchain_core\runnables\config.py", line 514, in run_in_executor
    return await asyncio.get_running_loop().run_in_executor(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\scoop\persist\rye\py\cpython@3.11.8\Lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\langchain_postgres\vectorstores.py", line 547, in similarity_search
    return self.similarity_search_by_vector(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\langchain_postgres\vectorstores.py", line 953, in similarity_search_by_vector
    docs_and_scores = self.similarity_search_with_score_by_vector(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\langchain_postgres\vectorstores.py", line 595, in similarity_search_with_score_by_vector
    results = self.__query_collection(embedding=embedding, k=k, filter=filter)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\langchain_postgres\vectorstores.py", line 931, in __query_collection
    .all()
     ^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\orm\query.py", line 2673, in all
    return self._iter().all()  # type: ignore
           ^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\orm\query.py", line 2827, in _iter
    result: Union[ScalarResult[_T], Result[_T]] = self.session.execute(
                                                  ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\orm\session.py", line 2306, in execute
    return self._execute_internal(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\orm\session.py", line 2191, in _execute_internal
    result: Result[Any] = compile_state_cls.orm_execute_statement(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\orm\context.py", line 293, in orm_execute_statement
    result = conn.execute(
             ^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 1422, in execute
    return meth(
           ^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\sql\elements.py", line 514, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 1644, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 1850, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 1990, in _exec_single_context
    self._handle_dbapi_exception(
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 2357, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 1971, in _exec_single_context
    self.dialect.do_execute(
  File "C:\Users\Sachin_Bhat\Documents\dev\package\rag-fusion\.venv\Lib\site-packages\sqlalchemy\engine\default.py", line 919, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedColumn) column langchain_pg_embedding.id does not exist
LINE 1: SELECT langchain_pg_embedding.id AS langchain_pg_embedding_i...
               ^

[SQL: SELECT langchain_pg_embedding.id AS langchain_pg_embedding_id, langchain_pg_embedding.collection_id AS langchain_pg_embedding_collection_id, langchain_pg_embedding.embedding AS langchain_pg_embedding_embedding, langchain_pg_embedding.document AS langchain_pg_embedding_document, langchain_pg_embedding.cmetadata AS langchain_pg_embedding_cmetadata, langchain_pg_embedding.embedding <=> %(embedding_1)s AS distance
FROM langchain_pg_embedding JOIN langchain_pg_collection ON langchain_pg_embedding.collection_id = langchain_pg_collection.uuid
WHERE langchain_pg_embedding.collection_id = %(collection_id_1)s::UUID ORDER BY distance ASC
 LIMIT %(param_1)s]
[parameters: {'embedding_1': '[-0.034000833,0.02298648,0.028531462,0.045393255,0.026792353,0.0124447085,0.023276329,-0.05630679,0.026212651,0.028758302,0.050030876,-0.013433984,0. ... (12440 characters truncated) ... 035916373,-0.014605992,-0.007996119,-0.032387745,-0.0018131468,-0.042066265,0.04504039,0.023805624,0.021259973,-0.033673175,0.028682688,-0.020919712]', 'collection_id_1': UUID('6de842c7-565c-42bf-964b-c55ecd578a96'), 'param_1': 4}]
(Background on this error at: https://sqlalche.me/e/20/f405)

Cheers, Sachin

eyurtsev commented 2 months ago

Closing this as a duplicate of: https://github.com/langchain-ai/langchain-postgres/issues/30
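
One way to check the schema-mismatch guess from the original report is to compare the columns the failing query asks for against the columns the table actually has. The sketch below is only an illustration, not part of langchain-postgres: it pulls the column names out of the SELECT statement quoted in the error and reports which ones are absent. The helper names and the `existing` list are hypothetical; in practice the real column list would come from the database, for example with `SELECT column_name FROM information_schema.columns WHERE table_name = 'langchain_pg_embedding'`.

```python
import re

# SELECT statement copied from the error message in this issue
# (the distance expression and WHERE clause are left out for brevity).
FAILING_SQL = (
    "SELECT langchain_pg_embedding.id AS langchain_pg_embedding_id, "
    "langchain_pg_embedding.collection_id AS langchain_pg_embedding_collection_id, "
    "langchain_pg_embedding.embedding AS langchain_pg_embedding_embedding, "
    "langchain_pg_embedding.document AS langchain_pg_embedding_document, "
    "langchain_pg_embedding.cmetadata AS langchain_pg_embedding_cmetadata "
    "FROM langchain_pg_embedding"
)


def columns_referenced(sql: str, table: str) -> list[str]:
    """Return the distinct column names referenced as `table.column`, in order."""
    pattern = re.compile(rf"\b{re.escape(table)}\.(\w+)")
    seen: list[str] = []
    for name in pattern.findall(sql):
        if name not in seen:
            seen.append(name)
    return seen


def missing_columns(expected: list[str], existing: list[str]) -> list[str]:
    """Return the expected columns that are not present in the existing table."""
    existing_set = set(existing)
    return [name for name in expected if name not in existing_set]


expected = columns_referenced(FAILING_SQL, "langchain_pg_embedding")

# ASSUMPTION: a made-up column list standing in for a table created by the
# older implementation. Replace it with the output of the
# information_schema query mentioned above.
existing = ["uuid", "collection_id", "embedding", "document", "cmetadata", "custom_id"]

print("query expects:", expected)
print("missing from table:", missing_columns(expected, existing))
```

If `id` shows up as missing, that matches the `UndefinedColumn` error above and points at the table having been created under a different schema than the one the new PGVector class queries.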