johnsonjsyuen closed 1 year ago
Pinecone is a vector database. PostgreSQL is not specifically designed for handling high-dimensional vector data, but I assume you could store vector embeddings as arrays or binary data. Searching and retrieving similar vectors won't be as efficient as with a dedicated vector database, though. Extensions like pgvector or pgroonga exist for exactly that, and if you don't want to touch the backend for upsert and query, you can expose Postgres through, say, stored functions via PostgREST.
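To make the efficiency point concrete, here is a minimal sketch of what similarity search looks like when embeddings are stored as plain arrays with no vector index. The `rows` list stands in for rows fetched from a table, and the names `cosine_similarity` and `top_k` are hypothetical helpers, not part of any library. Every query has to score every stored row, which is why this approach gets slow as the table grows.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Brute-force search: score every row, then keep the k best."""
    scored = [(row_id, cosine_similarity(query, emb)) for row_id, emb in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical rows, as if read from a table with an array column.
rows = [
    ("doc-a", [1.0, 0.0, 0.0]),
    ("doc-b", [0.9, 0.1, 0.0]),
    ("doc-c", [0.0, 1.0, 0.0]),
]

result = top_k([1.0, 0.0, 0.0], rows, k=2)
print(result)
```

This linear scan is the cost that a vector index (in an extension or in a dedicated vector database) is meant to avoid.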
Definitely possible, see https://supabase.com/docs/guides/database/extensions/pgvector, although Pinecone is more specialized for the task.
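For anyone who wants to try the pgvector route, a rough sketch of the SQL involved might look like the following. The table name `documents`, its columns, and the 3-dimensional `vector(3)` type are assumptions made up for illustration (the dimension should match whatever embedding model you use), and the `%s` placeholders assume a psycopg-style Python driver. `<=>` is pgvector's cosine distance operator. The helper `to_vector_literal` is hypothetical and only formats a Python list into the text form pgvector accepts.

```python
def to_vector_literal(embedding):
    """Format a list of numbers as a pgvector text literal, e.g. '[1.0,0.5,0.0]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


# One-time setup: enable the extension and create a table (names are assumptions).
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
    id bigserial PRIMARY KEY,
    content text,
    embedding vector(3)
);
"""

# Insert: pass the text chunk and the embedding as a vector literal.
INSERT_SQL = "INSERT INTO documents (content, embedding) VALUES (%s, %s::vector);"

# Query: order by cosine distance to the query embedding, nearest first.
QUERY_SQL = (
    "SELECT id, content FROM documents "
    "ORDER BY embedding <=> %s::vector LIMIT %s;"
)

# Parameters you would hand to the driver alongside QUERY_SQL.
query_params = (to_vector_literal([1, 0.5, 0]), 5)
print(query_params)
```

These statements are not executed here; they would be passed to a database connection, or wrapped in a stored function if you go the PostgREST route mentioned above.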
Hi, @johnsonjsyuen! I'm Dosu, and I'm helping the gpt4-pdf-chatbot-langchain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, you were asking if it is possible to use a different database system like Postgres or MySQL instead of Pinecone to reduce dependencies. sime2408 mentioned that while it is possible to store vector embeddings in Postgres using extensions like pgvector or pgroonga, it may not be as efficient as using a dedicated vector database. dalkommatt also confirmed that it is possible with Postgres, but Pinecone is more specialized for this task.
If this issue is still relevant to the latest version of the gpt4-pdf-chatbot-langchain repository, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your contribution, and please don't hesitate to reach out if you have any further questions or concerns!
Are there any updates regarding this topic?
Just wondering if it's possible to use something other than Pinecone, to reduce dependencies. Postgres has extensions for vector storage: https://github.com/pgvector/pgvector