langchain-ai / langchain-postgres

LangChain abstractions backed by Postgres Backend
MIT License

Support for the sparse embeddings #71

Open magaton opened 5 months ago

magaton commented 5 months ago

The latest pgvector version supports the sparsevec type. However, LangChain's PGVector supports only one embedding column in the langchain_pg_embedding table. It would be great to have a sparse_embedding column in that table and a matching sparse_embedding field in PGVector.
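For readers unfamiliar with sparsevec, here is a minimal sketch of how a mostly-zero vector maps to pgvector's sparsevec text literal, which has the form `{index:value,...}/dimensions` with 1-based indices. The helper name `to_sparsevec_literal` is hypothetical and is not part of langchain-postgres or pgvector; it only illustrates the kind of value the proposed sparse_embedding column would store.

```python
def to_sparsevec_literal(dense):
    """Format a dense list of numbers as a pgvector sparsevec text literal.

    Hypothetical helper for illustration only. Only non-zero entries are
    written, and indices are 1-based, as in pgvector's text format.
    """
    nonzero = [(i + 1, v) for i, v in enumerate(dense) if v != 0]
    body = ",".join(f"{i}:{v}" for i, v in nonzero)
    return "{" + body + "}/" + str(len(dense))


# A 5-dimensional vector with two non-zero entries.
print(to_sparsevec_literal([0.0, 1.5, 0.0, 0.0, 2.0]))  # {2:1.5,5:2.0}/5
```

A string like this can be cast to `sparsevec` in SQL, so a second column could hold it next to the existing dense embedding.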

I have considered the alternative and that is to have 2 PGVector stores, 1 for dense and 1 for sparse vectors. However there are 2 problems with that:

gecBurton commented 5 months ago

Hi @magaton, I would be interested in collaborating on this. I would also like some kind of combined full-text/dense search feature: https://github.com/langchain-ai/langchain-postgres/issues/61

Freezaa9 commented 1 month ago

Hello, I would also be interested.

But I think each vector DB should be separated. So for a hybrid search it would be

And then rerank the combined results using EnsembleRetriever (for example: https://python.langchain.com/docs/how_to/ensemble_retriever/).

To achieve this, we should also bump the version of the pgvector Python package: #82
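As a rough illustration of the reranking step: EnsembleRetriever merges the ranked results of several retrievers with weighted Reciprocal Rank Fusion. The sketch below reimplements that idea in plain Python so the scoring is visible. The function name `weighted_rrf`, the document ids, and the equal weights are assumptions made for the example, not the library's code.

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, weights, c=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores weight / (rank + c) per list it appears in,
    with rank starting at 1. Higher total score ranks first.
    """
    scores = defaultdict(float)
    for docs, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(docs, start=1):
            scores[doc_id] += weight / (rank + c)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical results from a dense store and a sparse store.
dense_hits = ["a", "b", "c"]
sparse_hits = ["b", "d", "a"]

fused = weighted_rrf([dense_hits, sparse_hits], weights=[0.5, 0.5])
print(fused)  # ['b', 'a', 'd', 'c']
```

Documents that both stores return ("a" and "b") rise to the top, which is the behaviour a dense plus sparse hybrid search is after.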