langchain-ai / langchain-postgres

LangChain abstractions backed by Postgres Backend
MIT License

hybrid search #61

Open gecBurton opened 6 months ago

gecBurton commented 6 months ago

Not an "issue", I know, but would it be possible to have a hybrid full-text/vector search, similar to https://www.alibabacloud.com/help/en/analyticdb-for-postgresql/user-guide/fusion-search-use-guide?

gecBurton commented 5 months ago

Any thoughts on this?

I am thinking something like:

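from sqlalchemy import desc, func

query = "who were the leading figures in the french revolution?"

h = 1  # weight of the full-text score: 0 = vector only, 1 = full-text only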

index_vector = func.to_tsvector("english", vectorstore.EmbeddingStore.document)
# plainto_tsquery ignores "|" and ANDs the terms; websearch_to_tsquery (PostgreSQL 11+) reads "or" as OR
search_vector = func.websearch_to_tsquery("english", " or ".join(query.split()))
fulltext_search = func.ts_rank(index_vector, search_vector)

embedding = embedder.embed_query(query)
vector_search = vectorstore.distance_strategy(embedding)

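# vector_search is a distance (lower is better) but ts_rank is a relevance (higher is better),
# so flip the distance before mixing; 1 - distance assumes the cosine strategy.
# The label stays "distance" since _results_to_docs_and_scores appears to read that attribute.
results = session.query(
    vectorstore.EmbeddingStore,
    ((1 - vector_search) * (1 - h) + fulltext_search * h).label("distance")
).order_by(desc("distance"))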

for doc, score in vectorstore._results_to_docs_and_scores(results):
    print(doc.page_content)
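
To sanity-check the weighting without a database, here is a minimal pure-Python sketch of the same linear blend. The `Hit` class, `hybrid_score`, `rank`, and the sample numbers are all made up for illustration; it assumes a cosine distance (so `1 - distance` is a similarity) and a ts_rank-style score where larger is better. The two signals are on different scales, so `h` would likely need tuning in practice.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    # Hypothetical container for one candidate row; not part of langchain-postgres.
    doc_id: str
    cosine_distance: float  # 0 = same direction, larger = less similar
    text_rank: float        # e.g. a ts_rank value, larger = better match


def hybrid_score(hit: Hit, h: float) -> float:
    """Blend both signals so that a larger score always means a better match."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("h must be between 0 and 1")
    similarity = 1.0 - hit.cosine_distance  # flip distance into similarity
    return (1.0 - h) * similarity + h * hit.text_rank


def rank(hits: list[Hit], h: float) -> list[str]:
    """Return document ids ordered best-first for weight h."""
    ordered = sorted(hits, key=lambda hit: hybrid_score(hit, h), reverse=True)
    return [hit.doc_id for hit in ordered]


# Made-up sample values, chosen so the two signals disagree.
hits = [
    Hit("a", cosine_distance=0.10, text_rank=0.00),
    Hit("b", cosine_distance=0.40, text_rank=0.09),
    Hit("c", cosine_distance=0.80, text_rank=0.30),
]

if __name__ == "__main__":
    for weight in (0.0, 0.5, 1.0):
        print(weight, rank(hits, weight))
```

With `h = 0` the order follows the vector similarity alone, and with `h = 1` it follows the full-text rank alone, which is the behaviour the blended query is meant to interpolate between.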

If this is of interest, I'll raise a PR.