langchain-ai / langchainjs

🦜🔗 Build context-aware reasoning applications 🦜🔗
https://js.langchain.com/docs/
MIT License
12.43k stars, 2.1k forks

Allow multiple metadata keys on RedisVectorStoreFilterType #5015

Closed: mauriciocirelli closed this issue 2 months ago

mauriciocirelli commented 6 months ago

Example Code


import { Document } from "@langchain/core/documents";

// Add a sample document with some metadata fields
// (vectorStore is an existing RedisVectorStore instance):
await vectorStore.addDocuments([
  new Document({ pageContent: "Page Content!", metadata: { x: "y", z: 1, w: ["a", "b", "c"] } }),
]);

// The object filters below are the proposed behavior, not the current API.

// Narrow results with one metadata field:
await vectorStore.similaritySearchWithScore("test string", 5, { x: "y" });

// Narrow results with two metadata fields (AND):
await vectorStore.similaritySearchWithScore("test string", 5, { x: "y", z: 1 });

// Narrow results to documents whose metadata list field contains an element:
await vectorStore.similaritySearchWithScore("test string", 5, { w: "a" });

Error Message and Stack Trace (if applicable)

No response

Description

This is an issue created from the following discussion:

How to use RedisVectorStoreFilterType?

The current implementation of RedisVectorStore saves the Document's metadata object as a single JSON string in the metadata-key field.

Because the metadata is one opaque string, it is impossible to filter on multiple metadata fields independently.

To make RedisVectorStore more compatible with the Document metadata object, we need to change how metadata fields are stored and filtered.
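One possible storage change, sketched here only as an illustration: write each metadata key as its own field instead of one JSON string. The `flattenMetadata` helper is hypothetical and not part of langchainjs, and it assumes that list values would be indexed as TAG fields using a comma separator.

```typescript
// Hypothetical helper, not part of langchainjs.
type FlatValue = string | number;

// Turn a metadata object into flat field/value pairs so that each key
// can be indexed and filtered on its own.
function flattenMetadata(
  metadata: Record<string, string | number | string[]>
): Record<string, FlatValue> {
  const flat: Record<string, FlatValue> = {};
  for (const [key, value] of Object.entries(metadata)) {
    // Assumption: list values are joined with "," so they can be split
    // back into individual tags by a TAG field.
    flat[key] = Array.isArray(value) ? value.join(",") : value;
  }
  return flat;
}
```

Values that themselves contain commas would need a different separator or escaping, which this sketch does not handle.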

A use case for this is a vector store of books, where metadata fields such as author (string), year (number), and tags (list of strings) would narrow the vector search and improve its results.
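For the book example, a metadata filter object could be translated into a Redis search expression roughly as follows. This is a minimal sketch: `MetadataFilter`, `escapeTag`, and `buildRedisFilter` are hypothetical names, and it assumes string and list fields are indexed as TAG and number fields as NUMERIC.

```typescript
// Hypothetical filter shape, not the current RedisVectorStoreFilterType.
type MetadataFilter = Record<string, string | number | string[]>;

// Backslash-escape every character other than letters, digits and "_",
// since punctuation and spaces are special inside TAG values.
function escapeTag(value: string): string {
  return value.replace(/[^A-Za-z0-9_]/g, "\\$&");
}

// Build one clause per metadata key; clauses separated by a space are ANDed.
function buildRedisFilter(filter: MetadataFilter): string {
  const clauses = Object.entries(filter).map(([field, value]) => {
    if (typeof value === "number") {
      // Exact numeric match expressed as a closed range.
      return `@${field}:[${value} ${value}]`;
    }
    // A list of strings matches documents having any of the given tags.
    const tags = Array.isArray(value) ? value : [value];
    return `@${field}:{${tags.map(escapeTag).join(" | ")}}`;
  });
  // "*" matches every document when no filter is given.
  return clauses.length > 0 ? clauses.join(" ") : "*";
}
```

With this sketch, `{ author: "Tolkien", year: 1954, tags: ["fantasy"] }` becomes `@author:{Tolkien} @year:[1954 1954] @tags:{fantasy}`.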

We can achieve this with the MemoryVectorStore filter function, and I am pretty sure we can do the same with Redis.
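For comparison, the filtering semantics from the example code (equality, AND across keys, and list membership) can be written as a predicate of the kind MemoryVectorStore accepts as its filter. The sketch below uses a local `Doc` type so it stands alone, and `matchesMetadata` is a hypothetical helper name.

```typescript
// Local stand-in for the Document shape, so the sketch is self-contained.
type Metadata = Record<string, unknown>;
interface Doc {
  pageContent: string;
  metadata: Metadata;
}

// Build a predicate that is true when every key in `filter` matches.
// A list-valued metadata field matches if it contains the expected value.
function matchesMetadata(filter: Metadata): (doc: Doc) => boolean {
  return (doc) =>
    Object.entries(filter).every(([key, expected]) => {
      const actual = doc.metadata[key];
      return Array.isArray(actual)
        ? actual.includes(expected)
        : actual === expected;
    });
}
```

A predicate such as `matchesMetadata({ x: "y", z: 1 })` could then be passed as the third argument of a MemoryVectorStore similarity search.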

I can work on a PR for this if this interests the community.

Thank you.

System Info

langchain@0.1.31 | MIT | deps: 17 | versions: 262

mauriciocirelli commented 6 months ago

Perhaps an easier (and yet more flexible) way to do the filtering is to pass a raw Redis search string as the filter:


vectorStore.similaritySearchWithScore("test string", 5, "@x:(y)")
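A raw filter string like the one above could be used as the pre-filter part of a Redis KNN query. The following is only a sketch of that composition: `buildKnnQuery` is a hypothetical helper, and the `content_vector` field name, `$vector` parameter name, and `vector_score` alias are assumptions for illustration.

```typescript
// Hypothetical helper: wrap an optional raw filter string into a
// pre-filtered KNN query of the form "(<filter>)=>[KNN k @field $param AS alias]".
function buildKnnQuery(
  k: number,
  filter?: string,
  vectorField = "content_vector" // assumed vector field name
): string {
  // Fall back to "*" (match all documents) when no filter is provided.
  const prefilter = filter && filter.trim() !== "" ? `(${filter})` : "*";
  return `${prefilter}=>[KNN ${k} @${vectorField} $vector AS vector_score]`;
}
```

The trade-off is that the caller becomes responsible for writing valid Redis search syntax, including any escaping of values.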