redis-developer / redis-ai-resources

✨ A curated list of awesome community resources, integrations, and examples of Redis in the AI ecosystem.
MIT License

Contribute advanced Hybrid search example in OpenAI Cookbook (python) #7

Open tylerhutcherson opened 1 year ago

tylerhutcherson commented 1 year ago

The existing cookbook just touches the surface: https://github.com/openai/openai-cookbook/blob/main/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb

Contribute a Python notebook that demonstrates complex Hybrid queries with Redis VSS and other search features (an ecommerce dataset might work nicely) including

michaelskyuan commented 1 year ago

Submitted PR

tylerhutcherson commented 1 year ago

Initial review submitted from our end >>> https://github.com/openai/openai-cookbook/pull/417

@michaelskyuan at some point we will also want to update this notebook to cover bullet point 4 above. This is a bit "green field" in the sense that we have not yet explicitly tried it. But it's theoretically possible to do true weighted hybrid search using a Redis pipeline command and "merging" results from the two scoring algorithms (BM25 + KNN/CosineD). I sense the lift will be a bit more on this, and since it is not immediately pressing, I will spin it off into a separate issue that we can re-prioritize when the time is right, probably in the next month.
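To make the "merging" idea concrete, here is a minimal sketch of a weighted merge, assuming the BM25 results and the KNN results have already been fetched as two separate lists. Nothing here comes from the notebook or from a Redis client: the `ScoredDoc` shape, the function names, the min-max normalization, and the `alpha` weight are all illustrative assumptions, and a document missing from one list is simply treated as scoring 0 on that side.

```typescript
// Hypothetical result shape for illustration; not a type from any Redis client.
interface ScoredDoc {
  id: string;
  score: number;
}

// Min-max normalize scores into [0, 1]. If all scores are equal, each maps to 1.
function minMaxNormalize(docs: ScoredDoc[]): Map<string, number> {
  const out = new Map<string, number>();
  if (docs.length === 0) return out;
  const scores = docs.map((d) => d.score);
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  for (const d of docs) {
    out.set(d.id, max === min ? 1 : (d.score - min) / (max - min));
  }
  return out;
}

// Combine lexical (BM25, higher is better) and vector (cosine distance,
// lower is better) results. alpha weights the lexical side.
function fuseScores(
  lexical: ScoredDoc[],
  vector: ScoredDoc[],
  alpha: number = 0.5
): ScoredDoc[] {
  const lex = minMaxNormalize(lexical);
  // Convert distance to similarity so that "higher is better" on both sides.
  const vec = minMaxNormalize(
    vector.map((d) => ({ id: d.id, score: 1 - d.score }))
  );
  const ids = Array.from(
    new Set(Array.from(lex.keys()).concat(Array.from(vec.keys())))
  );
  const fused: ScoredDoc[] = ids.map((id) => ({
    id,
    // A document absent from one list contributes 0 for that side (assumption).
    score: alpha * (lex.get(id) ?? 0) + (1 - alpha) * (vec.get(id) ?? 0),
  }));
  return fused.sort((a, b) => b.score - a.score);
}
```

Min-max normalization is only one option; rank-based fusion would avoid the sensitivity to outlier scores, at the cost of discarding score magnitudes.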

michaelskyuan commented 1 year ago

I agree @tylerhutcherson. And I believe this topic deserves its own separate notebook with a denser text dataset. Let's leave OOTB Redis Hybrid search functionality on the current notebook and have a specific notebook that will address normalization of lexical and semantic scoring using a more appropriate dataset instead of an ecommerce dataset.

Spartee commented 1 year ago

@michaelskyuan This App/notebook was recently contributed by OpenAI. We could use some of this.

ladrians commented 3 months ago

I am struggling to apply a hybrid filter similar to these examples. The difference is that I am using langchainjs 0.2.16 from TypeScript with the standard bindings, against RediSearch 2.6 in the cloud.

I have the following metadata associated with a chunk item

{\\\"name\\\"\\:\\\"20030331\\\",\\\"description\\\"\\:\\\"2003/03/31\\\",\\\"year\\\"\\:2003,\\\"month\\\"\\:3,\\\"day\\\"\\:31,\\\"date\\\"\\:20030331,\\\"id\\\"\\:\\\"5251cc41\\-307d\\-4117\\-9b0f\\-9e408eb37011\\\",\\\"doc_id\\\"\\:\\\"5251cc41\\-307d\\-4117\\-9b0f\\-9e408eb37011\\\"}

I would like to filter on a range for the year, for example something like the following:

@metadata:(\\\"year\\\"\\:[(2001 2004])

but it always returns 0 records. If I use an exact match or a negation, the query works. I started with

filterQuery = `@metadata:(-\\\"year\\\"\\:2001)`;
itemList = await client.ft.search(chunk, filterQuery);
console.log(filterQuery, itemList.total);

OK, 3 items returned

@metadata:(-\"year\"\:2001) 3

Then I started trying with a range

filterQuery = `@metadata:(\\\"year\\\"\\:\\[2003 2005\\])`;
itemList = await client.ft.search(chunk, filterQuery);
console.log(filterQuery, itemList.total);

This should return elements, but I get

@metadata:(\"year\"\:\[2003 2005\]) 0

Very similar to the previous one

filterQuery = `@metadata:(\\\"year\\\"\\:\\[\\(2003 2005\\])`;
itemList = await client.ft.search(chunk, filterQuery);
console.log(filterQuery, itemList.total);

This should return elements too, but I still get 0

@metadata:(\"year\"\:\[\(2003 2005\]) 0

Trying other filters and combining them with AND (&) worked fine:

filterQuery = `@metadata:(\\\"year\\\"\\:2003&\\\"date\\\"\\:20030331)`;
itemList = await client.ft.search(chunk, filterQuery);
console.log(filterQuery, itemList.total);

OK, 1 item filtered

@metadata:(\"year\"\:2003&\"date\"\:20030331) 1

I got some ideas from the referenced notebook but still cannot make it work. Any ideas?

thanks in advance!
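One possible explanation, offered as an assumption since the index schema is not shown in this thread: if `metadata` is indexed as a single TEXT field holding the escaped JSON, the numeric range syntax (`@field:[min max]`) does not apply to it, because that syntax targets fields indexed as NUMERIC. The escaped brackets are then matched as literal characters, which would explain the 0 results. Under that assumption, a workaround that stays within what already works above (exact match) is to expand a small integer range into an OR of exact terms. The helper name `buildYearRangeFilter` is hypothetical.

```typescript
// Hypothetical helper: expands an inclusive integer range into an OR (|) of
// exact-match terms, using the same escaping as the working exact-match
// queries in this thread. Assumes the field is a TEXT field.
function buildYearRangeFilter(
  field: string,
  key: string,
  from: number,
  to: number
): string {
  if (!Number.isInteger(from) || !Number.isInteger(to) || from > to) {
    throw new Error("expected integer bounds with from <= to");
  }
  const terms: string[] = [];
  for (let value = from; value <= to; value++) {
    // Produces \"key\"\:value, e.g. \"year\"\:2003
    terms.push(`\\"${key}\\"\\:${value}`);
  }
  return `@${field}:(${terms.join("|")})`;
}

// Usage: build the filter, then pass it to the same search call used above.
const yearFilter = buildYearRangeFilter("metadata", "year", 2003, 2005);
console.log(yearFilter);
```

This only scales to short ranges. For wide ranges, indexing the year as its own NUMERIC field and querying it with the range syntax is likely the cleaner route, if the index definition can be changed.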