langchain-ai / langchain-google

Discussion: How to implement complex filters for BigQueryVectorStore #434

Closed: Freezaa9 closed this 1 week ago

Freezaa9 commented 1 month ago

Hi,

It is currently not possible to use complex filters with BigQueryVectorStore. By complex filters I mean ones that use OR, >, >=, etc.

I'm not sure what a clean implementation would look like if we keep a dict for the filter argument.

Should we follow what is done for Vertex AI Vector Search and use namespaces? See https://python.langchain.com/v0.2/docs/integrations/vectorstores/google_vertex_ai_vector_search/ and https://cloud.google.com/vertex-ai/docs/vector-search/filtering?hl=fr#json

What would be the best solution?
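
To make the limitation concrete, here is a minimal sketch of how an equality-only dict filter might be turned into a SQL condition. `dict_to_condition` is a hypothetical helper written for this illustration, not code from BigQueryVectorStore, and the column names are made up.

```python
from typing import Any, Dict


def dict_to_condition(filter_dict: Dict[str, Any]) -> str:
    """Hypothetical helper: AND together one equality test per key."""
    clauses = []
    for key, value in filter_dict.items():
        # Naive quoting for illustration only; real code should use
        # query parameters instead of string interpolation.
        literal = f"'{value}'" if isinstance(value, str) else str(value)
        clauses.append(f"{key} = {literal}")
    return " AND ".join(clauses)


condition = dict_to_condition({"category": "news", "year": 2024})
print(condition)  # category = 'news' AND year = 2024
```

With this shape, each key can only mean "equals" and the keys can only be combined with AND, so there is no natural place to express OR, >, or >=.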

lkuligin commented 1 month ago

Namespaces are supported by the SDK.

For BQ, I'd rather follow the approach of PostgresVectorStore (pass filters that are already SQL clauses): https://github.com/googleapis/langchain-google-cloud-sql-pg-python/blob/main/src/langchain_google_cloud_sql_pg/vectorstore.py

@eliasecchig @kurtisvg any thoughts?

kurtisvg commented 1 month ago

Agreed that passing in filters as SQL clauses is preferred; it lets users take full advantage of the expressions in the language.
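
As a rough sketch of the approach discussed above (not the code from the PR), a filter argument could accept either a dict or a ready-made SQL clause. `build_where` and the column names are hypothetical and exist only for this illustration.

```python
from typing import Any, Dict, Optional, Union


def build_where(filter: Optional[Union[Dict[str, Any], str]] = None) -> str:
    """Hypothetical helper: build a WHERE clause from a dict or a SQL string."""
    if filter is None:
        return ""
    if isinstance(filter, str):
        # The caller supplies a SQL condition, so OR, >, >= and any other
        # expression the SQL dialect supports are available unchanged.
        return f"WHERE {filter}"
    # Dict filters keep the simple behaviour: equality tests joined by AND.
    # Naive quoting for illustration only; prefer query parameters in real code.
    clauses = [
        f"{key} = '{value}'" if isinstance(value, str) else f"{key} = {value}"
        for key, value in filter.items()
    ]
    return "WHERE " + " AND ".join(clauses) if clauses else ""


print(build_where("year >= 2020 OR category = 'news'"))
print(build_where({"category": "news", "year": 2024}))
```

One trade-off to keep in mind: a raw SQL string is inserted into the query as-is, so it should only come from trusted code and never directly from end-user input.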

Freezaa9 commented 4 weeks ago

PR: https://github.com/langchain-ai/langchain-google/pull/448

Do you approve of this type of implementation? If so, I will continue by cleaning up the code and adding tests and documentation.

Freezaa9 commented 2 weeks ago

Hey, I have completed the PR and added tests. Thanks in advance!