milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Enhancement]: Post filter execution framework #37360

Open chasingegg opened 2 hours ago

chasingegg commented 2 hours ago

What would you like to be added?

Currently, Milvus handles filtered search in a pre-filter manner: it first executes scalar filtering on the segment data and generates a bitset that marks whether each row is filtered out, then the Knowhere vector engine takes this bitset together with the query vector, runs the filtered vector search, and returns the final result. With a very complex filter condition, the cost of scalar filtering can dominate the overall search time.

We propose a post-filter mode: the vector search produces results continuously, and scalar filtering is executed only on those results instead of on the whole segment. This could be a better approach when filter selectivity is not too low, i.e. when enough rows pass the filter for the search to fill its top-k quickly.
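To make the difference concrete, here is a minimal sketch of the two execution orders in plain Python. It is not Milvus or Knowhere code: `pre_filter_search`, `post_filter_search`, and `iterate_by_distance` are hypothetical names, and the brute-force sort only stands in for an index that can return neighbors incrementally. The point is how many rows the scalar predicate has to touch in each mode.

```python
import heapq
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Vector = Sequence[float]
Row = Dict[str, float]


def l2(a: Vector, b: Vector) -> float:
    """Squared L2 distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pre_filter_search(vectors: List[Vector], rows: List[Row],
                      predicate: Callable[[Row], bool],
                      query: Vector, k: int) -> List[int]:
    # Pre-filter: evaluate the scalar predicate on EVERY row to build a bitset,
    # then search only among the rows whose bit is set.
    bitset = [predicate(row) for row in rows]
    candidates = ((l2(v, query), i) for i, v in enumerate(vectors) if bitset[i])
    return [i for _, i in heapq.nsmallest(k, candidates)]


def iterate_by_distance(vectors: List[Vector], query: Vector) -> Iterator[int]:
    # Hypothetical stand-in for a vector index iterator that yields row ids
    # in order of increasing distance. A real index would do this lazily.
    for _, i in sorted((l2(v, query), i) for i, v in enumerate(vectors)):
        yield i


def post_filter_search(vectors: List[Vector], rows: List[Row],
                       predicate: Callable[[Row], bool],
                       query: Vector, k: int) -> Tuple[List[int], int]:
    # Post-filter: pull neighbors one by one and evaluate the predicate only on
    # the rows the vector search actually returns; stop once k rows pass.
    hits: List[int] = []
    evaluated = 0
    for i in iterate_by_distance(vectors, query):
        evaluated += 1
        if predicate(rows[i]):
            hits.append(i)
            if len(hits) == k:
                break
    return hits, evaluated


if __name__ == "__main__":
    vectors = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
    rows = [{"price": p} for p in [10, 50, 20, 60, 70, 5]]
    expensive = lambda row: row["price"] >= 20

    print(pre_filter_search(vectors, rows, expensive, [0.0], 2))   # [1, 2]
    print(post_filter_search(vectors, rows, expensive, [0.0], 2))  # ([1, 2], 3)
```

In this toy data both modes return the same rows, but the pre-filter path evaluates the predicate on all 6 rows while the post-filter path evaluates it on 3. If very few rows pass the filter, the post-filter loop has to walk much further through the neighbor stream, which is why the approach depends on filter selectivity.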

Why is this needed?

No response

Anything else?

No response

chasingegg commented 2 hours ago

/assign