Is there an existing issue for this?
What would you like to be added?
Milvus currently handles filtered search with a pre-filter strategy: it first runs scalar filtering over the segment data and builds a bitset that marks, for each row, whether it is filtered out. The Knowhere vector engine then takes this bitset together with the query vector, runs the filtered vector search, and returns the final result.

When the filter condition is very complex, the cost of scalar filtering dominates the overall search time. We propose a post-filter strategy: the vector search produces results incrementally, and scalar filtering is executed only on those candidates instead of on the whole segment. This could be a better approach when filter selectivity is not too low, that is, when enough rows pass the filter that the search does not have to walk through many candidates to collect the requested number of results.
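To make the difference concrete, here is a minimal sketch of the two strategies in plain Python. It is not Milvus or Knowhere code: the function names (`pre_filter_search`, `post_filter_search`, `nearest_iterator`), the row layout, and the brute-force distance computation are all assumptions made for illustration. In particular, `nearest_iterator` stands in for an index that can yield neighbors incrementally; here it simply sorts every vector by distance.

```python
import heapq
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


def l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared L2 distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pre_filter_search(vectors, rows, query, predicate: Predicate, k: int) -> Tuple[List[int], int]:
    """Pre-filter: evaluate the predicate on every row, then search the survivors."""
    # Hypothetical bitset: True means the row passes the filter.
    bitset = [predicate(row) for row in rows]
    candidates = ((l2(v, query), i) for i, v in enumerate(vectors) if bitset[i])
    top = heapq.nsmallest(k, candidates)
    # The predicate was evaluated once per row in the segment.
    return [i for _, i in top], len(rows)


def nearest_iterator(vectors, query) -> Iterator[int]:
    """Stand-in for an incremental vector search: yields row ids nearest first."""
    for _, i in sorted((l2(v, query), i) for i, v in enumerate(vectors)):
        yield i


def post_filter_search(vectors, rows, query, predicate: Predicate, k: int) -> Tuple[List[int], int]:
    """Post-filter: pull neighbors one by one and filter only those candidates."""
    hits: List[int] = []
    evaluations = 0
    for i in nearest_iterator(vectors, query):
        evaluations += 1
        if predicate(rows[i]):
            hits.append(i)
            if len(hits) == k:
                break
    return hits, evaluations


# Toy segment: 100 one-dimensional vectors with a scalar field "price".
vectors = [[float(i)] for i in range(100)]
rows = [{"price": i} for i in range(100)]
is_even = lambda row: row["price"] % 2 == 0  # half of the rows pass

pre_ids, pre_evals = pre_filter_search(vectors, rows, [10.2], is_even, k=3)
post_ids, post_evals = post_filter_search(vectors, rows, [10.2], is_even, k=3)
```

In this toy run both strategies return the same ids, but the pre-filter path evaluates the predicate on all 100 rows while the post-filter path evaluates it on only 5 candidates. If the predicate matched very few rows, the post-filter loop would have to pull many more neighbors before finding `k` hits, which is why the proposal is limited to filters whose selectivity is not too low.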
Why is this needed?
No response
Anything else?
No response