milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Enhancement]: Speed up Search Iterator and Cleanup SDKs #37548

Open PwzXxm opened 2 weeks ago

PwzXxm commented 2 weeks ago

What would you like to be added?

Speed up search iterators by using the iterator directly rather than calling range search and maintaining radius and range_filter in the SDK, which adds complexity to SDK implementations.
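To illustrate the bookkeeping this pushes onto each SDK, here is a minimal sketch of an iterator emulated with repeated range searches. It is not the actual SDK code: `range_search`, `RangeSearchIterator`, the radius-doubling policy, and the assumed L2-style semantics (`range_filter <= distance < radius`) are all hypothetical stand-ins.

```python
from typing import List, Set, Tuple

Hit = Tuple[int, float]  # (primary key, distance); smaller distance = closer


def range_search(distances: List[float], radius: float,
                 range_filter: float, limit: int) -> List[Hit]:
    """Hypothetical stand-in for a server-side range search.

    Assumes an L2-style metric: returns up to `limit` nearest hits with
    range_filter <= distance < radius.
    """
    hits = [(pk, d) for pk, d in enumerate(distances) if range_filter <= d < radius]
    return sorted(hits, key=lambda h: h[1])[:limit]


class RangeSearchIterator:
    """Client-side iterator built on range search (illustrative only)."""

    def __init__(self, distances: List[float], batch_size: int,
                 initial_radius: float, max_radius: float) -> None:
        if batch_size < 1 or initial_radius <= 0:
            raise ValueError("batch_size and initial_radius must be positive")
        self.distances = distances
        self.batch_size = batch_size
        self.radius = initial_radius      # exclusive upper bound, widened on demand
        self.max_radius = max_radius
        self.range_filter = 0.0           # inclusive lower bound, advanced per batch
        self.boundary_pks: Set[int] = set()  # already returned at distance == range_filter

    def next(self) -> List[Hit]:
        out: List[Hit] = []
        while len(out) < self.batch_size:
            need = self.batch_size - len(out)
            # Over-fetch so already-returned boundary hits cannot crowd out new ones.
            hits = range_search(self.distances, self.radius, self.range_filter,
                                need + len(self.boundary_pks))
            fresh = [h for h in hits if h[0] not in self.boundary_pks][:need]
            if fresh:
                last = fresh[-1][1]
                if last != self.range_filter:
                    self.boundary_pks = set()
                    self.range_filter = last
                self.boundary_pks.update(pk for pk, d in fresh if d == last)
                out.extend(fresh)
            elif self.radius >= self.max_radius:
                break  # window exhausted and cannot be widened further
            else:
                self.radius = min(self.radius * 2, self.max_radius)
        return out
```

Even in this toy form the client has to track the lower bound, widen the upper bound, and deduplicate ties at the boundary, which is the complexity the proposal wants to move out of the SDKs.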

Why is this needed?

The end-to-end search iterator is becoming fundamental to workflows such as client-side post-filtering, among others.

PwzXxm commented 2 weeks ago

Stage 1 (planning to ship before 2.5.1)

  1. Replace the underlying range search calls with Knowhere iterator Next() calls. For each SDK Next() call, construct Knowhere iterators at the segment level and call Next() on them. Do NOT hold iterators at this stage.
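
A rough sketch of what one Stage 1 call could look like, assuming each segment exposes an iterator that yields hits nearest first. `segment_iterator`, `next_batch`, and the `(distance, pk)` cursor used to resume between calls are hypothetical; the issue does not say how a call picks up where the previous one stopped.

```python
import heapq
from itertools import islice
from typing import Dict, Iterator, List, Optional, Tuple

Hit = Tuple[float, int]  # (distance, primary key); tuples compare nearest first


def segment_iterator(segment: Dict[int, float]) -> Iterator[Hit]:
    """Hypothetical stand-in for a per-segment Knowhere iterator.

    A real iterator would produce hits incrementally; this toy version
    simply sorts the segment and yields hits in ascending distance order.
    """
    yield from sorted((d, pk) for pk, d in segment.items())


def next_batch(segments: List[Dict[int, float]], batch_size: int,
               cursor: Optional[Hit] = None) -> List[Hit]:
    """Serve one SDK Next() call without keeping any iterator alive."""
    # Build fresh iterators on every call; nothing is held between calls.
    iterators = [segment_iterator(seg) for seg in segments]
    # k-way merge of the already-sorted per-segment streams.
    merged: Iterator[Hit] = heapq.merge(*iterators)
    if cursor is not None:
        # Assumption: resume by skipping everything up to the last returned hit.
        merged = (hit for hit in merged if hit > cursor)
    return list(islice(merged, batch_size))
```

The caller would pass the last hit of the previous batch as `cursor`. Rebuilding the iterators on each call is the cost that Stage 2 is meant to remove.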

Stage 2

  1. Introduce SearchIteratorManager to hold iterators on each QN.
  2. Cap the number of iterators it can hold to bound overall memory usage, using a modified LRU eviction policy.
  3. Introduce iterator timeout settings and functionality.
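
A minimal sketch of the Stage 2 idea, assuming plain LRU eviction plus an idle timeout. The "modified" LRU policy is not specified in this issue, and the method names, the `ttl_seconds` parameter, and the injectable clock are hypothetical; only the `SearchIteratorManager` name comes from the plan above.

```python
import time
from collections import OrderedDict
from typing import Callable, Iterator, Optional, Tuple


class SearchIteratorManager:
    """Holds live iterators with a capacity cap and an idle timeout (sketch)."""

    def __init__(self, capacity: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._ttl = ttl_seconds
        self._clock = clock
        # iterator_id -> (iterator, last_used); least recently used first.
        self._entries: "OrderedDict[str, Tuple[Iterator, float]]" = OrderedDict()

    def _evict_expired(self) -> None:
        now = self._clock()
        # Entries are ordered by last use, so expired ones sit at the front.
        while self._entries:
            key, (_, last_used) = next(iter(self._entries.items()))
            if now - last_used < self._ttl:
                break
            del self._entries[key]

    def put(self, iterator_id: str, iterator: Iterator) -> None:
        self._evict_expired()
        self._entries.pop(iterator_id, None)
        # Evict least recently used entries until there is room.
        while len(self._entries) >= self._capacity:
            self._entries.popitem(last=False)
        self._entries[iterator_id] = (iterator, self._clock())

    def get(self, iterator_id: str) -> Optional[Iterator]:
        self._evict_expired()
        entry = self._entries.pop(iterator_id, None)
        if entry is None:
            return None  # never stored, evicted, or timed out
        # Re-insert at the back to mark the entry as most recently used.
        self._entries[iterator_id] = (entry[0], self._clock())
        return entry[0]

    def __len__(self) -> int:
        return len(self._entries)
```

When `get` returns `None`, a caller would need to fall back to rebuilding the iterator, as in Stage 1.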

Stage 3

  1. Allow manually closing a search iterator via a new gRPC call.
  2. Make calls sticky to delegators in multi-shard setups.

PwzXxm commented 2 weeks ago

/assign