milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Enhancement]: Optimize the retrieval operations for dynamic fields. #35514

Open czs007 opened 3 weeks ago

czs007 commented 3 weeks ago

Is there an existing issue for this?

What would you like to be added?

For collections with the dynamic field feature enabled, when a retrieval request lists dynamic field keys (keys that are not part of the static schema) in its output fields, the kernel returns the entire dynamic field column rather than only the requested keys. Under high concurrency with large top-k queries, this puts significant pressure on the proxy's memory and on downstream transmission bandwidth.

The kernel should therefore be modified to return only the portion of the dynamic field named in the requested dynamic field list, rather than the complete dynamic field data.
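
To illustrate the idea, here is a minimal sketch in Python, not the actual kernel code. It assumes each row's dynamic field is stored as one JSON object serialized to bytes; the function name `project_dynamic_rows` and the sample keys are hypothetical and only serve the example. The projection keeps the requested keys and drops everything else before the rows are sent upstream:

```python
import json
from typing import Iterable, List


def project_dynamic_rows(rows: List[bytes], requested: Iterable[str]) -> List[bytes]:
    """Keep only the requested keys in each row's dynamic-field JSON.

    Hypothetical helper: assumes one JSON object per row, serialized as bytes.
    """
    wanted = set(requested)
    projected = []
    for raw in rows:
        doc = json.loads(raw)
        # Drop every key the caller did not ask for; keys missing from a row are skipped.
        kept = {k: v for k, v in doc.items() if k in wanted}
        projected.append(json.dumps(kept, separators=(",", ":")).encode("utf-8"))
    return projected


# Made-up sample rows with large values that the query does not need.
rows = [
    json.dumps({"color": "red", "price": 10, "description": "x" * 1000}).encode("utf-8"),
    json.dumps({"color": "blue", "notes": "y" * 1000}).encode("utf-8"),
]
slim = project_dynamic_rows(rows, ["color", "price"])

full_bytes = sum(len(r) for r in rows)
slim_bytes = sum(len(r) for r in slim)
print(f"{full_bytes} bytes before projection, {slim_bytes} bytes after")
```

With a projection like this applied before results leave the kernel, the payload per row grows with the size of the requested keys rather than with the size of the whole dynamic field, which is where the memory and bandwidth savings on the proxy would come from.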

Why is this needed?

Prevent proxy out-of-memory errors and reduce retrieval latency.

Anything else?

No response

xiaofan-luan commented 3 weeks ago

This is actually a great optimization, especially considering the network bandwidth.