milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0
30.33k stars 2.91k forks

[Enhancement]: Optimize the retrieval operations for dynamic fields. #35514

Open czs007 opened 2 months ago

czs007 commented 2 months ago

Is there an existing issue for this?

What would you like to be added?

For collections with the dynamic field feature enabled, when a retrieval request lists dynamic fields (fields that are not part of the static schema) in its output fields, the kernel returns the entire dynamic field column rather than only the requested keys. Under highly concurrent queries with a large top-k, this puts significant pressure on the proxy's memory and on downstream transmission bandwidth.

To address this, the kernel should be modified to return only the part of the dynamic field data named in the specified dynamic field list, rather than the complete dynamic field column.
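To illustrate the idea, here is a minimal Python sketch of the projection step. It is not Milvus kernel code. It assumes each row's dynamic field is stored as one JSON document in bytes, and the function name `project_dynamic_field` and the sample keys are hypothetical.

```python
import json


def project_dynamic_field(raw_rows, requested_keys):
    """Keep only the requested keys from each row's dynamic-field JSON.

    raw_rows: list of bytes, one JSON object per row (assumed layout).
    requested_keys: dynamic field names listed in the output fields.
    """
    wanted = set(requested_keys)
    projected = []
    for raw in raw_rows:
        doc = json.loads(raw)
        # Drop every key the caller did not ask for; absent keys are skipped.
        subset = {k: v for k, v in doc.items() if k in wanted}
        projected.append(json.dumps(subset, separators=(",", ":")).encode("utf-8"))
    return projected


# Two rows whose dynamic field carries a large value the caller does not need.
rows = [
    json.dumps({"color": "red", "size": 3, "notes": "x" * 1000}).encode("utf-8"),
    json.dumps({"color": "blue", "notes": "y" * 1000}).encode("utf-8"),
]
slim = project_dynamic_field(rows, ["color", "size"])

full_bytes = sum(len(r) for r in rows)
slim_bytes = sum(len(r) for r in slim)
print(f"full column: {full_bytes} bytes, projected: {slim_bytes} bytes")
```

Doing this filtering before the results leave the kernel, rather than in the proxy, is what would cut both the proxy's memory use and the bytes sent over the network.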

Why is this needed?

Prevent proxy out-of-memory errors and reduce retrieval latency.

Anything else?

No response

xiaofan-luan commented 2 months ago

This is actually a great optimization, especially considering the network bandwidth.

stale[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.