Is your feature request related to a problem? Please describe.
Umbrella issue for clustering key optimization in Milvus.
Efficient data storage and retrieval is central to database performance. A clustering key is a core part of database design: it determines how rows are physically arranged in storage according to the distribution of data within a table. In conventional database systems, that distribution is usually described by the minimum and maximum values of scalar fields. In a vector database, however, vectors are the primary entities. Milvus therefore aims to support both scalar clustering keys and vector clustering keys.
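To make the min/max idea concrete, here is a minimal sketch of range-based pruning on a scalar clustering key. It is an illustration only, not Milvus code: `SegmentStats`, `prune_segments`, and the sample ranges are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SegmentStats:
    """Hypothetical per-segment statistics for a scalar clustering key."""
    segment_id: int
    min_key: int
    max_key: int


def prune_segments(segments, lo, hi):
    """Keep only segments whose [min_key, max_key] range overlaps [lo, hi]."""
    return [s for s in segments if s.max_key >= lo and s.min_key <= hi]


# Data that is physically ordered by the clustering key yields
# narrow, mostly non-overlapping ranges per segment.
segments = [
    SegmentStats(1, 0, 99),
    SegmentStats(2, 100, 199),
    SegmentStats(3, 200, 299),
]

hits = prune_segments(segments, 150, 220)
print([s.segment_id for s in hits])  # [2, 3]
```

The tighter the per-segment ranges, the more segments a range filter can skip, which is why the physical arrangement matters.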
Key changes:
1. Support designating a scalar or vector field as the clustering key for a collection.
2. Support bulk insert of data with specific clustering information. Milvus will organize the data based on the provided clustering information.
3. Filter out irrelevant data during searches based on clustering information.
4. Add a compaction feature for collections with a clustering key, which rearranges the stored data.
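For a vector clustering key, one possible way to picture points 2 and 3 is centroid-based grouping: rows are bucketed by their nearest centroid, and a search scans only the buckets closest to the query. The sketch below assumes the centroids are already known and uses L2 distance; the function names and sample data are hypothetical, and this issue does not specify the approach Milvus will actually take.

```python
import math


def nearest_centroids(vector, centroids, k=1):
    """Return the indices of the k centroids closest to `vector` (L2 distance)."""
    order = sorted(range(len(centroids)),
                   key=lambda i: math.dist(vector, centroids[i]))
    return order[:k]


def cluster_rows(rows, centroids):
    """Group row ids into buckets keyed by the index of the nearest centroid."""
    buckets = {i: [] for i in range(len(centroids))}
    for row_id, vec in rows:
        buckets[nearest_centroids(vec, centroids)[0]].append(row_id)
    return buckets


# Assumed inputs: two precomputed centroids and four (row_id, vector) rows.
centroids = [(0.0, 0.0), (10.0, 10.0)]
rows = [(1, (0.5, 0.2)), (2, (9.0, 11.0)), (3, (1.0, 1.5)), (4, (10.5, 9.5))]

buckets = cluster_rows(rows, centroids)

# At search time, scan only the bucket(s) nearest to the query vector.
query = (9.5, 9.5)
candidates = [rid
              for c in nearest_centroids(query, centroids, k=1)
              for rid in buckets[c]]
print(buckets)     # {0: [1, 3], 1: [2, 4]}
print(candidates)  # [2, 4]
```

Raising `k` scans more buckets, trading search cost for recall near bucket boundaries.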
Phase 1: Support bulk insert and query data with clustering info
Tasks:
Phase 2: Clustering based compaction
Dependency:
Tasks:
Describe the solution you'd like.
No response
Describe an alternate solution.
No response
Anything else? (Additional Context)
No response