[Feature]: Clustering optimization

Is there an existing issue for this?

[X] I have searched the existing issues

Is your feature request related to a problem? Please describe.

Umbrella issue for clustering key optimization in Milvus.

A clustering key is a central element of database design: it determines how data is physically arranged in storage, based on how a table's data is distributed over that key. Conventional database systems usually describe this distribution by the minimum and maximum values of scalar fields. In a vector database, however, vectors are the primary entities. Milvus therefore aims to support both scalar clustering keys and vector clustering keys.
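As an illustration of the scalar case, here is a minimal sketch of pruning by per-segment min/max values. `SegmentStats` and `prune_by_range` are hypothetical names invented for this example; they are not Milvus types or APIs.

```python
from dataclasses import dataclass


@dataclass
class SegmentStats:
    """Hypothetical per-segment metadata for a scalar clustering key."""
    segment_id: int
    min_value: int
    max_value: int


def prune_by_range(segments, lo, hi):
    """Keep only segments whose [min_value, max_value] overlaps [lo, hi]."""
    return [s for s in segments if s.max_value >= lo and s.min_value <= hi]


segments = [
    SegmentStats(1, 0, 99),
    SegmentStats(2, 100, 199),
    SegmentStats(3, 200, 299),
]

# A filter such as 150 <= key <= 220 only needs to scan overlapping segments.
kept = prune_by_range(segments, 150, 220)
kept_ids = [s.segment_id for s in kept]
print(kept_ids)
```

The tighter the data is clustered on the key, the less the segment ranges overlap and the more segments a range filter can skip.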

Key changes:

1. Support designating a scalar or vector field as the clustering key of a collection.
2. Support bulk insert of data with specific clustering information. Milvus organizes the data according to the provided clustering information.
3. Filter out irrelevant data during searches based on clustering information.
4. Support compacting collections that have a clustering key, which rearranges their storage.
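For a vector clustering key, item 3 (filtering out irrelevant data during searches) might work roughly as sketched below. The sketch assumes the clustering information holds one centroid per segment and that the search only visits the segments whose centroids are closest to the query. The centroid layout, `nearest_segments`, and `top_n` are assumptions made for illustration, not the design from the linked PRs.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_segments(centroids, query, top_n):
    """Return the ids of the top_n segments whose centroid is closest to query.

    centroids: dict mapping a (hypothetical) segment id to its centroid vector.
    """
    ranked = sorted(centroids, key=lambda sid: l2_distance(centroids[sid], query))
    return ranked[:top_n]


centroids = {
    101: [0.0, 0.0],
    102: [10.0, 10.0],
    103: [0.0, 10.0],
}

# Only the two closest segments are searched; segment 102 is skipped.
candidates = nearest_segments(centroids, query=[1.0, 1.0], top_n=2)
print(candidates)
```

Pruning by centroid distance is approximate: a true nearest neighbor can sit in a skipped segment, so `top_n` trades recall against the amount of data scanned.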

Phase 1: Support bulk insert and query data with clustering info

Tasks:

[x] Define clustering proto https://github.com/milvus-io/milvus-proto/pull/227
[ ] Support bulk insert with clustering meta #29444
[ ] Support search optimization based on clustering meta #29444
[x] Add distance calculation #28656
[ ] Support L2 segment #29595
[ ] Forbid compaction of clustered data
[ ] SDK support
[ ] E2E Performance report
[x] Define clustering key proto https://github.com/milvus-io/milvus-proto/pull/235
[ ] Support create clustering key collection #29506
[ ] Compatibility with multi-vector

Phase 2: Clustering-based compaction

Dependency:

[ ] L0 delete/compaction
[ ] Milvus-storage V2 integration
[ ] Compaction V2 refactoring (weak dependency)

Tasks:

[ ] Clustering compaction strategy
[ ] Clustering compaction schedule
[ ] Clustering compaction execution
[ ] E2E test
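To make the execution step concrete, here is a minimal sketch of what rearranging rows during a clustering compaction could look like. It assumes the centroids are already known and that each row moves to the group of its nearest centroid. How centroids are chosen, and how groups map to segments, is not specified in this issue. `regroup_by_centroid` is a hypothetical helper.

```python
from collections import defaultdict


def squared_l2(a, b):
    """Squared Euclidean distance; enough for nearest-centroid comparison."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def regroup_by_centroid(rows, centroids):
    """Assign each (row_id, vector) to the index of its nearest centroid.

    Returns a dict mapping centroid index -> list of row ids, which stands in
    for the new, clustered storage layout (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for row_id, vector in rows:
        nearest = min(
            range(len(centroids)),
            key=lambda i: squared_l2(vector, centroids[i]),
        )
        groups[nearest].append(row_id)
    return dict(groups)


rows = [(1, [0.1, 0.2]), (2, [9.8, 9.9]), (3, [0.3, 0.1]), (4, [10.2, 9.7])]
centroids = [[0.0, 0.0], [10.0, 10.0]]

new_segments = regroup_by_centroid(rows, centroids)
print(new_segments)
```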

Describe the solution you'd like.

No response

Describe an alternate solution.

No response

Anything else? (Additional Context)

No response
