milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Feature]: Implement a preprocessing and post processing module on proxy #27469

Open xiaofan-luan opened 1 year ago

xiaofan-luan commented 1 year ago

Is there an existing issue for this?

Is your feature request related to a problem? Please describe.

Description: The proxy component in Milvus plays a crucial role in handling client requests and managing data flow. Enhancing this component with preprocessing and postprocessing modules can significantly improve the convenience of data/result processing. Here's a detailed breakdown of the proposed implementation:

Preprocessing Module typical functions:

Normalization: Automatically normalize vector data to a standard form during insertion, ensuring consistent data format which is crucial for accurate similarity searches.

Dimension Reduction: Apply dimensionality reduction techniques to transform high-dimensional data into a lower-dimensional space, reducing storage requirements and potentially improving the efficiency of subsequent operations.

Scalar Transformation: Perform scalar transformations on vector data to conform to specific scales or ranges, aiding in more meaningful analysis and comparisons.
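A minimal sketch of two of the preprocessing functions listed above, written in plain Python for illustration only. The function names `l2_normalize` and `min_max_scale` are hypothetical and not part of the Milvus proxy or any Milvus SDK. The sketch also assumes that "normalization" means scaling to unit L2 norm and that "scalar transformation" means per-vector min-max scaling; the proposal itself does not pin down either choice.

```python
import math
from typing import List

Vector = List[float]


def l2_normalize(vec: Vector) -> Vector:
    """Scale a vector to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def min_max_scale(vec: Vector, low: float = 0.0, high: float = 1.0) -> Vector:
    """Linearly map the components of a vector into the range [low, high]."""
    if not vec:
        return []
    lo, hi = min(vec), max(vec)
    if hi == lo:
        # A constant vector has no spread; map every component to `low`.
        return [low for _ in vec]
    span = hi - lo
    return [low + (x - lo) * (high - low) / span for x in vec]


print(l2_normalize([3.0, 4.0]))        # [0.6, 0.8]
print(min_max_scale([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```

Unit-norm vectors make inner product and cosine similarity rank results identically, which is one reason normalizing at insertion time could be useful.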

Postprocessing Module typical functions:

Search Result Handling: Develop functionalities to handle and manipulate search query results, such as sorting, filtering, or clustering, to provide more insightful and organized output to the users.

Ranking and Re-ranking: Implement ranking and re-ranking mechanisms to order the search results based on certain criteria, ensuring the most relevant results are presented at the top.
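The postprocessing steps above (filtering, then re-ranking) could look roughly like the following sketch. The `Hit` type, the `postprocess` function, and the `freshness` field are all hypothetical names invented for this example. It also assumes that a higher score means a better match, which is not true of every metric type.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Hit:
    """One search result (hypothetical shape, for illustration only)."""
    id: int
    score: float  # assumption: higher means more similar
    fields: Dict[str, float] = field(default_factory=dict)


def postprocess(
    hits: List[Hit],
    keep: Callable[[Hit], bool],
    rerank_score: Callable[[Hit], float],
    limit: int,
) -> List[Hit]:
    """Drop hits rejected by `keep`, reorder by `rerank_score`, return the top `limit`."""
    kept = [h for h in hits if keep(h)]
    kept.sort(key=rerank_score, reverse=True)
    return kept[:limit]


hits = [
    Hit(1, 0.90, {"freshness": 0.1}),
    Hit(2, 0.80, {"freshness": 0.9}),
    Hit(3, 0.20, {"freshness": 1.0}),
]

top = postprocess(
    hits,
    keep=lambda h: h.score >= 0.5,  # filter out weak matches
    rerank_score=lambda h: 0.7 * h.score + 0.3 * h.fields["freshness"],
    limit=2,
)
print([h.id for h in top])  # [2, 1]
```

Here hit 2 overtakes hit 1 because the re-ranking criterion blends vector similarity with a scalar field, which is the kind of reordering the proposal describes.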

Modular and Extensible Architecture:

Design the modules in a modular and extensible manner, allowing for easy addition of new preprocessing and postprocessing operations in the future as per the community's needs.
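One possible shape for such an extensible design is a registry of named operations that a pipeline is assembled from. The sketch below is an assumption about how this might be structured, not the actual proxy design; `register`, `build_pipeline`, and the operation names are hypothetical.

```python
import math
from typing import Callable, Dict, List

Vector = List[float]
Operation = Callable[[Vector], Vector]

# Maps an operation name to its implementation.
_REGISTRY: Dict[str, Operation] = {}


def register(name: str) -> Callable[[Operation], Operation]:
    """Decorator that adds an operation to the registry under `name`."""
    def decorator(fn: Operation) -> Operation:
        if name in _REGISTRY:
            raise ValueError(f"operation {name!r} is already registered")
        _REGISTRY[name] = fn
        return fn
    return decorator


@register("clip")
def clip(vec: Vector) -> Vector:
    """Clamp every component into [-1, 1]."""
    return [max(-1.0, min(1.0, x)) for x in vec]


@register("l2_normalize")
def l2_normalize(vec: Vector) -> Vector:
    """Scale to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]


def build_pipeline(names: List[str]) -> Operation:
    """Chain registered operations in the order given, e.g. from a config list."""
    unknown = [n for n in names if n not in _REGISTRY]
    if unknown:
        raise ValueError(f"unknown operations: {unknown}")
    ops = [_REGISTRY[n] for n in names]

    def run(vec: Vector) -> Vector:
        for op in ops:
            vec = op(vec)
        return vec

    return run


pipeline = build_pipeline(["clip", "l2_normalize"])
print(pipeline([3.0, -4.0]))
```

With this layout, adding a new operation only requires a new decorated function, and the list of names passed to `build_pipeline` could come from user configuration.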

User-Friendly Configuration:

Allow users to easily configure the preprocessing and postprocessing operations through a user-friendly interface, possibly integrating with the existing Milvus CLI or developing a dedicated interface.

Comprehensive Documentation and Tutorials:

Create detailed documentation and tutorials demonstrating the usage and benefits of these modules, ensuring users can effectively leverage these new capabilities.

Describe the solution you'd like.

No response

Describe an alternate solution.

No response

Anything else? (Additional Context)

No response

ArenaSu commented 7 months ago

@xiaofan-luan Hello, has this feature been implemented?

xiaofan-luan commented 7 months ago

This is under discussion but has not started yet.

@ArenaSu if you can give us more feedback about your use case, that would be very helpful.

ArenaSu commented 7 months ago

I just want to contribute to the community, but few issues carry the good-first-issue tag, so I would like to ask whether this feature is still needed.

xiaofan-luan commented 7 months ago

@congqixia could you help @ArenaSu find something to start with?

ArenaSu commented 7 months ago

Also, I prefer to develop in Go.

congqixia commented 7 months ago

Sure.

@ArenaSu which parts are you interested in here?

ArenaSu commented 7 months ago

@congqixia My team may contribute to most parts of Milvus. To begin with, I want to contribute to features related to the main business process.

ArenaSu commented 7 months ago

@congqixia Any part is OK for me, such as proxy, root coord, or data coord.

congqixia commented 7 months ago

@ArenaSu you can find me on Discord and we can discuss this in detail.

ArenaSu commented 7 months ago

@congqixia My Discord account is arenasu.

CaoHaiNam commented 1 week ago

@xiaofan-luan Hi, is this issue still needed? I prefer to develop in Python 3 and have already completed the Normalization feature.