milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0
30.9k stars 2.95k forks

[Feature]: hybrid search support choice for normalization #37477

Open yiwen92 opened 2 weeks ago

yiwen92 commented 2 weeks ago

Is there an existing issue for this?

  • [x] I have searched the existing issues

Is your feature request related to a problem? Please describe.

Currently in Milvus, the weighted ranker has to normalize scores first in order to combine different metric types from multiple recall routes. However, some embedding models, such as bge-m3 and mgte, do not need this normalization: their distances can be added directly.

Describe the solution you'd like.

Add a parameter to the reranker that controls whether normalization is applied or not.
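
To make the request concrete, here is a rough Python sketch of what such a switch might look like. This is not Milvus code: `weighted_fuse`, the `normalize` flag, and the arctan-based `atan_normalize` helper are hypothetical names and choices made up for this example, and the scores are invented. It assumes every route reports "higher is better" scores.

```python
import math
from collections import defaultdict


def atan_normalize(score: float) -> float:
    """Squash an unbounded score into (0, 1). Illustrative choice only."""
    return 0.5 + math.atan(score) / math.pi


def weighted_fuse(routes, weights, normalize=True):
    """Combine per-route {doc_id: score} maps into one ranked list.

    routes:    one dict per recall route, higher score = better
    weights:   one weight per route
    normalize: hypothetical flag; when False, raw scores are summed directly
    """
    if len(routes) != len(weights):
        raise ValueError("need exactly one weight per route")
    fused = defaultdict(float)
    for hits, weight in zip(routes, weights):
        for doc_id, score in hits.items():
            value = atan_normalize(score) if normalize else score
            fused[doc_id] += weight * value
    # Best fused score first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Invented scores for a dense route and a sparse route.
dense = {"a": 0.82, "b": 0.75, "c": 0.40}
sparse = {"a": 0.10, "b": 0.30, "d": 0.25}

raw = weighted_fuse([dense, sparse], [0.7, 0.3], normalize=False)
norm = weighted_fuse([dense, sparse], [0.7, 0.3], normalize=True)
```

With `normalize=False` the weighted sum is taken over the raw distances, which is the behavior asked for here; with `normalize=True` each score is squashed before weighting.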

Describe an alternate solution.

No response

Anything else? (Additional Context)

No response

xiaofan-luan commented 2 weeks ago

This means their vector distances are already normalized.

No matter what kind of normalization you apply, the result won't change.

There is no need to add extra complexity to the API.