milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0
30.62k stars 2.92k forks

Should we just convert FP32 to FP16 automatically? #37448

Open liliu-z opened 1 week ago

liliu-z commented 1 week ago

Should we just convert FP32 to FP16 automatically? @smellthemoon @liliu-z @tedxu, any thoughts on this?

This is worth debating. If yes, we modify the data users send us and store the modified version, so the meaning of operations like GetVector changes (they would return the converted values rather than the original ones). If no, the SDK user experience suffers, especially with the RESTful API.

They have already defined their vector field as BF16/FP16, and there is no easy way to represent BF16/FP16 in most languages. If the data loses accuracy, that is the user's choice.
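To make the accuracy point concrete, here is a minimal sketch of what an automatic FP32 to FP16/BF16 conversion does to a value. It is illustration only, not Milvus code: the helper names `to_fp32`, `to_fp16`, and `to_bf16` are hypothetical, and it uses only Python's standard `struct` module, which supports IEEE 754 half precision through the `e` format character. BF16 has no `struct` format, so the sketch assumes the common approach of rounding the FP32 bit pattern to its upper 16 bits; NaN handling is left out.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (FP64) to FP32, as an SDK would before sending it."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_fp16(x: float) -> float:
    """Round an FP32 value to IEEE 754 binary16 and return it as a Python float.

    struct raises OverflowError if the value is too large for FP16.
    """
    return struct.unpack("<e", struct.pack("<e", to_fp32(x)))[0]


def to_bf16(x: float) -> float:
    """Round an FP32 value to bfloat16 (hypothetical helper, NaN not handled).

    bfloat16 keeps the upper 16 bits of the FP32 bit pattern; the lower
    16 bits are rounded away using round-to-nearest-even.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round to nearest, ties to even
    bits &= 0xFFFF0000                   # drop the low 16 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    original = [0.1, 0.25, 1.0, 3.14159]
    for v in original:
        print(v, to_fp32(v), to_fp16(v), to_bf16(v))
```

Values that are exactly representable (such as 0.25 or 1.0) survive the round trip, while 0.1 comes back as 0.0999755859375 in FP16 and 0.10009765625 in BF16. That difference is what a GetVector call would expose if the conversion happened automatically.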

If they want to keep the original data, they should use an FP32 float vector plus FP16/BF16 quantization, or int8 quantization in the future.

Originally posted by @xiaofan-luan in https://github.com/milvus-io/milvus/discussions/37123#discussioncomment-11146861

liliu-z commented 1 week ago

/assign @cqy123456

xiaofan-luan commented 1 week ago

I think this definitely makes it easier for users to understand.