Open xiaofan-luan opened 1 year ago
/assign
@xiaofan-luan could I ask whether it would be wrong to convert the embedding to float32, which I think has better numerical performance on most CPUs unless hardware support for the narrower type exists?
Or is the purpose of this issue to support storage of such formats, assuming that the compute nodes have the right hardware (e.g. a GPU or the right Xeon chipset) to handle operations in those datatypes?
If so, do we need to implement a fallback, e.g. by emulation or casting, when the appropriate compute support is missing? PyTorch handles this by autocasting.
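To make the "fallback by casting" idea concrete, here is a minimal sketch that stores vectors as float16 bytes and upcasts them before computing. It is not Milvus or Knowhere code: the helper names `encode_f16`, `decode_f16`, and `inner_product` are hypothetical, and it uses only Python's standard `struct` module (format character `e` is IEEE 754 half precision). Python's `float` is float64, so the upcast here is wider than the float32 a real engine would likely use.

```python
import struct


def encode_f16(vec):
    """Pack a list of floats into little-endian float16 bytes (2 bytes per element)."""
    return struct.pack(f"<{len(vec)}e", *vec)


def decode_f16(buf):
    """Upcast stored float16 bytes to Python floats (the fallback compute type)."""
    count = len(buf) // 2
    return list(struct.unpack(f"<{count}e", buf))


def inner_product(a_buf, b_buf):
    """Compute the inner product in the wider type after upcasting both operands."""
    a = decode_f16(a_buf)
    b = decode_f16(b_buf)
    return sum(x * y for x, y in zip(a, b))


stored_a = encode_f16([1.0, 0.5, -2.0])
stored_b = encode_f16([2.0, 4.0, 0.25])
score = inner_product(stored_a, stored_b)
```

Storage stays at 2 bytes per element, while arithmetic happens in a type that every CPU handles natively; the cost is the conversion on each read.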
By the way, Faiss does not support bfloat16: https://github.com/facebookresearch/faiss/wiki/How-to-make-Faiss-run-faster, and I believe neither Annoy nor HNSWLib does either.
Faiss does support float16, though, and we can compile that support back in: https://github.com/milvus-io/milvus/issues/2828
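For reference, emulating bfloat16 by casting is cheap because a bfloat16 value is just the upper 16 bits of the corresponding IEEE 754 float32. The sketch below is illustrative only, with hypothetical helper names; it truncates the low bits (rounding toward zero), whereas production converters usually round to nearest even.

```python
import struct


def f32_to_bf16_bits(x):
    """Return the 16-bit bfloat16 pattern for x by truncating its float32 encoding."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16  # keep sign, 8 exponent bits, top 7 mantissa bits


def bf16_bits_to_f32(b):
    """Expand a bfloat16 bit pattern back to a float by zero-filling the low 16 bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", b << 16))
    return x


bits = f32_to_bf16_bits(3.14159)
approx = bf16_bits_to_f32(bits)  # only about 2 to 3 significant decimal digits survive
```

A library without native bfloat16 support could store the 16-bit patterns and expand them to float32 before distance computation, at the cost of an extra conversion pass.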
Welcome @jon-chuang! You can implement float16 first; we can discuss bf16 later. 😄
/unassign @jon-chuang
We can break down the steps into the following:
Hi, I have a question: does this issue (supporting float16 in Milvus) mean that all vector indexes in Milvus will support the float16 datatype, and that using float16 can significantly reduce memory cost? We are using DiskANN now, and we hope to use the float16 type with DiskANN.
DiskANN already applies heavy quantization, so using float16 won't help reduce your memory cost.
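As a rough back-of-the-envelope for where float16 does save memory, the sketch below computes raw (unquantized) vector storage only. The collection size and dimension are made-up example numbers, and `raw_vector_bytes` is a hypothetical helper; it says nothing about index structures such as DiskANN's quantized in-memory data, which is why the saving does not carry over there.

```python
# Bytes per element for each vector datatype discussed in this issue.
BYTES_PER_ELEMENT = {"float64": 8, "float32": 4, "float16": 2, "bfloat16": 2}


def raw_vector_bytes(num_vectors, dim, dtype):
    """Size of the raw vector data alone, ignoring index and metadata overhead."""
    return num_vectors * dim * BYTES_PER_ELEMENT[dtype]


# Assumed example: 1 million vectors of dimension 768.
f32_size = raw_vector_bytes(1_000_000, 768, "float32")
f16_size = raw_vector_bytes(1_000_000, 768, "float16")
print(f"float32: {f32_size / 1e9:.3f} GB, float16: {f16_size / 1e9:.3f} GB")
```

The halving applies to indexes that keep full-precision vectors in memory; for an index that already stores compressed codes, the raw element type has little effect on the resident footprint.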
Knowhere related issue: https://github.com/zilliztech/knowhere/issues/287
Faiss support for BF16 is getting closer:
Also, support binary vectors
Is there an existing issue for this?
Is your feature request related to a problem? Please describe.
There are many different vector types, depending on the model. So far, the types we have been asked for most are double, float16, and BF16; double and BF16 are the top priority. If anyone is interested in working on this, please help.
Describe the solution you'd like.
No response
Describe an alternate solution.
No response
Anything else? (Additional Context)
No response