[FEATURE] Lucene Inbuilt Scalar Quantizer to convert float 32 bits to 4 bits

opensearch-project / k-NN

🆕 Find the k-nearest neighbors (k-NN) for your vector data

https://opensearch.org/docs/latest/search-plugins/knn/index/

Apache License 2.0

[FEATURE] Lucene Inbuilt Scalar Quantizer to convert float 32 bits to 4 bits #2252

Open naveentatikonda opened 2 weeks ago

naveentatikonda commented 2 weeks ago

Description

Since OpenSearch 2.17, k-NN has supported the Lucene Inbuilt Scalar Quantizer (SQ), which accepts fp32 vectors as input and dynamically quantizes them into int7 values in the range [0, 127], providing 4x compression. Adding 4-bit support to the Lucene SQ would quantize fp32 vectors into int4 values in the range [0, 15], providing 8x compression. This further reduces memory requirements at the cost of some recall.
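
To make the 4x and 8x figures concrete, here is a minimal Python sketch of linear scalar quantization. It is not Lucene's implementation: the function names (`quantize`, `pack_int4`, `bytes_per_vector`) are hypothetical, the fixed `lo`/`hi` bounds are an assumption made for illustration (the real quantizer chooses its bounds from the data), and the 8x figure assumes two int4 values are packed into each byte while each int7 value occupies a full byte.

```python
from typing import List


def quantize(vector: List[float], bits: int, lo: float, hi: float) -> List[int]:
    """Linearly map floats in [lo, hi] to integers in [0, 2**bits - 1].

    Values outside [lo, hi] are clipped to the nearest bound first.
    """
    levels = (1 << bits) - 1          # 127 for 7 bits, 15 for 4 bits
    scale = levels / (hi - lo)
    quantized = []
    for x in vector:
        clipped = min(max(x, lo), hi)
        quantized.append(round((clipped - lo) * scale))
    return quantized


def pack_int4(values: List[int]) -> bytes:
    """Pack two 4-bit values into each byte (assumed layout, high nibble first)."""
    padded = values + [0] if len(values) % 2 else values
    return bytes((padded[i] << 4) | padded[i + 1] for i in range(0, len(padded), 2))


def bytes_per_vector(dims: int, stored_bits_per_dim: int) -> int:
    """Storage for one vector, given how many bits each dimension occupies on disk."""
    return (dims * stored_bits_per_dim + 7) // 8


vec = [-1.0, -0.2, 0.3, 1.0]
int7 = quantize(vec, bits=7, lo=-1.0, hi=1.0)   # values in [0, 127]
int4 = quantize(vec, bits=4, lo=-1.0, hi=1.0)   # values in [0, 15]
packed = pack_int4(int4)                        # 4 dimensions -> 2 bytes

dims = 768
fp32_bytes = bytes_per_vector(dims, 32)         # 4 bytes per dimension
int7_bytes = bytes_per_vector(dims, 8)          # one byte per int7 value
int4_bytes = bytes_per_vector(dims, 4)          # two int4 values per byte
```

With 768 dimensions, this works out to 3072 bytes for fp32, 768 bytes for int7 (4x smaller), and 384 bytes for packed int4 (8x smaller). The coarser 16-level grid is also where the recall loss comes from: many distinct fp32 values collapse onto the same int4 value.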