This pull request introduces a new quantization framework, including the MultiBitScalarQuantizer and OneBitScalarQuantizer, to provide efficient quantization methods for vectors in OpenSearch.
Key features:
Memory Efficiency: Designed the train method in MultiBitScalarQuantizer to compute mean and standard deviation in a single pass, reducing redundant memory allocations and improving efficiency.
Code Structure: Improved code readability and maintainability by reusing arrays for multiple calculations and employing a clear, modular approach.
OneBitScalarQuantizer
Training: Computes the mean of each dimension from sampled vectors, using these means as thresholds for quantization.
Quantization: Compares each dimension of the vector against the corresponding mean (threshold) to determine the quantized value, resulting in a binary representation (1 bit per dimension).
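The one-bit training and quantization steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this pull request: the class name OneBitSketch, the method names, the strict greater-than comparison, and the MSB-first bit packing are all assumptions made for the example.

```java
final class OneBitSketch {
    // Training (assumed shape): the per-dimension mean of the sampled vectors
    // is used directly as the threshold for that dimension.
    static float[] trainThresholds(float[][] samples) {
        int dims = samples[0].length;
        float[] mean = new float[dims];
        for (float[] v : samples) {
            for (int d = 0; d < dims; d++) {
                mean[d] += v[d];
            }
        }
        for (int d = 0; d < dims; d++) {
            mean[d] /= samples.length;
        }
        return mean;
    }

    // Quantization (assumed shape): bit d is 1 when vector[d] is strictly above
    // its threshold. Bits are packed MSB-first, 8 dimensions per byte.
    static byte[] quantize(float[] vector, float[] thresholds) {
        byte[] packed = new byte[(vector.length + 7) / 8];
        for (int d = 0; d < vector.length; d++) {
            if (vector[d] > thresholds[d]) {
                packed[d >> 3] |= (byte) (1 << (7 - (d & 7)));
            }
        }
        return packed;
    }

    public static void main(String[] args) {
        float[][] samples = { {1f, 10f, -2f}, {3f, 20f, -4f} };
        float[] thresholds = trainThresholds(samples); // means: 2.0, 15.0, -3.0
        byte[] code = quantize(new float[] {2.5f, 12f, -1f}, thresholds);
        System.out.println(Integer.toBinaryString(code[0] & 0xFF)); // prints 10100000
    }
}
```

In this sketch a 3-dimensional float vector (12 bytes) is reduced to a single byte, which is where the memory saving of the one-bit approach comes from.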
MultiBitScalarQuantizer
Training: Calculates the mean and standard deviation of each dimension from sampled vectors in a single pass, then derives multiple thresholds per dimension from these statistics, according to the number of bits used to encode each dimension.
Quantization: Uses the calculated thresholds to quantize the vector into multi-bit representations per dimension, allowing for more precise quantization compared to the one-bit approach.
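As a rough illustration of the multi-bit flow, the sketch below accumulates a sum and a sum of squares per dimension in one pass, derives the mean and standard deviation from them, and builds one threshold per bit. The threshold placement (evenly spaced between mean - std and mean + std), the use of the population standard deviation, the one-bit-per-threshold encoding, and all names (MultiBitSketch, trainThresholds, quantize) are assumptions for the example and may differ from what this pull request implements.

```java
import java.util.Arrays;

final class MultiBitSketch {
    // Training (assumed shape): a single pass accumulates sum and sum of squares,
    // from which mean and population standard deviation are derived.
    // thresholds[b][d] = mean[d] + coef_b * std[d], coef_b evenly spaced in (-1, 1).
    static float[][] trainThresholds(float[][] samples, int bits) {
        int dims = samples[0].length;
        int n = samples.length;
        double[] sum = new double[dims];
        double[] sumSq = new double[dims];
        for (float[] v : samples) {
            for (int d = 0; d < dims; d++) {
                sum[d] += v[d];
                sumSq[d] += (double) v[d] * v[d];
            }
        }
        float[][] thresholds = new float[bits][dims];
        for (int d = 0; d < dims; d++) {
            double mean = sum[d] / n;
            // Clamp at zero to guard against tiny negative values from rounding.
            double std = Math.sqrt(Math.max(0.0, sumSq[d] / n - mean * mean));
            for (int b = 0; b < bits; b++) {
                double coef = -1.0 + 2.0 * (b + 1) / (bits + 1);
                thresholds[b][d] = (float) (mean + coef * std);
            }
        }
        return thresholds;
    }

    // Quantization (assumed shape): bit b of codes[d] is set when vector[d]
    // is strictly above thresholds[b][d].
    static int[] quantize(float[] vector, float[][] thresholds) {
        int[] codes = new int[vector.length];
        for (int d = 0; d < vector.length; d++) {
            for (int b = 0; b < thresholds.length; b++) {
                if (vector[d] > thresholds[b][d]) {
                    codes[d] |= 1 << b;
                }
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        float[][] samples = { {0f, 10f}, {4f, 10f} };
        float[][] thresholds = trainThresholds(samples, 2);
        int[] codes = quantize(new float[] {2f, 11f}, thresholds);
        System.out.println(Arrays.toString(codes)); // prints [1, 3]
    }
}
```

With two bits per dimension, the first dimension (mean 2, standard deviation 2) gets thresholds near 1.33 and 2.67, so the value 2 clears only the lower one. The second dimension has zero variance, so both thresholds collapse to 10 and the value 11 clears both.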
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Related Issues
Resolves #1889
Check List
--signoff